posted by

**atreic**at 12:01pm on 19/11/2013A-level probability question, to bring joy to children at Christmas:

Tesco have an offer on Octonauts figures at the moment (that's true, so if you were hoping to buy some, now you know). The parents of an adorable 2 year old want 5 out of the 8 characters (he owns 3 already). However, the figures are not listed separately, just as 'one supplied.'

How many should they order into store to have a good chance (say >90%) of getting the 5 they want out of the random selection that Tescos send? There's no issue with taking any surplus back, so they could order vast quantities... but that would seem a little absurd.

[I can see 3 similar problems. (a) the simple problem, where 'Tescos' is a box containing all 8 characters, once and once only (b) a finite problem, where Tescos is a finite but large box (say containing 64 octonauts, 8 of each) and (c) an infinite problem, where Tescos is an Octonaut producing machine that can produce as many Octonauts as are ordered and dispenses them at random. I think the actual problem is (c), but I'm quite interested in the solutions to (a) and (b) as well and how it makes the maths different...]

My vague thoughts...

The simple problem would be easy. If you only bought 7 octonauts, there'd be a 5/8 chance that the last octonaut was in the set 'octonauts I wanted' and a 3/8 chance it wasn't. So that's only a 37% chance of a happy toddler... so you buy all 8 and have 100% chance. (is that right? Have I done something wrong?)

I wondered if it's best tackled as 1 - the probability you don't get all the figures you want. I thought briefly this was 1 - (3/8)^n, where n is the number of figures you buy, but actually this is Wrong, because that is the probability you get at least one figure you want, not that you get all five. And you must have to do something a bit like permutations and combinations, because four Pesos is different to a Peso, a Kwazii, a Barnacles, and a Tweak, even if in both situations you have 'four things you wanted'

So the first octonaut comes out of the machine. You have a 5/8 chance it's one you want, and a 3/8 chance it isn't. Gah, I'm going to draw a huge probability tree, which is hard in text. If you got the one you wanted, you now have a 4/8 chance of getting another one you want, and a 4/8 chance of not doing so.

*draws tree diagram up to three Octonauts*

So the number of octonauts at each node of this tree is a binomial distribution ( minus 1). That is, after one octonaut, the nodes are 'have 1 wanted octonaut' or 'have 0 wanted octonauts' And after two octonauts, the nodes are 2, 1, 1, 0. And after 3 it's 3,2,2,1,2,1,1,0 = 1*3,3*2,3*1,1*0.

Oh, this is coin tosses! Every time you pull the octanaut lever it's like tossing a coin, and heads is 'octonaut I want' and tails is 'octonaut I didn't want'. EXCEPT it's a biased coin (the first time you have 5/8ths chance of wanted Octonaut, rather than 1/2) and EXCEPT the bias of the coin changes based on what you got... because if you already have one, you have 4/8th chance of a wanted Octonaut, and if you already have two you have a 3/8ths chance... So, err, not much like coin tosses.

So it doesn't matter how you got to the 'I have two octonauts I want' node, any future path from their is the same. But there are lots of different ways of getting to (eg) the 'I have two octonauts I want' node and their probabilities aren't the same - eg after 3 Octonauts, 1,1,0 is more likely (P = 0.20ish) than 0,1,1 (P = 0.12 ish). Which makes sense, because you are more likely to get an Octonaut you want when you don't have any octonauts you want at all.

Hmm. I could solve it by drawing a Huge Diagram, but I'd get something wrong. My brain is saying 'simulation!' but a) that's not Real Maths, and b) I don't know how to do it in a quick and dirty way. Not in Excel, I guess... ;-)

Any help?

ETA: I _think_ I've solved this now, although in a very cludgy way!

See comments, especially

http://atreic.livejournal.com/506283.html?thread=7797931#t7797931

Tesco have an offer on Octonauts figures at the moment (that's true, so if you were hoping to buy some, now you know). The parents of an adorable 2 year old want 5 out of the 8 characters (he owns 3 already). However, the figures are not listed separately, just as 'one supplied.'

How many should they order into store to have a good chance (say >90%) of getting the 5 they want out of the random selection that Tescos send? There's no issue with taking any surplus back, so they could order vast quantities... but that would seem a little absurd.

[I can see 3 similar problems. (a) the simple problem, where 'Tescos' is a box containing all 8 characters, once and once only (b) a finite problem, where Tescos is a finite but large box (say containing 64 octonauts, 8 of each) and (c) an infinite problem, where Tescos is an Octonaut producing machine that can produce as many Octonauts as are ordered and dispenses them at random. I think the actual problem is (c), but I'm quite interested in the solutions to (a) and (b) as well and how it makes the maths different...]

My vague thoughts...

The simple problem would be easy. If you only bought 7 octonauts, there'd be a 5/8 chance that the last octonaut was in the set 'octonauts I wanted' and a 3/8 chance it wasn't. So that's only a 37% chance of a happy toddler... so you buy all 8 and have 100% chance. (is that right? Have I done something wrong?)

I wondered if it's best tackled as 1 - the probability you don't get all the figures you want. I thought briefly this was 1 - (3/8)^n, where n is the number of figures you buy, but actually this is Wrong, because that is the probability you get at least one figure you want, not that you get all five. And you must have to do something a bit like permutations and combinations, because four Pesos is different to a Peso, a Kwazii, a Barnacles, and a Tweak, even if in both situations you have 'four things you wanted'

So the first octonaut comes out of the machine. You have a 5/8 chance it's one you want, and a 3/8 chance it isn't. Gah, I'm going to draw a huge probability tree, which is hard in text. If you got the one you wanted, you now have a 4/8 chance of getting another one you want, and a 4/8 chance of not doing so.

*draws tree diagram up to three Octonauts*

So the number of octonauts at each node of this tree is a binomial distribution ( minus 1). That is, after one octonaut, the nodes are 'have 1 wanted octonaut' or 'have 0 wanted octonauts' And after two octonauts, the nodes are 2, 1, 1, 0. And after 3 it's 3,2,2,1,2,1,1,0 = 1*3,3*2,3*1,1*0.

Oh, this is coin tosses! Every time you pull the octanaut lever it's like tossing a coin, and heads is 'octonaut I want' and tails is 'octonaut I didn't want'. EXCEPT it's a biased coin (the first time you have 5/8ths chance of wanted Octonaut, rather than 1/2) and EXCEPT the bias of the coin changes based on what you got... because if you already have one, you have 4/8th chance of a wanted Octonaut, and if you already have two you have a 3/8ths chance... So, err, not much like coin tosses.

So it doesn't matter how you got to the 'I have two octonauts I want' node, any future path from their is the same. But there are lots of different ways of getting to (eg) the 'I have two octonauts I want' node and their probabilities aren't the same - eg after 3 Octonauts, 1,1,0 is more likely (P = 0.20ish) than 0,1,1 (P = 0.12 ish). Which makes sense, because you are more likely to get an Octonaut you want when you don't have any octonauts you want at all.

Hmm. I could solve it by drawing a Huge Diagram, but I'd get something wrong. My brain is saying 'simulation!' but a) that's not Real Maths, and b) I don't know how to do it in a quick and dirty way. Not in Excel, I guess... ;-)

Any help?

ETA: I _think_ I've solved this now, although in a very cludgy way!

See comments, especially

http://atreic.livejournal.com/506283.

## (no subject)

atreic.livejournal.comat 12:13pm on 19/11/2013I wonder how many facebook friends you need so that you are more like than not to have a friend with a birthday on every single day of the year?

This was solved neatly by Ben:

Under (presumably) the same assumptions as above (uniform distribution of birthdays, ignoring leap years):

Define P(N,D) to be the probability that N people have exactly D different birthdays. Then we want to find the smallest N* such that P(N*,365)>0.5.

I'm thinking induction is the way to go. Suppose we have P(N-1,d) for all d, i.e. we know the chance of N-1 people having exactly d birthdays (for d = 1, 2, ... N-1).

Then to get P(N,D) there are two ways:

1) The N-1 people had D-1 different birthdays and the Nth person has a different birthday. The chance of this is (365-(D-1))/365 = (366-D)/365.

2) The N-1 people had D different birthdays and the Nth person has the same birthday as one of them. The chance of this is D/365.

So the recursion is

P(N,D) = [ P(N-1,D) * D + P(N-1,D-1)*(366-D) ]/365

with initial conditions P(1,1) = 1 and P(1,D) = 0 for D>1.

You can probably solve this with generating functions, but to be honest I don't care that much! A simple bit of code gives N* = 2287. This is pretty close to Vincent's answer. I wonder what happens if Vincent computes the median, and not the mean? Of course, since I said the code was simple, it's also quite possible that I made a mistake!

***

I think this is probably the solution for this... define P (N,X) as the probability after buying N Octonauts you have X Octonauts that you want? (I did this as P(N,O) with O for Octonaut initially, but it's _awful_ notation, because O looks like 0) Then we could have the recursion as P (N,X) = P(N-1,X) * P(didn't get an octonaut you want, given you already have X) + P (N-1,X-1)* P (did get an octonaut you want, given you already have X-1)

P(didn't get an octonaut you want, given you already have X) is easy, it's (X+3)/8 (so as soon as you have all 5, you never get an octonaut you want). Likewise P (did get an octonaut you want, given you already have X-1) is (6-X)/8

But I don't know how to actually do the recursion and calculate the answer...

Edited Date: 2013-11-19 12:16 pm (UTC)## (no subject)

atreic.livejournal.comat 12:17pm on 19/11/2013## (no subject)

atreic.livejournal.comat 12:22pm on 19/11/2013## (no subject)

atreic.livejournal.comat 01:44pm on 19/11/2013P(10,5) is a miserable 16% (although you'll have a 44% chance of having that oh-so-frustrating 4)

To actually get P(N,5) > 0.90, you need to order 30 Octonauts.

On the other hand, a pile of 30 Octonauts would be great :-)

## (no subject)

cartesiandaemon.livejournal.comat 02:45pm on 19/11/2013## (no subject)

sain-bano.livejournal.comat 02:51pm on 19/11/2013I couldn't quite bring myself to order in 30 sets of figures, so have ordered 10 and will report back on Thursday when I pick them up about what I ended up with. My disappointment at the dismal success rate predicted is improved slightly by the thought that once I've bought the figures at offer price, I can then order more in if needed and exchange even after the offer has finished. I'm sure all this would be easier if any of the Tescos in a 20 mile radius actually stocked the darned things.

## (no subject)

atreic.livejournal.comat 02:58pm on 19/11/2013knowing how lucky you were to manage it! :-)We have a big tescos with lots of toys that is our local, so if you do end up trying to hunt down Just One, let me know... I haven't specifically looked for Octonauts though.

## (no subject)

3c66b.livejournal.comat 01:50pm on 19/11/2013When you've bought 0 Octonauts, the expected number that you've got that you want is 3.

When you've bought 1, it's 3 + 5/8

When you've bought 2, it's 3 + 5/8 + (4 3/8)/8

When you've bought 3 ... and then your criterion for stopping is something like an expectation value of 7.2. Maybe.

## (no subject)

atreic.livejournal.comat 12:26pm on 19/11/2013It sounds like a variant on the coupon collector's problem: http://en.wikipedia.org/wiki/

## (no subject)

cartesiandaemon.livejournal.comat 01:02pm on 19/11/2013An alternative way of putting it would be to say "I'm trying to collect N coupons, and every box I order has a 5/8 chance of containing a coupon". I'm not sure if that would be any easier to solve.

## (no subject)

atreic.livejournal.comat 01:04pm on 19/11/2013## (no subject)

cartesiandaemon.livejournal.comat 01:09pm on 19/11/2013That's what I thought at first, but it seems statistics is full of things like the birthday paradox where it's not _that_ complicated, but "finding the proper name for it on wikipedia" is the only successful strategy to actually finding the solution, which is what someone did.

Addendum: Scratch my previous suggestion, all the formulas in wikipedia are expressed in terms of a sum from 1 to 8 of the time needed to get the N'th coupon, so you ought to be able to just drop the first few terms of the series. And it's short enough you might just be able to calculate a numerical value for 8... (Although the "how many to have a 90% chance of getting all" is a bit more complicated than "expect time to get all")

## (no subject)

simontat 09:34am on 20/11/2013If there's only one Octonaut you need, that's reasonably easy: ordering n Octonauts and finding that not one of them is the kind you need is equivalent to saying that all of them are chosen from the remaining 7 possibilities, so the chance of that happening is (7/8)

^{n}. Hence, getting at least one of the kind you're looking for will happen with probability (1 - (7/8)^{n}).And if you wanted

at least oneof the five kinds of Octonaut, the same reasoning would apply: you'll get at least one of them with probability (1 - (3/8)^{n}), because the only way youcan'tis if all n of them are one of the 3 kinds you've got already.(You've gone through that reasoning yourself in your original post, I know. I'm just going through it again myself as a warm-up exercise to get my head in gear.)

But if you want

allof the five kinds to show up at once, that's more fiddly. I suspect an Inclusion-Exclusion Principle exercise will be required.Start with probability 1 (all possibilities).

Subtract five lots of (7/8)

^{n}, indicating the five subsets of possibilities in which one or other of the Octonauts you want fail to show up.Now we've double-subtracted every situation in which

twodesired Octonauts fail to show up. There are (5 choose 2) = 10 of those and each one has probability (6/8)^{n}, so add 10 × (6/8)^{n}back in.But now we've got an extra copy of each case in which

threefail to show up, so now subtract again, and so on.So we end up with the probability of getting at least one of each of the five required Octonauts being

P = 1 - 5 × (7/8)

^{n}+ 10 × (6/8)^{n}- 10 × (5/8)^{n}+ 5 × (4/8)^{n}- 1 × (3/8)^{n}And that formula gives me 0.1616 for n=10, and 0.9107 for n=30, so it agrees with your Excel solution.

I suspect there's no nice formula to invert this (i.e. to give you n when you put in P), but it's an analytic solution to half of it at least.

## (no subject)

atreic.livejournal.comat 12:15pm on 20/11/2013"I was briefly worried that my calculation didn't _exactly_ match

Pete's numerical Perl solution, until I realised he'd done it in a

Monte Carlo way so some deviation from the true figures is to be

expected."

## (no subject)

atreic.livejournal.comat 12:30pm on 20/11/2013> and why it's not exact?

What I meant by a Monte Carlo approach is a random approach. Pete's

program actually modelled the problem as if for real - it chose a

bunch of Octonauts _at random_, checked whether they included one of

each of the five desired kinds, and then did that over and over again

in a loop and tracked what proportion of the trials were successful.

So it's inexact simply because it depends on randomness - if he'd been

really unlucky, his program might have just happened to get one of

every kind of Octonaut in every trial by sheer luck, or get none of

them on any occasion. The best you can say about the accuracy of the

Monte Carlo approach is that there's a _high probability_ of the

experimental result being near to the exact value, and it clusters

more closely the more trials you do.

(But its great virtue is that if you can live with that uncertainty in

the output, it's able to model problems of arbitrary complexity, long

after it becomes intractable to do the maths analytically or to

iterate over absolutely all possibilities!)

## (no subject)

atreic.livejournal.comat 12:31pm on 20/11/2013Oh, of course. He rolled lots of dice, and although that's a good way of

working out probability, it's not a great way, because you could just be

really lucky at rolling dice.

## (no subject)

pete stevensat 02:17pm on 20/11/2013But on the being lucky/unlucky. I ran 16k full trials. So the probability of coming out with the answer 5 is (5/8 . 4/8 . 3/8 . 2/8 . 1/8)^16384 which is 120 * 2^-245760 which is well below the Heisenberg limit of 6.6 * 10^-34 at which point you probably need to start worrying about the correct type of octonaut spontaneously popping into existence without having to visit Tescos.

Incidentally in 64k trails (65536) the worse results were 86, 83 and 81 purchases to get the full set.

## (no subject)

the-alchemist.livejournal.comat 12:45pm on 19/11/2013Also, this reminds me of a time when I was small and wanted charmkins, and my mother and aunt, going out shopping, said they would buy me one charmkin, and which did I want (Blossom), and (in case they didn't have her) which would be my second choice (Brown-eyed Susan)?

But as a Lovely Suprise, they in fact bought me both. And although I was grateful, I was also really sad, because why on earth did they assume that the Charmkin I wanted should Blossom not be available was the same as the Charmkin I would want to go alongside Blossom? In fact, I definitely wanted a humanoid Charmkin followed by a non-humanoid Charmkin, and having two humanoid Charmkins was (for reasons I cannot now remember) no better than having one humanoid Charmkin.

## (no subject)

atreic.livejournal.comat 01:05pm on 19/11/2013Also, I had never heard of Charmkins, and have just looked them up and they're so cute!

## (no subject)

aldabraat 12:46pm on 19/11/2013What's the deadline? Christmas? If there's time to iterate I'd order in a preliminary eight and see what I got. And then investigate eBay for low-faff swapping.

Although, last time I counted we had 17 Tescoi in Cambridge. What I'd actually do is cycle to the largest half dozen in order of accessibility and see what they had on the shelves, first.

## (no subject)

geekette8.livejournal.comat 12:56pm on 19/11/2013## (no subject)

atreic.livejournal.comat 01:06pm on 19/11/2013_17_ Tescoi? Wow.

## (no subject)

naath.livejournal.comat 01:11pm on 19/11/2013## (no subject)

atreic.livejournal.comat 01:31pm on 19/11/2013## (no subject)

aldabraat 03:59pm on 19/11/2013## (no subject)

See: http://tfwiki.net/wiki/Shortpacking

## (no subject)

pete stevensat 10:36pm on 19/11/2013source,

#!/usr/bin/perl

my $tests = 16384;

my @trials;

for (my $i = 0; $i < $tests; $i++) {

my $trials = 0;

my @oct = ( 1,1,1,0,0,0,0,0);

while ((eval join '+', @oct) != 8) {

$oct[int(rand() * 8)] = 1; $trials++;

}

$trials[$trials]++;

}

my $c = 0;

for (my $i = 0; $i < 50; $i++) {

if (defined($trials[$i])) { $c += $trials[$i]/$tests; }

print "$i : " . $trials[$i] . "(" . $trials[$i]/$tests . ") - " . $c . "\n";

}

Cumulative probability distribution

# ./oct.pl

0 : (0) - 0

1 : (0) - 0

2 : (0) - 0

3 : (0) - 0

4 : (0) - 0

5 : 47(0.00286865234375) - 0.00286865234375

6 : 180(0.010986328125) - 0.01385498046875

7 : 350(0.0213623046875) - 0.03521728515625

8 : 545(0.03326416015625) - 0.0684814453125

9 : 729(0.04449462890625) - 0.11297607421875

10 : 840(0.05126953125) - 0.16424560546875

11 : 891(0.05438232421875) - 0.2186279296875

12 : 914(0.0557861328125) - 0.2744140625

13 : 952(0.05810546875) - 0.33251953125

14 : 929(0.05670166015625) - 0.38922119140625

15 : 913(0.05572509765625) - 0.4449462890625

16 : 870(0.0531005859375) - 0.498046875

17 : 791(0.04827880859375) - 0.54632568359375

18 : 800(0.048828125) - 0.59515380859375

19 : 744(0.04541015625) - 0.64056396484375

20 : 642(0.0391845703125) - 0.67974853515625

21 : 633(0.03863525390625) - 0.7183837890625

22 : 521(0.03179931640625) - 0.75018310546875

23 : 459(0.02801513671875) - 0.7781982421875

24 : 387(0.02362060546875) - 0.80181884765625

25 : 370(0.0225830078125) - 0.82440185546875

26 : 361(0.02203369140625) - 0.846435546875

27 : 331(0.02020263671875) - 0.86663818359375

28 : 263(0.01605224609375) - 0.8826904296875

29 : 228(0.013916015625) - 0.8966064453125

30 : 197(0.01202392578125) - 0.90863037109375

31 : 187(0.01141357421875) - 0.9200439453125

32 : 147(0.00897216796875) - 0.92901611328125

33 : 165(0.01007080078125) - 0.9390869140625

34 : 102(0.0062255859375) - 0.9453125

35 : 98(0.0059814453125) - 0.9512939453125

36 : 102(0.0062255859375) - 0.95751953125

37 : 76(0.004638671875) - 0.962158203125

38 : 85(0.00518798828125) - 0.96734619140625

39 : 72(0.00439453125) - 0.97174072265625

40 : 60(0.003662109375) - 0.97540283203125

41 : 53(0.00323486328125) - 0.9786376953125

42 : 41(0.00250244140625) - 0.98114013671875

43 : 47(0.00286865234375) - 0.9840087890625

44 : 33(0.00201416015625) - 0.98602294921875

45 : 30(0.0018310546875) - 0.98785400390625

46 : 25(0.00152587890625) - 0.9893798828125

47 : 25(0.00152587890625) - 0.99090576171875

48 : 24(0.00146484375) - 0.99237060546875

49 : 23(0.00140380859375) - 0.9937744140625

## (no subject)

pete stevensat 10:37pm on 19/11/2013## (no subject)

sain-bano.livejournal.comat 03:37am on 22/11/2013(I wanted Dashi, Shellington, Inkling, Tweak and Tunip. I got 4 Kwazii's, 3 Barnacles, Peso, Shellington and Inkling)

## (no subject)

atreic.livejournal.comat 08:01am on 22/11/2013[Also, I wonder if we should adjust our model given that both narratively and based on the data we have, we expect many more Kwazii's and Barnacles...]

Good luck! Do you want me to have a snoop in bar hill tesocs? I'm there most weekends, but maybe posting a Tweak is more faff than it's worth...

## (no subject)

sain-bano.livejournal.comat 03:20pm on 22/11/20137 Tweaks, 3 Tunips, 3 Pesos, 2 Kwaziis, 2 Dashis, 2 Shellingtons and an Inkling

Interestingly, it doesn't lend more weight to the hypothesis about there being more Barnacles and Kwaziis.

## (no subject)

atreic.livejournal.comat 03:22pm on 22/11/2013I needed some good news, now I'm grinning :-D

## (no subject)

samholloway.livejournal.comat 06:41pm on 22/11/2013