Would you like fries with that?
A look at the coupon collector problem: a useful model that puts some math behind McDonald's Monopoly promotion.
Do you remember the Monopoly promotion that McDonald’s used to run? In a nutshell, certain menu items—usually of the supersized variety—came with a sticker that you could peel off to reveal a property from the board game Monopoly. Once you collected all of the properties of the same color, you could redeem those stickers for a prize.1 The prizes were better, and the stickers rarer, for the more expensive properties like Boardwalk and Park Place.
According to Wikipedia, McDonald’s Monopoly promotion first appeared in 1987. It’s still going strong in parts of the world, so it must make the company a lot of money.2 McDonald’s makes money by drawing in more customers and (especially) by upselling: you get more stickers for pricier items. That’s great, but you have to consider the cost of the promotion too. Each time it runs this promotion, McDonald’s gives away hundreds of millions of stickers, which equates to lots of money in giveaways. It’s probably relatively easy to estimate the cost of instant prizes, like free food. But what about the prizes for accumulating properties? This is trickier because it involves one customer visiting McDonald’s multiple times. If a given customer has acquired Indiana Ave (red) and North Carolina Ave (green), then there is no risk to McDonald’s of them finding Park Place (blue). However, if that same customer has Boardwalk (blue), then McDonald’s is exposed.

The coupon collector problem
I want to look at the math of this problem. To simplify things, let’s assume that a person wins a random ticket each time they visit a store. There are four tickets: red, green, blue, and yellow. Each ticket is equally likely to be picked on each visit to the store. When a customer receives all four tickets, they can redeem them for the grand prize. The store offers this promotion to get people to shop more. The operative question is therefore:
How many times does someone need to visit the store before they win the grand prize?
This question is important for two reasons. First, the number of visits needed can help the vendor price the grand prize. For example, if each visit creates $10 of profit for the company, and someone wins the prize after visiting six times, then you don’t want to give away $10,000 for collecting four tickets. Second, once a person acquires all four tickets, presumably their spending will drop off again. Knowing how many visits it will take to earn all four tickets therefore gives you an idea of how much additional profit you can generate.

Because the ticket you receive at each visit is random, we can’t know for sure how many visits it will take to collect them all. It could be as few as four, if you get a different ticket each time. On the other hand, it’s also possible that you keep getting blue tickets and never find the other three. What we can do is compute the expected number of visits needed to collect all the tickets. This problem turns out to be trickier than it sounds.
Let’s start by asking how many times you need to visit the store to get a blue ticket. On any given visit, the probability is 1/4 that you get a blue ticket, and 3/4 that you don’t. If X describes how many visits are needed to get a blue ticket, then P(X = 1) = 1/4. To get a blue ticket on your second ticket means that you didn’t get one on the first visit, so P(X = 2) = (3/4)*(1/4). We can multiply probabilities like this because each draw is independent of the next one. Getting a blue ticket on your third attempt means that you didn’t get one on the first two, so P(X = 3) = (3/4)*(3/4)*(1/4). In general, if the probability of getting a blue ticket is p, then the probability of needing n visits to get the first ticket is
This is the probability mass function for the so-called geometric distribution. By definition, the expected value (i.e., mean) E[X] is the sum of n*P(X = n) over all possible values of n. Unfortunately, this leads to an infinite series for the geometric distribution:
This almost looks like the geometric series that I talked about in my mortgage payment post. However, the n in front complicates things. It turns out that you can sum this series, but I prefer the following slicker argument for finding the expected value. On your first visit to the store, you either get a blue ticket (with probability p) or you don’t (with probability 1 - p). In the former case, it took you n = 1 visits to get a blue ticket. In the latter case, the expected number of visits to get a blue ticket is one more than the expected number of visits needed to get a blue ticket from the onset. Why? Because each visit to the store is independent of the others, so the first visit has no impact on which tickets you’ll see in future visits. If you think of expected value as a weighted average, then you should believe the law of total expectation, which says that the expected number of visits needed is equal to the weighted average of these two cases (weighted by the probability of each case). This tells us that
which can be solved for E[X] to get E[X] = 1/p. In our case, p = 1/4, so E[X] = 4. The same argument tells you that you expect to have to flip a coin twice to get heads once or roll a die six times to get one three.
Blue is just a name, so the expected number of visits to get one red ticket (or green ticket or yellow ticket) is also four. Does that mean that the expected number of visits to get one of each is 4 + 4 + 4 + 4 = 16? Not quite. It should be less than that. If you visit the store four times to get a blue, then you bagged three additional tickets, which means that you wouldn’t need to make those four expected visits for at least one of the other colors. This is the tricky part. In statistics class, you learn that expectation is linear, meaning that if X and Y are two random variables, then E[X + Y] = E[X] + E[Y]. (This extends to four variables to cover our example.) Why doesn’t that apply to the ticket problem then? The knee-jerk reaction3 is to say that X and Y are dependent, so the formula breaks down. However, that is false. E[X + Y] = E[X] + E[Y] holds whether X and Y are independent or not. The real breakdown is that the distribution we want—the number of visits before you get one of each color ticket—is not the same as the sum of four copies of the distribution for visits until you get one copy of a particular color.
So where do we go from here? The strategy of decomposing the experiment into smaller pieces is actually correct; we just need to pick better pieces. Instead of considering each color separately, let’s consider the number of visits until you get your first, second, third, and fourth new colors. More precisely, let X_i be the number of visits needed to collect the ith color ticket after you have collected (i - 1) unique colors. If we define X to be the number of visits to get all four colors, then
Now we need to know the probability of getting a unique color in each case. On your first visit, it is 100%: all four colors would be new. On the second visit, three of the four colors (=75%) are new. After you have two colors, two of the four (=50%) remaining tickets are a new color. Finally, once you have three of the four colors, only one of the four (=25%) of colors would be new. Note that each X_i has a geometric distribution. As shown above, the expected number of visits to collect a new coupon is given by the inverse of the probability. It follows that
This problem is called the “coupon collector’s” problem. In addition to McDonald’s promotion (which, by the way, is more complicated due to uneven probabilities of the game pieces), it also has applications in fault detection in electrical engineering, marketing (e.g., customer acquisition), mathematical optimization, and more.
What if there are lots of tickets?
I have a confession to make. This week, I looked for a business application to justify showing you math I wanted to show you. The real inspiration for this week’s post is a gift that my three-year-old got from his aunt and uncle: a 20-sided die. (What can I say? The kid’s really into numbers.) We were rolling the die for a long time and recording the results. This got me thinking about how long it would take to roll all 20 numbers. The answer is what you would expect based on the previous example:
The one difference is that this time it is tedious to add up all the numbers. Of course, you could do it with a calculator. But I don’t have to explain to you, dear reader, why that’s unsatisfying. So, is there a formula that gives the solution to the coupon collector problem with n equally likely color tickets? Not exactly. But there is a very close approximation. It’s based on the observation that the sum 1 + 1/2 + … + 1/n is a Riemann sum for the integral of the function 1/x (cf., picture below).

The antiderivative of 1/x is ln(x), the natural logarithm. This helps us relate the relevant sum to the log function. Since 1/x is a decreasing function, a left Riemann sum will overestimate in the area under the curve. (Notice that the yellow rectangles contain the area under the curve and then some.) Conversely, a right Riemann sum will underestimate the integral. If you write out the integral and use these bounds, you can show that, for any n,
ln(n + 1) and ln(n) are basically the same for large enough n, so this inequality tells us that the sum in the middle is within 1 of ln(n). That would probably be good enough, except that we have to multiply the series by n in order to get the expected value in the coupon collector solution. (Check out the 20 multiplying the series in the beginning of this section.) Doing so would amplify that error. It’s too involved for this post, but a little more calculus shows that
where γ = .577… is called the Euler-Mascheroni constant. Since “stuff” grows more slowly than n, we can now safely write the solution to the n-ticket coupon problem as
Here are a few values of both the actual series sum E[X] and the log approximation. As you can see, the convergence is very fast.
In particular, for the 20-sided (a.k.a., icosahedral) die, about 72 rolls are needed, on average, to roll each number at least once. On the day in question, we hit each number several times. (Just to give you an idea of how long we played that game.)
As always, thank you for reading. Referrals are the best way to grow my audience, so please share if you enjoy my posts!
They also had “instant win” tickets, that you could redeem on the spot for a food item. Those are less interesting for this post, though.
I think it was discontinued in the U.S. due to a massive fraud, wherein executives from the company McDonald’s hired to administer the promotion stole the rarest prizes for themselves. Can’t make this stuff up…
Incidentally, it’s also the explanation that ChatGPT gave for why the answer is not 16. Trust but verify!