What do you mean you're out of cupcakes?!
A look at the "double marginalization" effect and how contracts can be used to mitigate its impact on profits and consumers.
This week I’m bringing back the Squareholder Value cupcake stand. The goal is to see how introducing a partner in the supply chain impacts operations management decisions. Before I start, let me show you what ChatGPT thinks a “delicious cupcake” looks like. I’m not saying I wouldn’t eat it, but it’s interesting that this is its archetype of a cupcake. Also, take the stem off the strawberry at least. And what’s keeping that orange sprinkle from falling off the cake?
Anyway, let me remind you of the economics of the cupcake stand. The baker sells cupcakes for $3 a pop. Each cupcake costs $1 to make. Daily cupcake demand is normally distributed with mean 100 and standard deviation 20. At the end of each day, the baker must throw out any unsold cupcakes. Under these assumptions, the newsvendor model says that the baker should bake ~109 cupcakes each night to maximize expected profit. The reason you bake more cupcakes than the number you expect to sell is that the profit from each cupcake is greater than the cost of baking an unsold cupcake. Therefore, it makes sense to err on the side of overproducing.
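If you'd rather check the ~109 figure in code than in a spreadsheet, here is a minimal Python sketch of the newsvendor calculation. It assumes Python 3.8+ (for `statistics.NormalDist`), and the function name `newsvendor_quantity` is my own label, not something from the original cupcake article.

```python
from statistics import NormalDist


def newsvendor_quantity(cost_underage, cost_overage, mean, sd):
    """Return (critical fractile, optimal quantity) for normally distributed demand."""
    fractile = cost_underage / (cost_underage + cost_overage)
    quantity = NormalDist(mean, sd).inv_cdf(fractile)
    return fractile, quantity


# Baker selling directly: a missed sale costs the $3 - $1 = $2 margin (underage),
# and an unsold cupcake costs the $1 it took to bake (overage).
fractile, quantity = newsvendor_quantity(
    cost_underage=2.0, cost_overage=1.0, mean=100, sd=20
)
print(f"critical fractile = {fractile:.3f}, bake about {quantity:.1f} cupcakes")
```

The call to `inv_cdf` plays the same role as NORM.INV in a spreadsheet.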
Bringing on a distributor
Now suppose that the baker doesn’t feel like sitting at the stand all day slinging cupcakes. Instead, they decide to sell the cupcakes through a distributor. The cupcakes still retail for $3. (The customers don’t care who sells them.) The trade-off is that the baker will need to sell the cupcakes to the distributor for less than $3 to allow the distributor to make a profit. Let’s say that the distributor agrees to buy the cupcakes upfront for $2 each, which would give them a profit of p = ($3 - $2) = $1 per cupcake. In this scenario, the baker would be happy to bake as many cupcakes as the distributor would buy. Indeed, the baker makes ($2 - $1) = $1 per cupcake, and there is no risk of overproducing because unsold cupcakes are the distributor’s problem.
What about the distributor? As stated above, their profit is $1 per cupcake. This is also the cost of underage (Cu), that is, the money they miss out on for each cupcake that a customer wants to buy after they sell out. On the other hand, the cost of overage (Co), the money they lose per unsold cupcake, is $2, the price at which they purchase the cupcakes from the baker. Per the newsvendor model, the number of cupcakes the distributor should buy is determined by the critical fractile: Cu / (Cu + Co) = 1 / (1 + 2) = 1/3.
This means that the distributor maximizes expected profit by purchasing cupcakes at the ~33rd percentile of demand. This is given by NORM.INV(1/3, 100, 20) = 91.4 cupcakes. (Note that this is less than the mean daily demand of 100 cupcakes.) Notice what happens when we introduce a channel partner into this model. Previously, the baker maximized expected profit by baking 109 cupcakes. At that level of output, expected profit is ~$183 (see the table from the first cupcake article). Now, with the distributor pre-purchasing the cupcakes, only 91 cupcakes are baked. This means that the profit to the baker is $91 ($1 per cupcake). You can show (see the appendix) that the expected profit to the distributor is ~$78. This means that the total expected profit to the system is $91 + $78 = $169, which is $14 less than before.
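The distributor's side of this can be sketched in Python as well. This is a hedged illustration: the expected-sales step uses the standard closed form for normal demand (the "normal loss function"), which is my substitution for the discretized spreadsheet described in the appendix, and all variable names are hypothetical.

```python
from statistics import NormalDist

demand = NormalDist(mu=100, sigma=20)
retail, wholesale, bake_cost = 3.0, 2.0, 1.0

# Distributor's newsvendor problem: underage = retail - wholesale, overage = wholesale.
fractile = (retail - wholesale) / ((retail - wholesale) + wholesale)
order_qty = demand.inv_cdf(fractile)  # same role as NORM.INV(1/3, 100, 20)
q = round(order_qty)                  # whole cupcakes actually purchased

# Expected unmet demand E[(D - q)+] for normal demand (assumption of this sketch:
# closed form via the standard normal pdf and cdf, not the article's spreadsheet).
z = (q - demand.mean) / demand.stdev
std_normal = NormalDist()
expected_shortfall = demand.stdev * (std_normal.pdf(z) - z * (1 - std_normal.cdf(z)))
expected_sales = demand.mean - expected_shortfall

baker_profit = (wholesale - bake_cost) * q                    # certain: sold upfront
distributor_profit = retail * expected_sales - wholesale * q  # expected value

print(f"order quantity: {order_qty:.1f} (rounded to {q})")
print(f"baker profit: ${baker_profit:.0f}")
print(f"distributor expected profit: ${distributor_profit:.2f}")
print(f"system total: ${baker_profit + distributor_profit:.2f}")
```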
This phenomenon is known as channel distortion due to double marginalization (sometimes just the “double marginalization” effect). Whenever you have two firms at different levels of the supply chain, each will want to maximize their profits. In our case, this reduced total profits and negatively impacted consumer welfare; there are 18 fewer cupcakes available for purchase. For a different good or service, you can imagine how this might be a legitimate problem.
So what can we do about it?
The double marginalization problem is really about misaligned incentives. Because the distributor stands to make less off each cupcake than the baker would alone, they were incentivized to stock fewer cupcakes than would be optimal for the system. It’s important to note that the distributor calls the shots: the baker can’t force them to buy more than they want to. The root of the problem is that the distributor needs to buy the cupcakes up-front and marked-up, thereby increasing their overage cost. Therefore, the solution is to use some sort of contract to reduce the distributor’s cost of overage.
One option is to institute a buy-back contract. Under this arrangement, the baker agrees to buy back any unsold cupcakes at a pre-arranged price. This reduces risk to the distributor because they can return unsold cupcakes (albeit for less than they paid for them). This will certainly entice the distributor to buy more cupcakes. Exactly how many more they’ll buy depends on the buy-back price b. For example, to reach the quantity that maximizes total system profit (~109 cupcakes), b should be set so that the critical fractile is 2/3, as it was before we introduced the channel partner. The cost of underage Cu is still $1 (lost profit on a cupcake). This time, the cost of overage Co is $2, the cost of the cupcake to the distributor, less b, the buy-back price. Solving for b leads to a buy-back price of $1.50: Cu / (Cu + Co) = 1 / (1 + (2 - b)) = 2/3, so 3 - b = 1.5 and b = $1.50.
This would give the system the expected profit it wants. However, the baker wouldn’t be stoked about this arrangement. Remember that the buy-back shifts overage costs from the distributor to the manufacturer (i.e., the baker). The baker’s cost of overage is exactly the buy-back price of $1.50. The cost of underage is the baker’s profit on one cupcake, i.e., $1. This leads to a critical fractile of 1 / (1 + 1.5) = 0.4, meaning that the baker would want to produce fewer than 100 cupcakes (about 95).
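Instead, a better buy-back price would be one that aligns the critical fractile (i.e., the optimal quantity of cupcakes) for both parties. Setting the distributor’s fractile equal to the baker’s, 1 / (1 + (2 - b)) = 1 / (1 + b), gives 2 - b = b, so b = $1.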
With a buy-back price of $1, the critical fractile for both the baker and the distributor is 1/2, which means that the distributor would buy exactly 100 cupcakes (the average daily demand). This is still less than the system wants (109), but it does mitigate some of the channel distortion.
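To see how the buy-back price pulls the two parties in opposite directions, here is a small Python sketch that evaluates both critical fractiles as framed above. The function name `buyback_fractiles` and the `results` dictionary are illustrative choices of mine, and the baker's fractile uses the same simplification as the text (overage cost equal to the buy-back price b).

```python
from statistics import NormalDist

demand = NormalDist(mu=100, sigma=20)
RETAIL, WHOLESALE, BAKE_COST = 3.0, 2.0, 1.0


def buyback_fractiles(b):
    """Return (distributor fractile, baker fractile) for buy-back price b."""
    # Distributor: underage = retail margin, overage = wholesale price less refund.
    dist_cu = RETAIL - WHOLESALE
    dist_co = WHOLESALE - b
    # Baker (as framed in the article): underage = wholesale margin, overage = b.
    baker_cu = WHOLESALE - BAKE_COST
    baker_co = b
    return dist_cu / (dist_cu + dist_co), baker_cu / (baker_cu + baker_co)


results = {}
for b in (1.50, 1.00):
    dist_frac, baker_frac = buyback_fractiles(b)
    results[b] = (demand.inv_cdf(dist_frac), demand.inv_cdf(baker_frac))
    print(
        f"b = ${b:.2f}: distributor wants {results[b][0]:.1f} cupcakes "
        f"(fractile {dist_frac:.3f}), baker wants {results[b][1]:.1f} "
        f"(fractile {baker_frac:.3f})"
    )
```

At b = $1.50 the two preferred quantities sit on opposite sides of 100; at b = $1.00 they coincide.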
The other type of contract that the baker and distributor could employ is a revenue-sharing agreement. As the name suggests, this is where the distributor shares a percentage of cupcake revenue with the baker. In exchange, the baker reduces the wholesale price of the cupcakes to a level much closer to the cupcake cost. For concreteness, let’s say that the baker now sells the cupcakes to the distributor for $1.05 each, which gives them a margin of $1.05 - $1 = $0.05 per cupcake. While this sounds like a small number, it does eliminate the downside risk for the baker by getting rid of the buy-back provision. Also, they will earn a percentage of sales, say r, from the distributor. As before, we can use the newsvendor model to inform what r should be. For the baker, the cost of overage is 0, so they will make as many cupcakes as the distributor is willing to buy. For the distributor, the cost of underage is the profit on each cupcake, which is (1 - r)*($3) - $1.05. The cost of overage is $1.05, the cost of each cupcake. This yields a critical fractile of ((1 - r)*$3 - $1.05) / ((1 - r)*$3).
There’s no value of r > 0 that makes this expression 2/3.[1] However, you can get pretty close if r is small enough. For example, if r = 10%, then the critical fractile is 0.611, which means the distributor would buy about 106 cupcakes, just three shy of the system-optimal quantity. How would profit break out in that case? The baker would make a nickel on the sale to the distributor, plus $0.30 for the revenue-sharing portion, for a total of $0.35 per cupcake sold. This means that the distributor makes $1.65. In other words, the baker needs to give up most of their profits in order to shed all of the risk.
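The revenue-sharing numbers can be reproduced with a short Python sketch. It assumes the $1.05 wholesale price from the example, and `revenue_share_fractile` is a hypothetical helper name. Trying a few values of r shows the fractile topping out at 0.65 when r = 0, which is why 2/3 is out of reach.

```python
from statistics import NormalDist

demand = NormalDist(mu=100, sigma=20)
RETAIL, WHOLESALE = 3.0, 1.05


def revenue_share_fractile(r):
    """Distributor's critical fractile when a share r of retail revenue goes to the baker."""
    underage = (1 - r) * RETAIL - WHOLESALE  # distributor's margin per cupcake sold
    overage = WHOLESALE                      # sunk cost of each unsold cupcake
    return underage / (underage + overage)


for r in (0.0, 0.05, 0.10, 0.20):
    frac = revenue_share_fractile(r)
    qty = demand.inv_cdf(frac)
    baker_per_sale = (WHOLESALE - 1.0) + r * RETAIL
    distributor_per_sale = (1 - r) * RETAIL - WHOLESALE
    print(
        f"r = {r:.0%}: fractile {frac:.3f}, order about {qty:.0f} cupcakes, "
        f"baker ${baker_per_sale:.2f} / distributor ${distributor_per_sale:.2f} per cupcake sold"
    )
```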
If you’re the baker, how do you decide which type of contract to pursue? You should know by now that I don’t do recommendations. However, here are some things to think about.
How much downside risk can you tolerate? With a buy-back contract, the baker is on the hook for unsold cupcakes. Conversely, under a revenue sharing contract, the distributor is responsible for unsold cupcakes.
What type of profit margins do you require? While each contract type has parameters, it appears that the revenue sharing model is generally less profitable for the baker. This is because they have to knock so much off the wholesale price.
Are you able to track the distributor’s sales? Under a revenue-sharing model, you make a tiny profit by selling the distributor the cupcakes. Much more of your profit is going to come from the share of retail sales. Therefore, it’s crucial that you can monitor retail sales to ensure you get what you’re owed.
These are just a few things to think about if you’re considering bringing in a partner to help sell your product. I hope you enjoyed reading. Please subscribe and refer your friends to Squareholder Value.
Appendix: Calculating the distributor’s expected profit
To calculate expected profit for the distributor, we first need to define the profit function based on a quantity of 91 cupcakes purchased. Profit depends on demand X, which is random: Profit(X) = $91 if X ≥ 91, and Profit(X) = X - 2(91 - X) = 3X - 182 if X < 91.
(If demand is at least 91, then the distributor makes $1 on each of 91 cupcakes. If demand X is less than 91, then they make $1 on the first X cupcakes and lose $2 on the remainder that they couldn’t sell.) To get expected profit, you integrate the profit function against the PDF of the normal distribution. Alternatively, you can use a discrete approximation to the normal distribution and then compute the average of profit weighted by the associated probabilities. This makes sense because you only bake an integer number of cupcakes, whereas the normal distribution is continuous. Using a discretized version of the normal distribution, I computed an expected profit of ~$78 for the distributor if they purchase 91 cupcakes (see the picture below). Note that this picture also verifies that 91 cupcakes is optimal, because profit goes down if the distributor buys more or less than that.

If you’re wondering, the “exact” answer given by the integral was 78.18, so the preceding approximation is more than adequate. The discretized version of the normal distribution is given in column B. The function NORM.DIST has four arguments: the first is the x-value for which you want the probability. The second and third are the mean and standard deviation, respectively. The fourth argument is 1, which denotes “cumulative.” In other words, we’re taking the area to the left of the value in the first argument. For the demand level of X = 256 cupcakes (row 260), I inputted the demand from row 261 (257 cupcakes) minus 0.5, which equals 256.5. I then subtracted the area to the left of 255.5. The result is the probability of demand falling between 255.5 and 256.5 cupcakes, which is the continuous way of expressing the probability that 256 cupcakes are demanded.
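For readers without the spreadsheet, the same discretized calculation can be sketched in Python. This is an approximation of the approach described above, not a copy of the actual workbook: the cutoff `MAX_DEMAND = 300` and the search range of 60 to 130 cupcakes are assumptions of mine, and `expected_profit` is a hypothetical helper.

```python
from statistics import NormalDist

demand = NormalDist(mu=100, sigma=20)
RETAIL, WHOLESALE = 3.0, 2.0
MAX_DEMAND = 300  # assumed cutoff; the probability mass beyond it is negligible

# P(X = k) is approximated by the normal area between k - 0.5 and k + 0.5,
# mirroring the difference of two cumulative NORM.DIST values.
prob = [demand.cdf(k + 0.5) - demand.cdf(k - 0.5) for k in range(MAX_DEMAND + 1)]


def expected_profit(q):
    """Probability-weighted distributor profit when q cupcakes are purchased."""
    return sum(
        p * (RETAIL * min(k, q) - WHOLESALE * q) for k, p in enumerate(prob)
    )


# Search a range of whole-cupcake order sizes for the most profitable one.
best_q = max(range(60, 131), key=expected_profit)

print(f"expected profit at 91 cupcakes: ${expected_profit(91):.2f}")
print(f"most profitable order size in range: {best_q}")
```

Scanning whole-number order sizes this way also confirms the observation that expected profit falls off on either side of 91.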
[1] It would be possible to match the original output level if there were a salvage price, that is, a deeply discounted price at which you could get rid of unsold cupcakes. Otherwise, unless the baker sold the cupcakes below cost (i.e., <$1), there is no value of r that could entice the distributor to buy as many cupcakes as the baker would bake without the partnership.
Interesting newsletter.
I am a new subscriber. I am looking forward to reading your future articles and catching up on your past offerings.
In the cupcake scenario I see another solution that may be beneficial. Since the pain points for the baker and distributor are 9 cupcakes either side of the optimal 100 cupcakes, suppose the baker offers a discount only on the cupcakes above the distributor's pain point. The first 91 cupcakes are $2 each and each additional cupcake is $1.50 so the risk is more evenly distributed. I doubt my back of the napkin calculations are correct but, does the concept make sense to you?