4

My friend and I were arguing for way too long the other night about how much it would cost you to buy every single thing in a grocery store. Our first go at it went something like this:

Assume there are $N_{\text{items}}$ items per row in the grocery store, and let $p_{\text{avg}}$ be the average price for each item. Then say that there are $N_{\text{rows}}$ rows. Multiplying this out we get a total price $P_{\text{total}}$ as

$$ P_{\text{total}} = N_{\text{items}}p_{\text{avg}}N_{\text{rows}}$$

The only issue is that there is a vast range of different prices for items, and the number of items per row varies widely depending on which row you're in. For instance, the spice aisle has a ton of items at very low cost, while the coffee aisle has a lot of items at very high cost; the meat aisle has a fairly average number of items at a much higher cost, as does the kitchen-utensils/kitchenware aisle, and so on.

This got me thinking that there must be a better way to do an accurate estimation for a problem like this. Perhaps come up with some sort of intelligent distribution for prices (I was thinking maybe a log-normal distribution with a peak around some arbitrary "most-probable" price, based on observation). And possibly do the same thing with the number of items per row? Estimating $N_{\text{rows}}$ is relatively straightforward, since most grocery stores have somewhere between ten and twenty rows, so letting $N_{\text{rows}}$ be Gaussian centered at ten should take care of that, if we even want to get that fancy with that variable.

Anyway, I'm not that savvy with probability/statistics in the first place, so I thought I would ask you brilliant people: how would you most intelligently take a stab at this estimation?

  • 0
    The trouble with a log-normal distribution is that it would place the mean greater than the median, which is not an accurate model. The converse will be true because the store will have a few expensive items, which are likely to be bigger than the cheaper items and stored in lower quantities. 2017-02-08

6 Answers

2

Your question is what's known as a Fermi problem: a question about estimating a large or unwieldy number, such that answering it calls for some rough estimates and back-of-the-envelope calculations. I think that your attempt, and dxiv's and Producer of BS's responses, are all pretty good (although theirs would require more legwork than yours). I'll give this a shot too, with an approach similar to yours:

I want to pick a single number to be the average value of all the prices in the store. Now if the actual values of the prices follow Benford's Law as Joce suggests, then we could pick the mean of a random variable obeying Benford's Law, which is $3.44$. But this only works cleanly if you're considering items priced between \$1.00 and \$9.99. There are items priced lower and higher than that. I'm willing to ignore the lower-priced items because they won't matter as much, but there are enough items priced above \$10, like meats, kitchen utensils, and saffron, that I don't want to ignore those. So for the sake of not thinking too much harder about this, let's just bump that \$3.44 estimate up to an even \$4 to account for the expensive items. In the end, this small adjustment won't matter much compared with the other estimates that have to be made.
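The $3.44$ figure can be checked directly. The following is a minimal sketch, assuming the quantity meant is the mean leading digit under Benford's Law, where digit $d$ occurs with probability $\log_{10}(1 + 1/d)$; the variable names are illustrative only.

```python
import math

# Benford's Law: P(leading digit = d) = log10(1 + 1/d) for d = 1..9
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# The probabilities telescope to log10(10) = 1, so this is a valid distribution.
total_probability = sum(benford.values())

# Mean leading digit: the number used above as a rough "average price".
mean_digit = sum(d * p for d, p in benford.items())

print(round(total_probability, 6))
print(round(mean_digit, 2))
```

Note that this is the mean of the leading digit only, which is why it says nothing about items priced below \$1 or above \$10.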

I want to say there are twenty rows in a typical store. There are really closer to fifteen, but you've got to account for those outer areas that aren't rows, like the butcher counter and bakery and produce and floral too.

Now I want to break down the number of items in each row more precisely. Each row (besides maybe the outer two) has two sides, is about 70 feet long, has an average of 10 shelves per side, and each shelf is about 2 feet deep. Now we just need an estimate for the number of items on a typical square foot of a single shelf. Thinking about Benford's Law again, and assuming that most products come in packages that have a roughly square base, I'd estimate there to be $3.44^2$ items per square foot of shelf. Multiplying all of these, we get that there are about

$$2 \times 70 \times 10 \times 2 \times 3.44^2 \approx 33,\!134 $$

items per row. Then we calculate the total price to be

$$\begin{align} \mathrm{TotalPrice} &= \mathrm{AverageItemPrice}\times \mathrm{NumberOfRows}\times \mathrm{ItemsPerRow} \\ &= \$ 4\times 20\times 33,\!134 \\ &= \$ 2,\!650,\!720 \end{align} $$
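To see how sensitive the result is to each guess, the arithmetic above can be wrapped in a small function. This is only a sketch of the same calculation; the function and parameter names are made up for illustration, and the defaults are the guesses from this answer.

```python
def estimate_total(avg_price=4.0, n_rows=20, sides=2, length_ft=70,
                   shelves=10, depth_ft=2, items_per_sq_ft=3.44 ** 2):
    """Return (items_per_row, total_price) for the shelf-area estimate."""
    # Total shelf area in one row, in square feet.
    shelf_area_sq_ft = sides * length_ft * shelves * depth_ft
    # Round to a whole number of items, as in the hand calculation.
    items_per_row = round(shelf_area_sq_ft * items_per_sq_ft)
    return items_per_row, avg_price * n_rows * items_per_row


items_per_row, total = estimate_total()
print(f"{items_per_row:,} items per row, about ${total:,.0f} in total")

# Example of varying one guess: fifteen rows instead of twenty.
_, total_15_rows = estimate_total(n_rows=15)
print(f"With 15 rows: about ${total_15_rows:,.0f}")
```

Because the estimate is a plain product, a relative error in any one factor carries over to the total in the same proportion.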

1

Fill a cart with random items, and check its price. Then eyeball the total shelf room and guesstimate the number of carts required to haul everything. Multiply the two.

1

At the end of the day, before restocking begins, look at what proportion of the shelf space is empty. This tells you the proportion of a full shop's stock that is sold each day. Then look up the store's annual turnover in its accounts and divide that by 365 to get the daily turnover.

Dividing the daily turnover by the proportion sold each day gives you the exact total value of stock within a fully stocked shop, provided that the average time of any given product on the shelf is independent of its volume/price ratio.
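As a rough sketch of that ratio, the calculation looks like this. The function name and the example figures (a turnover of \$20 million a year and 10% of the stock sold per day) are purely hypothetical, not data from any real store.

```python
def stock_value(annual_turnover, fraction_sold_per_day):
    """Estimate the value of a fully stocked shop from turnover data.

    annual_turnover: yearly sales in dollars, from the store's accounts.
    fraction_sold_per_day: share of a full shop's stock sold in one day,
        judged from how empty the shelves are before restocking.
    """
    if not 0 < fraction_sold_per_day <= 1:
        raise ValueError("fraction_sold_per_day must be in (0, 1]")
    daily_turnover = annual_turnover / 365
    # If a fraction f of the stock brings in the daily turnover,
    # the full stock is worth daily_turnover / f.
    return daily_turnover / fraction_sold_per_day


# Hypothetical inputs, for illustration only.
example = stock_value(annual_turnover=20_000_000, fraction_sold_per_day=0.10)
print(f"Estimated stock value: ${example:,.0f}")
```

Since turnover is measured in sales, the result is the stock's value at shelf prices.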

1

As you note,

there is a vast range of different prices for items, and vast ranges of items per row

Indeed, most of the distributions we encounter in daily life are not uniform distributions over an interval. It is from this observation that Benford's law has been developed.

So to model the prices, and possibly also the sizes of the items on the shelves, I think you should use, e.g., a log-uniform law, which obeys Benford's law.
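A quick simulation can illustrate the claim. This is a sketch under an assumed price range of \$0.10 to \$100 (three full decades); the range, sample size, and names are arbitrary choices for illustration.

```python
import math
import random
from collections import Counter

random.seed(0)  # fixed seed so the run is repeatable

# Assumed price range: 10**-1 to 10**2 dollars, i.e. three whole decades.
LOW_EXP, HIGH_EXP = -1, 2
# Log-uniform sampling: the exponent is uniform, so log10(price) is uniform.
prices = [10 ** random.uniform(LOW_EXP, HIGH_EXP) for _ in range(100_000)]


def leading_digit(x):
    """First significant digit of a positive number."""
    # Scientific notation such as '3.440000e+00' starts with that digit.
    return int(f"{x:e}"[0])


counts = Counter(leading_digit(p) for p in prices)
observed = {d: counts[d] / len(prices) for d in range(1, 10)}
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Print digit, observed frequency, and Benford probability side by side.
for d in range(1, 10):
    print(d, round(observed[d], 3), round(benford[d], 3))
```

The agreement is exact in expectation only when the range covers a whole number of decades; for other ranges the leading-digit frequencies drift away from Benford's values.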

0

Get the latest financial statement from the company that owns the store. The total value of inventory (as of a certain date) should be listed. Divide by the number of stores owned.

-2

Enter the grocery shop and say, "I want everything." The answer is the price they ask you to pay.