
Elementary Statistics (Larson 4th ed) gives this example of Poisson, typical in my opinion. The mean number of accidents per month at a certain intersection is three. What is the probability that in any given month four accidents will occur at this intersection?

That's all that's given, and somehow it's sufficient to find the probability. But it seems common sense that some intersections (with an average of three per month) are more consistently three per month and therefore rarely see four in a month, while other intersections (with an average of three per month) see two, three or four with equal frequency, and often see four in a month. Poisson claims that this is not true: both of these intersections have the same probability of four accidents in the next month, because they both average three per month. How is it possible that this question does not arise, that we need not measure the variance, and that we just push on?

I'm asking for an explanation that is (as nearly as possible) in English sentences rather than computations.

2 Answers

2

The Poisson distribution is based on a fundamental assumption:

For a "short" period of time $\Delta t$, the average number of events $\Delta N$ that occur is proportional to $\Delta t$: $\Delta N = \lambda \Delta t$. Moreover, this proportionality $\lambda$ is independent of time and of the number of events that have already occurred.

To see how this constrains the distribution of events in a given interval, let's look at a different distribution with the same mean: the intersection always has three accidents per month. No more, no less. In other words, the variance is exactly zero. Now suppose, on April 1st, it so happens that an accident occurs. If it is to be the case that there are exactly three accidents in that month, then the chance that there will be an accident on any of the remaining 29 days in the month must now be lower than average: it would be something like $\frac{2}{29}$ rather than the $\frac{3}{30}$ that one would expect. In other words, the probability that "an accident will occur on the 2nd given that one occurred on the 1st" is different from the probability that "an accident will occur on the 2nd." This violates the postulate that the probability of an event during a "short" period of time (in this case, a day) is independent of any previous events that have occurred.

One could similarly construct an example where no accidents occur for the first 29 days of April. (This is improbable, but possible.) In that case, there must be three accidents on the 30th; the probability of an accident (in fact, three accidents!) has now shot up to 1. Again, this violates the principle that the probability of an event during a "short" period of time is independent of previous events.

This isn't a completely satisfying argument, since all it does is prove that the variance of a Poisson process cannot be zero. But hopefully it shows that the principle underlying the Poisson distribution does actually constrain its variance: the mean and the variance cannot be specified independently of each other.
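As a rough illustration of how the independence postulate ties the variance to the mean, here is a minimal sketch. It assumes (this is my simplification, not something stated above) that the month is cut into `n` short intervals, that each interval independently contains at most one accident, and that the chance of an accident in each interval is `lam / n`. The function name `sliced_month_stats` is made up for this example. Under those assumptions the monthly count is binomial, and its variance is forced toward the mean as the intervals get shorter.

```python
# Sketch under stated assumptions: n independent short intervals per month,
# at most one accident per interval, each with probability lam / n.
lam = 3.0  # mean accidents per month, taken from the question


def sliced_month_stats(n, lam=lam):
    """Return (mean, variance) of the monthly count for n short intervals.

    The count is Binomial(n, lam / n), so mean = n*p and variance = n*p*(1-p).
    """
    p = lam / n
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance


# Days, hours, and minutes in a 30-day month: the mean stays at lam while
# the variance creeps up toward lam, with no freedom to choose it separately.
for n in (30, 720, 43200):
    mean, variance = sliced_month_stats(n)
    print(n, mean, variance)
```

The point of the sketch is only that nothing in it lets you dial the variance up or down once the mean and the independence assumption are fixed.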

  • 0
    I don't see why the probability changes as events unfold. That seems like a gambler's fallacy. The original probability is not deliberately produced by drivers intent on a certain number of accidents. It is a random event. As a psychological matter it is hard to conceive of a random event with a variance of zero, but randomness is the premise of the whole discussion. Isn't it? In any case the argument seems backwards to me: are you saying that no two intersections have equal means but different variances? 2017-01-20
  • 0
    @Chaim: That's precisely my point: If the variance in the accident rate per month was zero, then we would have to conclude that events happening earlier in the month would influence events later in the month. In other words, some form of "gambler's fallacy" would actually hold true. 2017-01-21
  • 0
    @Chaim: Now, it's entirely possible to conceive of a situation where this would be the case: for example, maybe drivers are more careful after an accident occurs, and so $\lambda$ would decrease for a while after each accident. In such a case, a form of "gambler's fallacy" *would* hold, since past events are influencing the probability of future events. But then the accident rate at that intersection wouldn't be governed by a Poisson distribution, and the monthly accident rate would have a different variance than the variance of a Poisson distribution. 2017-01-21
  • 0
    The supposed need to maintain a given mean _seems_ to violate independence no matter what the variance is, because it dictates that below-average data must eventually be offset with above-average data and vice versa. This must all be wrong. It must be as someone was telling me last week in remarks apparently deleted -- that not all situations are Poisson, and when we call a situation Poisson we're claiming that it actually behaves in certain ways, with variance related to mean in certain ways. I wonder what causes that relationship to exist where it does. 2017-01-23
1

The mean and the variance of the Poisson distribution cannot be chosen independently. If the mean is $3$ then the variance is also $3$. (If the mean is $\lambda$ then the variance is also $\lambda.$) And the probability that $k$ accidents occur is

$$\frac{\lambda^k}{k!}e^{-\lambda}.$$

Even the formula shows that there is only one parameter.
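For concreteness, here is a small sketch that evaluates the standard Poisson probability $\frac{\lambda^k}{k!}e^{-\lambda}$ for the textbook example ($\lambda = 3$, $k = 4$) and checks numerically that the mean and variance both come out as $\lambda$. The helper name `poisson_pmf` and the cutoff of 60 terms are my own choices for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson distribution with mean lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)


lam = 3.0

# Probability of exactly four accidents in a month when the mean is three.
p4 = poisson_pmf(4, lam)
print(p4)

# Numerical check that lam is the only parameter: mean and variance are
# both recovered from the same pmf (the tail beyond k = 60 is negligible).
ks = range(60)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)
print(mean, variance)
```

The only input is `lam`; there is no second argument through which a variance could be supplied.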

So, if the Poisson approach is good enough to describe the accident statistics of an intersection, then two intersections with the same mean cannot differ in any other respect: they have the same distribution.


The statement that the Poisson distribution is not good enough is not a mathematical statement.

You have to go out to find two different intersections and do measurements. If you find that the average number of accidents is different then go and find another intersection. If finally you find two "equimean" intersections then calculate the relative frequencies of the number of accidents.

If the relative frequencies of the same number of accidents are significantly different then the Poisson distribution is not appropriate in the case of your road crossings.
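A rough way to sketch such a check is to compare the measured variance with the measured mean, since for a Poisson distribution the two are equal. The monthly counts below are made up for illustration (they mimic the two kinds of intersection described in the question), the name `dispersion_index` is my own, and six months of data is far too little for a real conclusion. This is not a formal significance test.

```python
from statistics import mean, pvariance

# Hypothetical monthly accident counts, invented for this example.
counts_a = [2, 3, 4, 2, 3, 4]  # two, three, four with equal frequency
counts_b = [3, 3, 3, 3, 3, 3]  # always exactly three


def dispersion_index(counts):
    """Ratio of variance to mean; roughly 1 for Poisson-like data."""
    return pvariance(counts) / mean(counts)


for name, counts in (("A", counts_a), ("B", counts_b)):
    print(name, mean(counts), pvariance(counts), dispersion_index(counts))
```

Both made-up intersections have mean 3 but a variance-to-mean ratio well below 1, which would suggest that neither is well described by a Poisson distribution with mean 3.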

There are phenomena for which the Poisson distribution works quite well. One such case is the number of accidents at road crossings.

  • 0
    What? Are you saying that the two intersections in my initial question would have the same variance if we measured it (using a sum-of-squares)? That seems wrong. The alternative you leave me is that the Poisson approach is not good enough. But how can we know when that Poisson approach is good enough? I realize that the formula makes no reference to the variable I want; that's my problem. 2017-01-20
  • 0
    @Chaim: Please see my addition to my answer. 2017-01-20
  • 0
    To emphasize this a different way, if one intersection has a mean of three per month but sees three per month very consistently, and if another intersection has a mean of three per month but sees two, three, and four per month with equal frequency, then *one of those is **not** the Poisson distribution*. 2017-01-20
  • 0
    @LeeMosher: Thank you for improving. 2017-01-20
  • 0
    Then how can the textbook tell us just mean=3 and ask P(4)? Our answer will just be one of many possible probabilities, special only in that "it is the Poisson distribution"? 2017-01-20
  • 0
    @Chaim: The book does not like smart students : ) The teacher must have already told you that road crossings work according to the Poisson distribution. The book does not expect students to ask the deeper question: "Can the accident statistics of a road crossing be described by the Poisson distribution?" If you asked your question in the classroom then the answer would be this: "Yeah, yeah, but then assume that the distribution is Poisson." 2017-01-20
  • 0
    In the example presented in your question, your job is to learn to *apply* the Poisson distribution. For a Poisson distribution whose mean=3, one may calculate $P(4)=0.168031355741541$. For example, I calculated this using an on-line Poisson distribution calculator. This is not "one of many possible probabilities", it is the number that it is. On the other hand, certainly it is very good to ask "Why do we expect that the number of accidents per month follows a Poisson distribution? Why can we assume that?" That is a very good question, but it is a different question than the one you asked. 2017-01-20
  • 0
    Then let's proceed to the question you propose. We agree that if Intersection A sees 2, 3, 4, 2, 3, 4, etc., while Intersection B sees 3, 3, 3, 3, 3, 3, etc., then they will have the same mean but not the same probability of 4 collisions next month. But, you say, real intersections, and all other situations properly described by Poisson, don't really behave like these examples. The Poisson Distribution, like the Normal Distribution, is a claim about how certain real events occur. Is that it? 2017-01-20
  • 0
    Yes, a claim which may be false. We have been sailing on the waters of philosophy of science. Newton's laws of mechanics are only approximations. Yet they can be used in certain situations. The Poisson law of the frequency of accidents is an approximation as well. It is the job of the statistician to control the use of such approximations. (I did not say "properly"; I am talking about good, better, worse, and wrong approximations.) 2017-01-20