POISSON PROBABILITY DISTRIBUTION (EXAMPLES)

Link to summary slide and video transcript below

EXAMPLE 1

Let's consider a situation in which the mean number of observations in a given region is 2.0. What is the probability that we would end up having 0, 1, 2, 3, 4, or more than 4 observations?

To calculate these probabilities we use the equation:

$$ Pr(x) = {{\mu^x e^{-\mu}} \over {x!}} $$

To use the equation above we plug in a mean number of observations of 2. This gives us the following:

$$ Pr(0) = {{2^0 e^{-2}} \over {0!}} = {{1} \over {(0!)e^2}} = {{1} \over {e^2}} = 0.1353 $$

$$ Pr(1) = {{2^1 e^{-2}} \over {1!}} = {{2} \over {(1!)e^2}} = {{2} \over {e^2}} = 0.2707 $$

$$ Pr(2) = {{2^2 e^{-2}} \over {2!}} = {{4} \over {(2!)e^2}} = {{4} \over {2e^2}} = 0.2707 $$

$$ Pr(3) = {{2^3 e^{-2}} \over {3!}} = {{8} \over {(3!)e^2}} = {{8} \over {6e^2}} = 0.1804 $$

$$ Pr(4) = {{2^4 e^{-2}} \over {4!}} = {{16} \over {(4!)e^2}} = {{16} \over {24e^2}} = 0.0902 $$

$$ Pr(>4) = 1 - 0.1353 - 0.2707 - 0.2707 - 0.1804 - 0.0902 = 0.0527 $$

We can see that the most likely number of observations is either 1 or 2, which makes sense because the average is 2.
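As a quick check, the same arithmetic can be sketched in a few lines of Python. The function name `poisson_pmf` is only an illustrative choice.

```python
import math

def poisson_pmf(x, mu):
    """Probability of exactly x observations when the mean is mu."""
    return mu**x * math.exp(-mu) / math.factorial(x)

mu = 2.0
probs = [poisson_pmf(x, mu) for x in range(5)]   # Pr(0) through Pr(4)
more_than_4 = 1 - sum(probs)                      # whatever probability is left over

for x, p in enumerate(probs):
    print(f"Pr({x}) = {p:.4f}")
print(f"Pr(>4) = {more_than_4:.4f}")
```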

Another way to get these probabilities would be to calculate the first one and then use the shortcut equation relating sequential Poisson probabilities:

$$ Pr(x+1) = \left({{\mu } \over {x+1}}\right) Pr(x)$$

We use the full Poisson equation to get the first value:

$$ Pr(0) = {{2^0 e^{-2}} \over {0!}} = {{1} \over {(0!)e^2}} = {{1} \over {e^2}} = 0.1353 $$

From now on we use the shortcut equation:

$$ Pr(1) = \left({{2 } \over {1}}\right) 0.1353 = 0.2707 $$

$$ Pr(2) = \left({{2 } \over {2}}\right) 0.2707 = 0.2707 $$

$$ Pr(3) = \left({{2 } \over {3}}\right) 0.2707 = 0.1804 $$

$$ Pr(4) = \left({{2 } \over {4}}\right) 0.1804 = 0.0902 $$

$$ Pr(>4) = 1 - 0.1353 - 0.2707 - 0.2707 - 0.1804 - 0.0902 = 0.0527 $$
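The shortcut approach can be sketched as a simple loop in Python. The name `poisson_by_shortcut` is our own and is only for illustration.

```python
import math

def poisson_by_shortcut(mu, max_x):
    """Return [Pr(0), ..., Pr(max_x)] using Pr(x+1) = mu / (x+1) * Pr(x)."""
    probs = [math.exp(-mu)]              # Pr(0) from the full equation
    for x in range(max_x):
        probs.append(probs[-1] * mu / (x + 1))
    return probs

probs = poisson_by_shortcut(2.0, 4)
print([round(p, 4) for p in probs])
```

Each step is one multiplication and one division, so no large powers or factorials are ever formed. Calling `poisson_by_shortcut(3.14, 5)` would do the same job for Example 2.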

EXAMPLE 2

The mean doesn't have to be a whole number. Let's consider a situation in which the mean number of observations in a given region is 3.14. What is the probability that we would end up having 0, 1, 2, 3, 4, 5, or more than 5 observations?

To calculate these probabilities we use the equation:

$$ Pr(x) = {{\mu^x e^{-\mu}} \over {x!}} $$

To use the equation above we plug in a mean number of observations of 3.14. This gives us the following:

$$ Pr(0) = {{3.14^0 e^{-3.14}} \over {0!}} = {{1} \over {(0!)e^{3.14}}} = {{1} \over {e^{3.14}}} = 0.0433 $$

$$ Pr(1) = {{3.14^1 e^{-3.14}} \over {1!}} = {{3.14} \over {(1!)e^{3.14}}} = {{3.14} \over {e^{3.14}}} = 0.1359 $$

$$ Pr(2) = {{3.14^2 e^{-3.14}} \over {2!}} = {{9.8596} \over {(2!)e^{3.14}}} = {{9.8596} \over {2e^{3.14}}} = 0.2134 $$

$$ Pr(3) = {{3.14^3 e^{-3.14}} \over {3!}} = {{30.9591} \over {(3!)e^{3.14}}} = {{30.9591} \over {6e^{3.14}}} = 0.2233 $$

$$ Pr(4) = {{3.14^4 e^{-3.14}} \over {4!}} = {{97.2117} \over {(4!)e^{3.14}}} = {{97.2117} \over {24e^{3.14}}} = 0.1753 $$

$$ Pr(5) = {{3.14^5 e^{-3.14}} \over {5!}} = {{305.2448} \over {(5!)e^{3.14}}} = {{305.2448} \over {120e^{3.14}}} = 0.1101 $$

$$ Pr(>5) = 1 - 0.0433 - 0.1359 - 0.2134 - 0.2233 - 0.1753 - 0.1101 = 0.0987 $$

We can see that the most likely number of observations is 3, which makes sense because that's the closest number to the mean.

Another way to get these probabilities would be to calculate the first one and then use the shortcut equation relating sequential Poisson probabilities:

$$ Pr(x+1) = \left({{\mu } \over {x+1}}\right) Pr(x)$$

We use the full Poisson equation to get the first value:

$$ Pr(0) = {{3.14^0 e^{-3.14}} \over {0!}} = {{1} \over {e^{3.14}}} = 0.0433 $$

From now on we use the shortcut equation:

$$ Pr(1) = \left({{3.14 } \over {1}}\right) 0.0433 = 0.1359 $$

$$ Pr(2) = \left({{3.14 } \over {2}}\right) 0.1359 = 0.2134 $$

$$ Pr(3) = \left({{3.14 } \over {3}}\right) 0.2134 = 0.2233 $$

$$ Pr(4) = \left({{3.14 } \over {4}}\right) 0.2233 = 0.1753 $$

$$ Pr(5) = \left({{3.14 } \over {5}}\right) 0.1753 = 0.1101 $$

$$ Pr(>5) = 1 - 0.0433 - 0.1359 - 0.2134 - 0.2233 - 0.1753 - 0.1101 = 0.0987 $$

Connect with StatsExamples here

LINK TO SUMMARY SLIDE FROM VIDEO:

StatsExamples-poisson-probability-examples.pdf

TRANSCRIPT OF VIDEO:

Slide 1

The Poisson probability distribution is a very useful tool for answering a variety of interesting questions, but it takes a little practice, so let's do some examples.

Slide 2

First, a quick recap of the basic scenario.
If this is new to you, there is another video on the same channel and playlist that introduces the Poisson probability equation.
When a process is Poisson distributed and we know the mean number of observations per unit time or area, these equations give the probability of seeing X or X plus one successes in that unit time or area.
The first equation is the foundation and states that the probability of seeing X events is equal to the mean number raised to the power X, times E raised to the power of the negative mean, all divided by X factorial.
The two equations there just indicate that sometimes you have population values and sometimes you're using sample estimates of the population values.
The second equation is a shortcut equation, which can be derived from the first one fairly easily, which allows you to get other Poisson probabilities quickly once you have calculated at least one.
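The derivation is short. Writing out the first equation for X plus one and pulling out one factor of the mean and one factor of X plus one gives the following (in the notation used at the top of this page):

$$ Pr(x+1) = {{\mu^{x+1} e^{-\mu}} \over {(x+1)!}} = \left({{\mu} \over {x+1}}\right) {{\mu^x e^{-\mu}} \over {x!}} = \left({{\mu} \over {x+1}}\right) Pr(x) $$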

Slide 3

For our first scenario let's consider an example in which we cast a net into an ocean and we're estimating how many individuals of a rare fish species we expect to catch.
Each net potentially catches thousands of fish, but we know from experience that the average, that is the mean, number caught from our rare species is 3.
We want to know: what are the probabilities of nets that catch zero, one, two, three, etc. fish from the species we're interested in?

Slide 4

To answer this question we're going to use the mean of 3 and the Poisson probability equation.
You can see that our starting equation, shown in red, when we plug in our mean gives us the probability of catching X number of fish will be 3 to the X, times E to the negative 3, divided by X factorial.
So, using this, the probability of catching 0 fish would be 3 to the zero power times E to the negative 3 divided by 0 factorial. 3 to the 0 is 1, and E to the negative 3 is 0.049787. The 0 factorial in the denominator is one by definition. Multiplying this through gives us a probability of 0.049787.
The probability of catching one fish would be 3 to the first power, times E to the negative 3, divided by 1 factorial. 3 to the first power is 3, E to the negative three is 0.049787, and one factorial is one. Multiplying this out gives us a probability of catching one fish of 0.149361.

Slide 5

Continuing with our calculations, let's calculate the next two probabilities.
The probability of catching 2 fish would be 3 squared, times E to the negative 3, divided by 2 factorial. 3 squared is 9, and E to the negative 3 is 0.049787. The 2 factorial in the denominator is 2. Multiplying and dividing this through gives us a probability of catching 2 fish of 0.2240.
The probability of catching 3 fish would be 3 cubed, times E to the negative 3, divided by 3 factorial. 3 cubed is 27, E to the negative three is 0.049787, and 3 factorial is 6. Multiplying this out gives us a probability of catching 3 fish of 0.2240.
That's a strange coincidence that these two probabilities would be exactly the same. Actually, it's not really a coincidence, it's occurring because our mean is a whole number.

Slide 6

Let's look at the probabilities for two and three fish in a little detail. Both those probabilities were 0.2240.
Let's look at our shortcut equation, shown in red. We can see that if we plug in an X of 2, so that we're using that equation to calculate the Poisson probability for 3 from the probability of 2, we end up with a 3 in the numerator and denominator of our fraction. They cancel, so we expect the probability of 3 observations to be the same as the probability of 2 observations.

Slide 7

If we continued to calculate Poisson probabilities, these are the values we would get up through 7 observations, and this is the probability distribution up through 12 observations.
We can also see that while it's theoretically possible to catch 9 or more fish, it's highly unlikely.

Slide 8

Sometimes we are interested in calculating the probability of a certain range of observations. For example, we might want to know what is the probability of a net having more than 3 fish.
We can't sum up all those probabilities because our distribution never truly ends. However, we know that the total of all the probabilities must be 1 and we could add up the probabilities for 0, 1, 2, and 3 and subtract those from one to get our answer.

Slide 9

If we know that we might be interested in questions about ranges in our probability distribution, we can calculate cumulative probabilities as well as individual probabilities.
We can keep track of these cumulative probabilities in parallel with our Poisson probabilities. For example, when we calculate the probability of 0 we can note that the probability of less than one will be the same value of 0.04979.
To figure out the cumulative probability for fewer than 2 observations we can just add the probability of one observation, 0.14936, to the cumulative probability for less than one, 0.04979, to get the value of 0.19915.
To figure out the cumulative probability for fewer than 3 observations we can just add the probability of 2 observations, 0.22404, to the cumulative probability for less than 2, 0.19915, to get the value of 0.42319.
To figure out the cumulative probability for fewer than 4 observations we can just add the probability of 3 observations, 0.22404, to the cumulative probability for less than 3, 0.42319, to get the value of 0.64723.
Using the exact same procedure we could calculate all the cumulative probabilities.
Once we have cumulative probabilities, answering a question like "what is the probability of a net with more than 3 fish?" is fairly easy. The probability of more than 3 fish will be 1 minus the probability of fewer than 4 fish, which is directly in our list of cumulative probabilities.
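Keeping the cumulative probabilities in parallel with the individual ones can be sketched in Python as follows. This is a minimal illustration for the mean of 3 used above; the variable names are our own.

```python
import math

mu = 3.0
cumulative = 0.0
table = []                       # rows of (x, Pr(x), Pr(fewer than x + 1))
for x in range(8):
    p = mu**x * math.exp(-mu) / math.factorial(x)
    cumulative += p
    table.append((x, p, cumulative))
    print(f"{x}  Pr = {p:.5f}  cumulative = {cumulative:.5f}")

# Probability of more than 3 fish = 1 - Pr(fewer than 4 fish)
more_than_3 = 1 - table[3][2]
print(f"Pr(>3) = {more_than_3:.5f}")
```

With these numbers the probability of a net holding more than 3 of the rare fish comes out at 1 minus 0.64723, or about 0.353.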

Slide 10

For our second example, instead of thinking about taking individuals out of a population, let's think about different sized populations and how many individuals who meet a certain criterion they will have.
Let's consider small towns in the US and their medical needs. One type of specialty medical care involves professionals who can work with individuals with prosthetic limbs due to major amputations.
So a practical question, when we're thinking about medical care in small towns, would be how many amputees would we expect in a town of 5000?
In theory each town could have hundreds of amputees, but the probable number will be much lower.
The estimated frequency of major amputees in the US is one out of every 445 people and that comes from a paper published in 2008. The citation for that paper is in the video description. This frequency equates to 11.25 individuals per 5000.
This would be our mean number of amputees we would expect if we looked at a bunch of towns of 5000 people.

Slide 11

When the mean is 11.25 and we want to calculate the probabilities of towns with 5, 10, 15, 20 etc amputees, our Poisson equation becomes probability of X equals 11.25 raised to the power X, times E to the negative 11.25, divided by X factorial.
The probability of a town having 5 amputees would therefore be 11.25 to the fifth, times E to the negative 11.25, divided by 5 factorial.
This would be 180,203.247 times 1.3007297, times 10 to the minus 5, divided by 120, which is a probability of 0.0195.

Slide 12

The probability of a town having 10 amputees would therefore be 11.25 to the tenth power, times E to the negative 11.25, divided by 10 factorial.
This would be 32,473,210,255 times 1.3007297 times 10 to the minus 5, divided by 3,628,800, which is a probability of 0.1164.

Slide 13

The probability of a town having 15 amputees would therefore be 11.25 to the fifteenth power, times E to the negative 11.25, divided by 15 factorial.
This would now generate numbers in the numerator and denominator that are extremely large, too large for simple calculators to handle in fact. To calculate the probability of 15 amputees using our Poisson equation is starting to get difficult.
However, if we use a version of the shortcut equation it's more manageable. We can start with the probability for 10 amputees, multiply it by the mean to the fifth power, and then divide by the additional terms 11, 12, 13, 14, and 15.
Whichever method we end up using, we get a probability of 0.0582.
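The stepping-up idea can be sketched in Python. Python's own numbers are large enough to handle the direct calculation too, so the sketch below uses it as a cross-check; the variable names are only illustrative.

```python
import math

mu = 11.25

# Direct calculation of Pr(10) from the full equation
p10 = mu**10 * math.exp(-mu) / math.factorial(10)

# Step from 10 up to 15 with the shortcut, one observation at a time,
# so no very large power or factorial is needed for this part.
p15 = p10
for x in range(10, 15):
    p15 *= mu / (x + 1)

# Cross-check against the full equation
p15_direct = mu**15 * math.exp(-mu) / math.factorial(15)

print(f"Pr(10) = {p10:.4f}")
print(f"Pr(15) = {p15:.4f}")
```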

Slide 14

If we calculate all the Poisson probabilities, we will get the distribution shown.
In reality, for this question, it's probably less useful to know probabilities of exact numbers and more useful to know about the probability of the number of amputees being above a certain amount.
That's the sort of calculation that might lead a medical facility to make sure they have a specialist available for the specific needs of those individuals.

Slide 15

So what is the probability of a town with more than 15 amputees?
Again, we can't sum all of the probabilities for 16, 17, 18, etc. because the distribution never truly ends.
However, if we have the probabilities for 0 through 15, we can subtract their sum from 1.

Slide 16

Or, if we keep track of the cumulative probabilities this question would be straightforward.
We could look at our tabled column of cumulative probabilities for the one that corresponds to fewer than 16 and subtract that from 1. That gives us 1 - 0.893451 equals 0.106549 for our probability.
This probability of approximately 10% indicates that for any particular town there is a roughly 10% chance it will have more than 15 amputees.
Another way of thinking about it is that even though the mean number of amputees was 11.25, we expect about 10% of the towns to have more than 15 amputees.
If we see a town with more than 15 amputees, that doesn't automatically mean the town is unsafe or its inhabitants are extra prone to accidents. It may just be one of those 10%.
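The cumulative calculation for the towns can be sketched in Python by combining the shortcut equation with a running total. This is a minimal illustration using the mean of 11.25; the variable names are our own.

```python
import math

mu = 11.25
p = math.exp(-mu)            # Pr(0)
cumulative = p
for x in range(15):          # adds Pr(1) through Pr(15)
    p *= mu / (x + 1)
    cumulative += p

more_than_15 = 1 - cumulative
print(f"Pr(fewer than 16) = {cumulative:.6f}")
print(f"Pr(more than 15)  = {more_than_15:.6f}")
```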

Slide 17

To calculate Poisson probabilities we need a good estimate of the mean number of observations, but sometimes the data we get is not perfect.
What if we had a table of event observation data from a scenario that is due to a random process, as shown to the right?
How could we figure out the mean?
We can't just calculate the mean from the data because the "five plus events" category could be situations with 5, 6, 7, 8 or more observations and we don't know which.

Slide 18

One possible approach is to use our shortcut equation to figure out the mean. If we have two consecutive probabilities we could rearrange that equation to solve for the mean as shown.
Of course, we need to keep in mind there is sampling error and rounding error, so the best practice would be to calculate this for all the steps and favor the ones with the largest number of observations.
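Written in the notation used at the top of this page, rearranging the shortcut equation to solve for the mean gives:

$$ \mu = (x+1) {{Pr(x+1)} \over {Pr(x)}} $$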

Slide 19

If we use the rearranged shortcut equation to solve for the mean we get slightly different values depending on which pair of consecutive Poisson probabilities we use.
Note that numbers of observations are not probabilities, but their relative occurrences should occur in the same ratio as their probabilities which is what is in the equation.
When we compare the number of times we observe 0 events and 1 event with our equation we get an estimate for the mean number of events of 1.8666.
When we compare the number of times we observe 1 event and 2 events with our equation we get an estimate for the mean number of events of 1.9286.
Our estimates seem to cluster around a value of about 1.9 which would be a good guess for this scenario.
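The rearranged equation estimates the mean as X plus one, times the count for X plus one events, divided by the count for X events. A minimal Python sketch is below. The counts are hypothetical, picked only because they give ratios close to the two estimates quoted above. They are not the table from the slide, and the names `counts` and `estimate_mean` are our own.

```python
# Hypothetical counts, NOT the table from the slide: the number of periods
# in which 0, 1, and 2 events were seen.
counts = {0: 15, 1: 28, 2: 27}

def estimate_mean(counts, x):
    """Estimate the mean from the counts for x and x + 1 events."""
    return (x + 1) * counts[x + 1] / counts[x]

estimates = [estimate_mean(counts, x) for x in (0, 1)]
print([round(m, 4) for m in estimates])
```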

Slide 20

For our last example let's think about trying to determine whether a pattern is random when we have a set of observations.
The question is, what if we had a table of event observation data from a scenario that someone claims is due to a random process? Does the data support that claim?
We can use two approaches to answer this question.
First, we can compare the observed values to the ones predicted from a Poisson distribution using the mean that we calculate from the data.
Second, we can compare the mean and the variance of the observed values because if the distribution is Poisson those two values should be the same.

Slide 21

To take our first approach we can compare the observed values to the ones predicted from a Poisson distribution and 100 observed periods. The 100 comes from the total of the right hand column.
The first thing we need to do is figure out the mean number of events in each observation.
To figure out the mean we need to go to our table and interpret what values it is summarizing. There would be 20 zeros, 40 ones, 21 twos, 14 threes, 4 fours, and one 5 in our set of 100 numbers.
The mean of all these values is 1.45.
Once we have the mean we can calculate the Poisson probabilities as shown there, and multiply them by the 100 observations to calculate the predicted number of observations we would expect for each number of events.
A quick look suggests that the predicted numbers are similar, but not exactly the same, as the observed numbers.
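The predicted counts can be reproduced with a short Python sketch. The observed counts are the ones read out above; the variable names are our own.

```python
import math

# Observed table from the example: number of events -> number of periods
observed = {0: 20, 1: 40, 2: 21, 3: 14, 4: 4, 5: 1}

n = sum(observed.values())                                   # 100 periods
mean = sum(x * count for x, count in observed.items()) / n   # 1.45

# Predicted count = number of periods times the Poisson probability
predicted = {
    x: n * mean**x * math.exp(-mean) / math.factorial(x)
    for x in observed
}

for x in observed:
    print(f"{x} events: observed {observed[x]:3d}, predicted {predicted[x]:6.2f}")
```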

Slide 22

We can compare the observed values to the ones predicted from a Poisson distribution and 100 observed periods with either a figure or a table.
Using either of these methods we can see that the number of times in which no events were observed is less than expected, but the number of times in which one event was seen was higher than expected. Likewise, we saw three events more often than expected but observed two events less often than expected.
This kind of comparison is informative, but does require keeping track of a lot of different values.

Slide 23

A second approach to answer the same question is a little simpler. We can compare the mean and variance of the observed values.
Again, going to all those observed values we calculate the mean of 1.45 and the variance is 1.300.
For a Poisson process the mean and variance should be equal. We test this equality by calculating the coefficient of dispersion, which is the ratio of the variance to the mean. In this case the coefficient of dispersion is 1.3 / 1.45 equals 0.897.
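A coefficient of dispersion less than one suggests that the distribution is of the kind referred to, in ecological usage, as over-dispersed or uniform (statisticians usually describe counts whose variance is below the mean as under-dispersed), and it would tend to match the images shown at the left.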
This channel and playlist has another video which is more about the concepts of the Poisson probability where these terms are described in more detail.
If we look at the distribution of the observed and predicted values we can see that it matches the description of a uniform distribution. The number of events seen in the observations is less variable and more consistently near the mean than we would expect from a truly random Poisson process.
Right now, we're looking at both the distribution and the coefficient of dispersion, but that simple calculation told us what to expect.
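The coefficient of dispersion can be sketched in Python using the standard library. Using the sample variance (dividing by n minus 1) is an assumption on our part, although it does reproduce the 1.300 quoted above.

```python
import statistics

# Observed table from the example: number of events -> number of periods
observed = {0: 20, 1: 40, 2: 21, 3: 14, 4: 4, 5: 1}

# Expand the table back into the 100 individual values it summarizes
values = [x for x, count in observed.items() for _ in range(count)]

mean = statistics.mean(values)
variance = statistics.variance(values)      # sample variance (divides by n - 1)
dispersion = variance / mean

print(f"mean = {mean:.3f}, variance = {variance:.3f}, ratio = {dispersion:.3f}")
```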

Slide 24

This fourth example is a fairly common use of the Poisson equation and distribution: comparing observed values and predicted values to test claims of randomness is a core statistical technique.
We assume a Poisson process and make these comparisons.
If they match, then the distribution is probably Poisson and the process responsible for our observations is probably random.
If they don't match, then the distribution is not Poisson and the process responsible for observations is probably not random.
But there will always be some mismatch due to sampling error and rounding.
The question is how much mismatch is too much, how much indicates nonrandom processes?
To truly test this, we need a technique like the chi-squared analysis.

Slide 25

To recap we have two important equations for calculating Poisson probabilities.
We have the definition equation as shown which allows us to calculate a Poisson probability when we know the mean number of events.
And we have a shortcut equation that allows us to calculate our Poisson probability when we know another Poisson probability.
You can see the companion video linked in the description and at the end for more detail about these equations and more about applications of the Poisson probability distribution.

Zoom out

I hope you found these examples useful and that they help you calculate Poisson probabilities of your own.

End screen

Feel free to randomly press a button to show your appreciation.


This information is intended for the greater good; please use statistics responsibly.
