PROBABILITY (EXAMPLES)




Connect with StatsExamples here



LINK TO SUMMARY SLIDE FOR VIDEO:


StatsExamples-probability-examples.pdf

TRANSCRIPT OF VIDEO:


Slide 1

Let's do some step by step examples of probability calculations.

Slide 2

First, a quick review of the basics of probability.
When we calculate probabilities we are thinking of the chances of outcomes in a sample space meeting the criteria for the event we are interested in.
The probability of an impossible event is zero and the probability of a certain event is one, so all probabilities lie between zero and one.
The probability of an event plus the probability of not that event is equal to 1.
The probability of A or B equals the probability of A plus the probability of B minus the probability of A and B.
This simplifies to the probability of A or B equals the probability of A plus the probability of B if the events A and B are mutually exclusive. That is, there are no outcomes that are both A and B.
The probability of A and B equals the probability of A times the probability of B given A. The probability of B given A is the probability of B under circumstances in which A has occurred.
This simplifies to the probability of A and B equals the probability of A times the probability of B if the events A and B are independent. That is, whether A occurs or not, the probability of B would be the same.
When events are not independent, we have to keep track of the different probabilities of each event so we typically use a probability tree as shown in the top right.
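
The rules reviewed on this slide can be written as a few small helper functions. This is a minimal Python sketch, not something shown in the video: the function names and the die example are assumptions chosen only for illustration.

```python
from fractions import Fraction

def p_not(p_a):
    """Complement rule: P(not A) = 1 - P(A)."""
    return 1 - p_a

def p_or(p_a, p_b, p_a_and_b=0):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
    Leave p_a_and_b at 0 when A and B are mutually exclusive."""
    return p_a + p_b - p_a_and_b

def p_and(p_a, p_b_given_a):
    """Multiplication rule: P(A and B) = P(A) * P(B given A).
    For independent events, P(B given A) is just P(B)."""
    return p_a * p_b_given_a

# Hypothetical example (not from the video): one roll of a fair six-sided die.
p_even = Fraction(3, 6)               # outcomes 2, 4, 6
p_above_4 = Fraction(2, 6)            # outcomes 5, 6
p_above_4_given_even = Fraction(1, 3) # of 2, 4, 6 only 6 is above 4

p_both = p_and(p_even, p_above_4_given_even)
print(p_not(p_even))                    # 1/2
print(p_both)                           # 1/6
print(p_or(p_even, p_above_4, p_both))  # 2/3
```

Using exact fractions keeps the arithmetic free of floating-point rounding, which makes it easier to check the results by hand.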

Slide 3

OK, let's look at our first data set. We'll be thinking about a sample space of 100 snakes as represented by the boxes. Of these, 40 will be cobras and 60 will be pythons. The events we are thinking about are choosing individuals out of the sample space and we will be thinking about which species of snake gets chosen.
For our certain and impossible events, the probability of choosing an individual and it is a snake is certain so that probability is one - whereas the probability of choosing an individual and it is not a snake is impossible so that probability is zero.
Now let's think about the probability of choosing a cobra. That probability would be the number of individuals that satisfy our condition, 40, divided by the total number of individuals in the sample space, the 40 cobras plus the 60 pythons which equals 100 snakes. Making that division gives us a probability of 0.4.
Similarly, the probability of choosing a python would be 60 divided by 100 to give us a probability of 0.6.
The probability of choosing a snake and it is a cobra or python will be the probability of choosing a cobra plus the probability of choosing a python because these events are mutually exclusive, snakes are one or the other, never both. This gives us 0.4 plus 0.6 equals 1 which makes sense because in this sample space we are guaranteed to get either a cobra or a python when we choose a snake.
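
The single-draw calculations on this slide can be checked with a short Python sketch. The counts come from the transcript; the variable names are illustrative assumptions.

```python
from fractions import Fraction

snakes = {"cobra": 40, "python": 60}   # counts from the transcript
total = sum(snakes.values())           # 100 snakes in the sample space

# Probability of each species = matching individuals / all individuals.
p = {species: Fraction(count, total) for species, count in snakes.items()}

# Cobra and python are mutually exclusive, so the probabilities simply add.
p_cobra_or_python = p["cobra"] + p["python"]

print(float(p["cobra"]), float(p["python"]), float(p_cobra_or_python))  # 0.4 0.6 1.0
```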

Slide 4

What about situations when we choose two or more snakes? The probability of a particular first and second snake is equal to the probability of the first times the probability of the second, if those events are independent.
To figure out whether those events are likely to be independent we need to think about what kind of sampling we are doing.
When we do sampling with replacement we take an individual, assess it, and replace it back into the population before we take the second individual. In this way the sample space is the same each time. This is an idealistic situation and not usually how things work in the real world, but it is mathematically easier.
A more realistic situation is when we take a sample of multiple individuals from a population all at once. But when we do this, each one we take changes the population from which we are taking the next individual because the population would be missing the previous individual.
Luckily, calculating probabilities as if we are doing sampling with replacement is a good approximation if the population is big enough and our samples are small enough.
However, when we want to take the changing sample space into account, we are doing what is called sampling without replacement.
This is a more realistic situation but is mathematically harder.
We'll do some example calculations both ways so we can see the difference.

Slide 5

Let's think about a situation in which we choose two snakes from the population and we're interested in the probability that they are both cobras.
For sampling with replacement the sample space, or population, is identical each time so our probability of cobra and cobra is equal to the probability of cobra times the probability of cobra which is 0.4 times 0.4 equals 0.16
For sampling without replacement the sample space, or population, changes slightly between the first and second choice.
For the first choice the probability of cobra is the 40 cobras out of the 100 snakes which is 0.4.
But for the second choice, the sample space now has 39 cobras and an overall size of 99 snakes. So the probability of choosing a cobra is the 39 cobras out of the 99 snakes which is 0.3939.
Now our probability of cobra and cobra is equal to the probability of the first cobra times the probability of the second cobra which is 0.4 times 0.3939 equals 0.15758.
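
The two versions of this calculation differ only in the second factor, which a minimal Python sketch makes easy to see. The variable names are assumptions made for illustration.

```python
from fractions import Fraction

cobras, total = 40, 100

# With replacement: the second draw sees the same 40 cobras out of 100 snakes.
with_replacement = Fraction(cobras, total) * Fraction(cobras, total)

# Without replacement: the second draw sees 39 cobras out of 99 snakes.
without_replacement = Fraction(cobras, total) * Fraction(cobras - 1, total - 1)

print(round(float(with_replacement), 5))     # 0.16
print(round(float(without_replacement), 5))  # 0.15758
```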

Slide 6

As we saw, the probability of choosing two cobras was 0.16. Another way to think about this was that we chose a cobra then a cobra.
Similarly, we can think about the probability of choosing a cobra then a python. Assuming sampling with replacement, this would be 0.24.
I bring up this example because the probability of choosing two snakes and getting a cobra and then a python, in that order, is different from the probability of getting one cobra and one python in either order.
If we want to calculate the probability of cobra and python, we need to think about the two different ways this can occur. It could be cobra then python or python then cobra. What we're really thinking about is the probability of cobra then python, or python then cobra, so we have to calculate those two probabilities and add them.
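
One way to keep the orderings straight is to list every ordered pair of draws and add up the ones we care about. The sketch below does this for sampling with replacement; the dictionary layout is an assumption chosen for illustration.

```python
from fractions import Fraction
from itertools import product

p = {"cobra": Fraction(40, 100), "python": Fraction(60, 100)}

# Every ordered pair of draws with its probability (sampling with replacement,
# so the two draws are independent and the probabilities multiply).
ordered = {(first, second): p[first] * p[second]
           for first, second in product(p, repeat=2)}

p_cobra_then_python = ordered[("cobra", "python")]
p_one_of_each = ordered[("cobra", "python")] + ordered[("python", "cobra")]

print(float(p_cobra_then_python))  # 0.24
print(float(p_one_of_each))        # 0.48
```

The four ordered pairs together cover every possible result of two draws, so their probabilities sum to one.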

Slide 7

As always, things are a little more complicated if we do sampling without replacement. In this case we should use a probability tree.
To calculate the probability of choosing two cobras we go to the left side of the tree and follow the branch for the probability of choosing a cobra, which gives us 40 / 100 is 0.4. Then for our second branch we follow the probability of choosing a cobra, which gives us 39 / 99 which is 0.39394. Multiplying those two values gives us our probability of choosing a cobra and a cobra of 0.15758.
To calculate the probability of choosing a cobra and a python we need to work our way through the probability tree twice.
First, let's think about cobra then python. Starting at the left of the tree we have 40 / 100 for the probability of cobra. For our second branch we have the probability of python is 60 / 99. Multiplying those two values gives us 0.24242.
Second, let's think about python then cobra. Starting at the left of the tree we have 60 / 100 for the probability of python. For our second branch we have the probability of cobra is 40 / 99. Multiplying those two values gives us 0.24242.
Therefore, the overall probability of choosing a cobra and a python is 0.24242 + 0.24242 equals 0.48484.
In this example these two probabilities were the same, so we could have just doubled one of them. But this isn't always going to work when using probability trees, so don't use that as an automatic shortcut. You should always follow the route of every combination you're interested in when doing problems like this.
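
Following a route through the probability tree amounts to multiplying one fraction per branch while shrinking the population after each choice. Here is a minimal Python sketch of that idea; `path_probability` is a hypothetical helper name, not something from the video.

```python
from fractions import Fraction
from itertools import permutations

def path_probability(counts, path):
    """Probability of drawing the species in `path`, in that order,
    without replacement, from a population described by `counts`."""
    remaining = dict(counts)          # copy so the caller's counts are untouched
    total = sum(remaining.values())
    prob = Fraction(1)
    for species in path:
        prob *= Fraction(remaining[species], total)  # one branch of the tree
        remaining[species] -= 1       # the chosen individual leaves the population
        total -= 1
    return prob

snakes = {"cobra": 40, "python": 60}

two_cobras = path_probability(snakes, ("cobra", "cobra"))

# Follow every distinct ordering of "one cobra, one python" and add the results.
routes = set(permutations(("cobra", "python")))
one_of_each = sum(path_probability(snakes, route) for route in routes)

print(round(float(two_cobras), 5))   # 0.15758
print(round(float(one_of_each), 5))  # 0.48485
```

The exact value for one of each is 16/33, which rounds to 0.48485. The 0.48484 quoted above comes from adding the two branch values after each had already been rounded to 0.24242.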

Slide 8

We can also sometimes use the complementation rules to answer the questions we're interested in.
For example, the probability of choosing a cobra and a python is the probability of not choosing either two cobras or two pythons. Because of this we can subtract those probabilities from one to get the probability of choosing a cobra and a python.
In this case, if we sampled with replacement the probability of choosing a cobra and a python would be 1 minus the probability of two cobras (which is 0.4 squared equals 0.16) minus the probability of two pythons (which is 0.6 squared equals 0.36). This would give us a value of 0.48.
Alternatively, if we sampled without replacement the probability of choosing a cobra and a python would be 1 minus the probability of two cobras (which is 0.15758) minus the probability of two pythons (which is 0.35758) which is 0.48484.
You can see that these two probabilities are similar, but they're different because of the different type of sampling being used.
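
The complement calculation can be sketched in Python for both kinds of sampling. The variable names are illustrative assumptions.

```python
from fractions import Fraction

cobras, pythons = 40, 60
total = cobras + pythons

# Probability that both snakes are the same species, with replacement.
same_with = Fraction(cobras, total) ** 2 + Fraction(pythons, total) ** 2

# Same event without replacement: the second draw has one fewer matching
# snake and one fewer snake overall.
same_without = (Fraction(cobras, total) * Fraction(cobras - 1, total - 1)
                + Fraction(pythons, total) * Fraction(pythons - 1, total - 1))

# Complement rule: P(different species) = 1 - P(same species).
different_with = 1 - same_with
different_without = 1 - same_without

print(float(different_with))                # 0.48
print(round(float(different_without), 5))   # 0.48485
```

Without rounding the intermediate values, the second result is 16/33, or 0.48485 to five places, which matches the 0.48484 above up to rounding.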

Slide 9

To recap. We have a sample space of 100 snakes where the probability of choosing a cobra is 0.4 and the probability of choosing a python is 0.6.
The probability of choosing a snake and it is a cobra or python is 0.4 + 0.6 equals 1.
The probability of choosing two snakes and they are both cobras depends on whether you sample with or without replacement but is approximately 16%
The probability of choosing two snakes and they're both pythons depends on whether you sample with or without replacement but is approximately 36%
And the probability of choosing two snakes and they are different also depends on whether you sample with or without replacement but is approximately 48%

Slide 10

The size of the sample space can make a big difference in terms of how much the type of sampling influences the probabilities we obtain.
Let's think about the probability of choosing two cobras, two pythons, or two snakes that are different.
If we sample with replacement those probabilities are 0.16, 0.36, and 0.48 as shown on the left in blue.
If we calculate those probabilities without replacement using the sample space we've been working with we get the probabilities shown in the center column of the three columns in red.
If our overall sample space is smaller, but the relative numbers of the snakes are the same, that would give us the probabilities in the red column on the furthest left. You can see that they can be quite different from the values we obtained using sampling with replacement.
If our overall sample space is larger, but the relative numbers of the snakes are the same, that would give us the probabilities in the red column on the furthest right. Now you can see that those values are very close to the values we obtained using sampling with replacement.
Once our sample space gets large enough, then using the simpler, sampling with replacement, equations can be very accurate.
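
To see this effect numerically, we can repeat the without-replacement calculation for several population sizes with the same 40:60 ratio. The transcript does not state the sizes used on the slide, so the sizes 10, 1000, and 10000 below are assumptions chosen only for illustration, and `two_draw_probabilities` is a hypothetical helper name.

```python
from fractions import Fraction

def two_draw_probabilities(cobras, pythons):
    """Return (two cobras, two pythons, one of each) for two draws
    without replacement."""
    total = cobras + pythons
    both_cobras = Fraction(cobras, total) * Fraction(cobras - 1, total - 1)
    both_pythons = Fraction(pythons, total) * Fraction(pythons - 1, total - 1)
    different = 1 - both_cobras - both_pythons
    return both_cobras, both_pythons, different

# With replacement the answers are 0.16, 0.36, and 0.48 at every size.
# Only the size 100 comes from the transcript; the others are assumed.
for total in (10, 100, 1000, 10000):
    cobras = total * 4 // 10          # keep the 40:60 ratio
    pythons = total - cobras
    values = two_draw_probabilities(cobras, pythons)
    print(total, [round(float(v), 4) for v in values])
```

As the population grows, the printed values move toward 0.16, 0.36, and 0.48.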

Slide 11

Let's look at some similar probability calculations using a second data set, a sample space of 200 birds. The sample space has 20 ducks, 80 gulls, 40 eagles, and 60 hawks.
Again, we will choose individuals from the sample space, and the event will be which type of individual gets chosen.
If we sample with replacement, the probability of a duck would be 20 out of the total sample space of 200 equals 0.1 and the probability of an eagle would be 40 out of the total sample space of 200 equals 0.2.
The probability of choosing an individual and it is a duck or an eagle would be the sum of the individual probabilities which would be 0.1 + 0.2 equals 0.3.
The probability of choosing two birds and we obtain a duck and an eagle would be the probability of choosing a duck, then an eagle, which is 0.1 times 0.2 - plus the probability of choosing an eagle, then a duck, which would be 0.2 times 0.1. That would be 2 times 0.02 equals 0.04
The probability of choosing four birds and they are all gulls would be the probability of choosing a gull which is 80 out of 200 equals 0.4 for the first individual, times 0.4 for the second, times 0.4 for the third, times 0.4 for the fourth, which equals 0.4 raised to the fourth power which equals 0.0256.
While the probability of duck or eagle doesn't depend on what type of sampling we use, the last two probabilities will be different if we sample without replacement.
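
The three with-replacement results for the bird data can be reproduced with a short Python sketch. The counts are from the transcript; the variable names are illustrative assumptions.

```python
from fractions import Fraction

birds = {"duck": 20, "gull": 80, "eagle": 40, "hawk": 60}  # counts from the transcript
total = sum(birds.values())                                # 200 birds
p = {kind: Fraction(count, total) for kind, count in birds.items()}

# One draw: duck and eagle are mutually exclusive, so the probabilities add.
duck_or_eagle = p["duck"] + p["eagle"]

# Two draws with replacement: duck then eagle, plus eagle then duck.
duck_and_eagle = p["duck"] * p["eagle"] + p["eagle"] * p["duck"]

# Four draws with replacement: the same gull probability four times over.
four_gulls = p["gull"] ** 4

print(float(duck_or_eagle), float(duck_and_eagle), float(four_gulls))  # 0.3 0.04 0.0256
```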

Slide 12

Let's look at the probability of choosing a duck and an eagle when we use sampling without replacement. For this we will use our probability tree.
There's two different ways this can occur, duck and then eagle or eagle and then duck so we will have to go through our probability tree twice.
First option. Starting on the left we look at the probability of duck and that's 20 out of the sample space of 200. Following to the second branch now we will follow eagle which will occur with a probability of 40 out of the new smaller sample space of 199. If we multiply those two probabilities, we get 0.020101.
Second option. Starting on the left we look at the probability of eagle and that's 40 out of the sample space of 200. Following to the second branch now we will follow duck which will occur with the probability of 20 out of the new smaller sample space of 199. If we multiply those two probabilities together, we get 0.020101.
Our overall probability of getting a duck and an eagle will be the sum of these two probabilities which is 0.0402.
One thing to note about our probability tree is that the probabilities shown there don't add up to one like they did for the first example in this video. That's because in the first example all possible pairs of outcomes were shown, but in this example we only focused on a subset of the entire set of possibilities.
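
The two routes through the tree can be written out directly. This is a minimal Python sketch with illustrative variable names.

```python
from fractions import Fraction

ducks, eagles, total = 20, 40, 200

# Route 1: duck first (20/200), then eagle from the remaining 199 birds (40/199).
duck_then_eagle = Fraction(ducks, total) * Fraction(eagles, total - 1)

# Route 2: eagle first (40/200), then duck from the remaining 199 birds (20/199).
eagle_then_duck = Fraction(eagles, total) * Fraction(ducks, total - 1)

duck_and_eagle = duck_then_eagle + eagle_then_duck

print(round(float(duck_then_eagle), 6))  # 0.020101
print(round(float(duck_and_eagle), 4))   # 0.0402
```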

Slide 13

Let's look at the probability of choosing four gulls when we use sampling without replacement and our probability tree.
We start on the left and the probability of choosing a gull from the initial sample space is the 80 gulls out of 200 birds. Moving to the second branch now there are 79 gulls remaining in a sample space of 199. Moving on to the third branch now there are 78 gulls out of the sample space of 198. Finally, for the fourth branch there are 77 gulls out of the sample space of 197. Those are the four probabilities that we multiply in order to obtain the overall probability of choosing four gulls.
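
The transcript lists the four branch probabilities but not their product. A minimal Python sketch of that multiplication, with illustrative variable names, could look like this.

```python
from fractions import Fraction
from math import prod

gulls, total, draws = 80, 200, 4

# One factor per branch of the tree: 80/200, 79/199, 78/198, 77/197.
branches = [Fraction(gulls - i, total - i) for i in range(draws)]

four_gulls = prod(branches)
print(float(four_gulls))
```

The product comes out to roughly 0.0245, a little lower than the 0.0256 obtained when sampling with replacement.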

Slide 14

To recap. We have a sample space of 200 birds where the probability of choosing a duck is 0.1 and the probability of choosing an eagle is 0.2.
The probability of choosing a duck or an eagle is 0.3, but the probability of choosing a duck and an eagle depends on whether we are sampling with or without replacement.

Slide 15

Just as we did with the snake example, we can look at how the size of the sample space is related to the mismatch between our probabilities for choosing a duck and an eagle or for choosing four gulls.
In blue on the left are our probabilities for sampling with replacement and the other three columns show the probabilities if we sampled without replacement. The center column of those three is the sample space we just looked at. To the left is a smaller sample space and to the right is a larger sample space, but with relative numbers of birds being the same in both.
Again, we can see that for small sample spaces there can be a large mismatch between probabilities calculated using sampling with or without replacement, but when the sample spaces get very large those mismatches get very small.

Slide 16

To sum it all up, this is our approach when calculating basic probabilities.
The probability of A or B equals the probability of A plus the probability of B when A and B are mutually exclusive, and that doesn't depend on what type of sampling we're doing.
The probability of A and B equals the probability of A times the probability of B when A and B are independent, which is the case when we sample with replacement, and is approximately true when the population is very large. Otherwise the probability of A and B equals the probability of A times the probability of B given A and we generally have to use a probability tree to calculate our overall probabilities.

Zoom out

I hope you found these two examples useful. Probability calculations can become easy once you've practiced and done examples on your own, but it takes a little bit of time to get there.
This channel and the website have other videos and examples that can help you learn more about probability and statistics.

End screen

If you found this video useful, click to like the video or click to subscribe. You know you probably want to.



This information is intended for the greater good; please use statistics responsibly.