Connect with StatsExamples here

TRANSCRIPT OF VIDEO:

Slide 1

Welcome to this introduction to probability. In this video I'll introduce the definitions and foundational equations used to calculate probabilities. The channel has other videos with walkthroughs of specific examples to watch after this one.

Slide 2

First things first, when we are thinking about probability we consider the probabilities of events within sample spaces of possible outcomes.
The technical term event describes the outcome we focus on within a sample space, which is the set of all possible outcomes.
The diagram shows what we mean: an event is typically a subset of all the possible outcomes.
When we write about probabilities, we use the text P(A) to represent the probability of event A. We read this text as "the probability of A".

Slide 3

Sometimes we also consider probabilities of events other than a specific event. For example, the probability of the outcome not being A.
To represent this there are a variety of notations: P with a not symbol and A in the parentheses, written P(¬A); P with a tilde A in the parentheses, P(~A); P with an A prime in the parentheses, P(A′); P with an A with an overbar in the parentheses, P(Ā); and P with an A having an exponent of the letter C (which stands for complement) in the parentheses, P(Aᶜ). I prefer the first of these because it's the least ambiguous, but you'll want to keep in mind that the other four are often used, depending on the preferences of the author.

Slide 4

Let's look at an example. Let's think of a sample space which is a bowl of M&Ms of different colors, and the event will be what happens when we pick an M&M out of the bowl.
For this example, let's think about a bowl with 20 red, 20 yellow, 30 brown, and 30 green M&Ms so that there are 100 M&Ms total for our sample space.
Let's think of A as being choosing a red. The probability of that outcome when we choose an M&M randomly will be the 20 outcomes that satisfy the event divided by the size of the sample space, which is 100. 20 / 100 gives us a probability of red of 0.2.
We can also think about the probability of not red. That is, choosing an M&M with a color different from red. In this case there are 80 outcomes that satisfy that condition so the probability of choosing an M&M that is not red is 80 / 100 equals 0.8.
We can also think about a couple of other situations, impossible events and certain events.
What is the probability of choosing an M&M from this sample space and finding that it is purple? That probability would be 0 divided by 100 equals 0, which represents the probability of an impossible event.
What is the probability of choosing an M&M from this sample space and finding that it is an M&M? That probability would be 100 divided by 100 equals 1, which represents the probability of a certain event.
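The counting in this slide can be sketched in a few lines of Python. This is only an illustration under the assumption that every M&M is equally likely to be picked; the names `bowl` and `prob` are made up for the sketch and are not from the video.

```python
from fractions import Fraction

# Bowl contents taken from the example: color -> number of M&Ms.
bowl = {"red": 20, "yellow": 20, "brown": 30, "green": 30}
total = sum(bowl.values())  # size of the sample space (100)

def prob(colors):
    """Probability that a randomly chosen M&M has one of the given colors."""
    favorable = sum(bowl.get(color, 0) for color in colors)
    return Fraction(favorable, total)

p_red = prob(["red"])                                   # 20/100
p_not_red = prob([c for c in bowl if c != "red"])       # 80/100
p_purple = prob(["purple"])                             # impossible event
p_any_color = prob(list(bowl))                          # certain event

print(float(p_red), float(p_not_red), float(p_purple), float(p_any_color))
```

Using `Fraction` keeps the arithmetic exact, so 20/100 and 80/100 add up to exactly 1 rather than to a rounded floating-point value.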

Slide 5

The scenario we just looked at illustrates several basics about probability.
The probability of an impossible event is 0 and the probability of a certain event is one, therefore the probability of any event must always be between zero and one and can include zero or one.
We also saw that the probability of choosing a red M&M plus the probability of choosing a not red M&M gave us values that added up to one, which is an example of the complementation rule. This rule states that the probability of A plus the probability of not A equals 1.

Slide 6

Going back to our example of the M&Ms we can see how we could have used the complementation rule directly to calculate the probability of not red.
The complementation rule gives us the equation: probability of red plus probability of not red equals 1, which we can rearrange by subtracting the probability of red from both sides to solve for the probability of not red.
If we had already calculated the probability of red to be 0.2, then we could plug that in to find the probability of not red as being equal to 1 - 0.2 equals 0.8.
This example is easy, but this rule allows us to get a difficult probability if the complement is easy. It's often the case that some probabilities are easier to calculate than others, and of course we usually are interested in the one that's harder to calculate, so rules like this are used to obtain those difficult probabilities.
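As a small sketch of the rearranged complementation rule, the helper below returns the probability of not A from the probability of A. The function name `complement` is hypothetical, chosen just for this illustration.

```python
from fractions import Fraction

def complement(p):
    """Return P(not A) given P(A), using P(A) + P(not A) = 1."""
    if not 0 <= p <= 1:
        raise ValueError("a probability must lie between 0 and 1 inclusive")
    return 1 - p

p_red = Fraction(20, 100)        # probability of red from the M&M example
p_not_red = complement(p_red)    # 1 - 0.2

print(float(p_not_red))
```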

Slide 7

The complementation rule we just looked at works because A and not A are mutually exclusive, which means that a single outcome cannot be both.
For example, a single M&M can't be both red and also not red at the same time.
But not all events are mutually exclusive.
For example, let's think about the probabilities of drawing cards from a deck of cards.
In this sample space, drawing a card that is a heart and drawing a card that is a club would be mutually exclusive because there are no cards that could satisfy both those outcomes.
On the other hand, drawing a card that is a heart and drawing a card that is a jack would not be mutually exclusive, because there is a card that would fulfill both outcomes.

Slide 8

Let's look at the card example a little bit more and draw some diagrams to illustrate what's happening.
Drawing a heart and a club are mutually exclusive events which we can visualize by looking at the diagram on the left where the 13 clubs fill some of the sample space and the 13 hearts fill some of the sample space but there is no overlap because there are no cards that are both clubs and hearts.
Likewise, drawing a jack and a two are mutually exclusive events which we can visualize by looking at the diagram in the center where the 4 jacks fill some of the sample space and the 4 twos fill some of the sample space, but there is no overlap because there are no cards that are both jacks and twos.
However, drawing a heart and a Jack are not mutually exclusive because when we look at our sample space, the subset of four jacks and the subset of 13 hearts do overlap because there is a card that is both a heart and a Jack.

Slide 9

This leads us to our first set of probability rules, addition rules. These are used when we are trying to calculate the probability of the event being something or something else.
Looking at our diagrams let's think about the probabilities of drawing a card that is one or another of our two outcomes. Let's look at the first two examples.
The probability of drawing a card that is a club or a heart would be the 26 cards that satisfy either of those outcomes out of the sample space of 52, giving us a probability of 0.5.
The probability of drawing a card that is a jack or a two would be the 8 cards that satisfy either of those outcomes out of the sample space of 52, giving us a probability of 0.154.
Instead of using diagrams like these, we can use the special addition rule, which works for mutually exclusive cases.
If A and B are mutually exclusive events, then the probability of A or B is equal to the probability of A plus the probability of B.
For our clubs and hearts example, the probability of clubs or hearts would be equal to 13 out of 52 (the probability of a club), plus 13 out of 52 (the probability of a heart), equals 26 out of 52 equals 0.5.
For our jacks and twos example, the probability of jacks or twos would be equal to 4 out of 52 (the probability of a jack), plus 4 out of 52 (the probability of a two), equals 8 out of 52 equals 0.154.
It's a little more complicated for our third example when the events are not mutually exclusive.
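The mutually exclusive card examples on this slide can be checked by counting cards directly. The sketch below assumes a standard 52-card deck and represents each event as a set of cards; the names `deck` and `prob` are illustrative choices, not something from the video.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(RANKS, SUITS))  # 52 (rank, suit) pairs

def prob(event):
    """Probability of an event, given as a set of cards from the deck."""
    return Fraction(len(event), len(deck))

clubs = {card for card in deck if card[1] == "clubs"}
hearts = {card for card in deck if card[1] == "hearts"}
jacks = {card for card in deck if card[0] == "J"}
twos = {card for card in deck if card[0] == "2"}

# Mutually exclusive events share no outcomes.
clubs_hearts_exclusive = clubs.isdisjoint(hearts)
jacks_twos_exclusive = jacks.isdisjoint(twos)

# "A or B" is the set union; compare the direct count with the special rule.
p_club_or_heart = prob(clubs | hearts)
p_jack_or_two = prob(jacks | twos)

print(p_club_or_heart == prob(clubs) + prob(hearts))
print(p_jack_or_two == prob(jacks) + prob(twos))
```

Both comparisons agree here because the events in each pair do not overlap, which is exactly the condition the special addition rule requires.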

Slide 10

Now let's think about a scenario in which our events are not mutually exclusive. The overlapping outcomes in the sample space make things a little more complicated.
For the jacks and hearts example, the probability of jacks or hearts would start from 4 out of 52 (the probability of a jack) plus 13 out of 52 (the probability of a heart), but we can't simply add them because that counts the card in the overlap area, the jack of hearts, twice. We need to subtract the outcomes in the overlap.
This gives us our second addition rule, which is the general addition rule. If A and B are not mutually exclusive events, then the probability of A or B is equal to the probability of A plus the probability of B, minus the probability of A and B.
For our jacks and hearts example, the probability of jacks or hearts would be equal to 4 out of 52 (the probability of a jack), plus 13 out of 52 (the probability of a heart), minus the probability of drawing the jack of hearts which is 1 out of 52. This gives us 4/52 + 13/52 - 1/52 which equals 16 out of 52 equals 0.308.
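To see the double counting concretely, here is a minimal sketch that compares a direct count of "jack or heart" cards with the general addition rule. It assumes a standard 52-card deck modeled as a set of (rank, suit) pairs; `deck` and `prob` are names invented for the sketch.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(RANKS, SUITS))

def prob(event):
    """Probability of an event, given as a set of cards from the deck."""
    return Fraction(len(event), len(deck))

jacks = {card for card in deck if card[0] == "J"}
hearts = {card for card in deck if card[1] == "hearts"}
overlap = jacks & hearts  # only the jack of hearts

# Direct count of the union versus the general addition rule.
p_direct = prob(jacks | hearts)
p_rule = prob(jacks) + prob(hearts) - prob(overlap)

# Adding without subtracting the overlap counts the jack of hearts twice.
p_naive = prob(jacks) + prob(hearts)

print(p_direct, p_rule, p_naive)
```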

Slide 11

To put it all together, we really have a single addition rule called the general addition rule, which states that the probability of A or B is equal to the probability of A plus the probability of B minus the probability of A and B.
But if A and B are mutually exclusive, then the probability of A and B is 0, so the last part of that equation disappears and it simplifies to the special addition rule that the probability of A or B is equal to the probability of A plus the probability of B.
So how is this useful?
First, we can use these equations to solve for difficult probabilities when we have the others. They are mathematical equations with three or four values so if there was one we wanted to know, and we could get all the other ones, we could just solve for the one we want.
Second, we can determine if events are mutually exclusive by separately measuring and then comparing P(A or B), P(A), and P(B) to see if the special addition rule works or not.
If it works, then we know the events are mutually exclusive, but if it does not work, then we know they are not mutually exclusive.
This isn't useful for an example like playing cards where it's easy to understand the sample space and everything is obvious, but there are lots of other situations when we are studying new phenomena where we may be able to measure these probabilities before we understand how all the outcomes and events in the sample space really work. Determining if certain events are mutually exclusive, or not, can be valuable information when studying a novel situation.
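Both uses can be sketched with the card probabilities from the earlier slides. The function names below are hypothetical, and the check uses exact fractions; with real measurements the values would carry sampling error, which this sketch does not handle.

```python
from fractions import Fraction

def fits_special_addition_rule(p_a, p_b, p_a_or_b):
    """True when P(A or B) equals P(A) + P(B), so no overlap is left over."""
    return p_a_or_b == p_a + p_b

def implied_overlap(p_a, p_b, p_a_or_b):
    """Solve the general addition rule for P(A and B)."""
    return p_a + p_b - p_a_or_b

# Clubs and hearts: 13/52, 13/52, and 26/52 for "club or heart".
clubs_vs_hearts = fits_special_addition_rule(
    Fraction(13, 52), Fraction(13, 52), Fraction(26, 52))

# Jacks and hearts: 4/52, 13/52, and 16/52 for "jack or heart".
jacks_vs_hearts = fits_special_addition_rule(
    Fraction(4, 52), Fraction(13, 52), Fraction(16, 52))
jack_heart_overlap = implied_overlap(
    Fraction(4, 52), Fraction(13, 52), Fraction(16, 52))

print(clubs_vs_hearts, jacks_vs_hearts, jack_heart_overlap)
```

The leftover 1/52 in the second case is the probability of the jack of hearts, the one card in the overlap.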

Slide 12

So to quickly recap: probability looks at events inside of a sample space. The probability of an impossible event is zero and the probability of a certain event is one, therefore for any event A the probability of A is between zero and one, inclusive.
We have a complementation rule that tells us that the probability of A plus the probability of not A equals 1.
And we have a pair of addition rules that tell us the probability of A or B is equal to the probability of A plus the probability of B minus the probability of A and B. This simplifies down into probability of A or B equals probability of A plus probability of B if A and B are mutually exclusive.
These last two rules are for figuring out the probability of A or B. What about an equation for the probability of A and B?

Slide 13

For figuring out the probability of A and B we will use a multiplication rule.
The probability of A and B equals the probability of A multiplied by the probability of B given A.
That second term, with the vertical line between B and A in the parentheses, indicates what's called a conditional probability. B vertical line A represents us considering event B when we know that event A has occurred. You can think about the whole term as the probability of B when A is true, and it is usually read aloud as "probability of B given A".
Let's think about 3 examples of probability of B given A using our deck of cards.
If event A is club and event B is heart, what is the probability of B given A? This would be the probability of the card being a heart if we already knew it was a club, so it would be 0.
If event A is jack and event B is heart, what is the probability of B given A? This would be the probability of the card being a heart if we already knew it was a jack, so it would be 1/4.
If event A is heart and event B is jack, what is the probability of B given A? This would be the probability of the card being a jack if we already knew it was a heart, so it would be 1/13.
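These three conditional probabilities can be reproduced by restricting attention to the cards in A and counting how many of them are also in B. The sketch assumes a standard 52-card deck with equally likely cards; `cond_prob` is a name made up for this illustration.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(RANKS, SUITS))

clubs = {card for card in deck if card[1] == "clubs"}
hearts = {card for card in deck if card[1] == "hearts"}
jacks = {card for card in deck if card[0] == "J"}

def cond_prob(b, a):
    """P(B given A): the share of the outcomes in A that are also in B."""
    return Fraction(len(a & b), len(a))

p_heart_given_club = cond_prob(hearts, clubs)   # no club is a heart
p_heart_given_jack = cond_prob(hearts, jacks)   # one of the four jacks
p_jack_given_heart = cond_prob(jacks, hearts)   # one of the thirteen hearts

print(p_heart_given_club, p_heart_given_jack, p_jack_given_heart)
```

Note that swapping A and B changes the answer: heart given jack is 1/4, while jack given heart is 1/13.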

Slide 14

Let's see an example of using the multiplication rule.
Our rule is that the probability of A and B is the probability of A times the probability of B given A.
Let's think about drawing a card that is a jack and a heart, so event A will be jack and event B will be hearts.
Our rule therefore gives us the result that the probability of drawing a jack and hearts would be 1 out of 52, because that's the probability of drawing a jack, 4/52, multiplied by 1/4, the probability of drawing a heart if we already knew we had drawn one of the four jacks. Multiplying those together gives us one out of 52, which matches what we know the answer should be because there is only one jack of hearts in the deck.
If we think about it, though, for this example the probability of B given A equals the probability of B itself.
The probability of B given A was the probability of hearts given jack and was equal to 1/4, but that's the same as the probability of B, which is the probability of hearts, which is 13 out of 52, which is 1/4.
When we have a situation like this where the probability of B equals probability of B given A we say that the events A&B are independent.
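The arithmetic for the jack of hearts, and the independence check that follows from it, can be written out as a short sketch. The variable names are illustrative only.

```python
from fractions import Fraction

p_jack = Fraction(4, 52)               # P(A): four jacks in the deck
p_heart = Fraction(13, 52)             # P(B): thirteen hearts in the deck
p_heart_given_jack = Fraction(1, 4)    # P(B given A): one of the four jacks is a heart

# General multiplication rule: P(A and B) = P(A) * P(B given A).
p_jack_and_heart = p_jack * p_heart_given_jack

# Independence: knowing A does not change the probability of B.
is_independent = (p_heart_given_jack == p_heart)

print(p_jack_and_heart, is_independent)
```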

Slide 15

The property of independence allows us to simplify our multiplication rule in a similar manner as we simplified our addition rule.
The general multiplication rule is that the probability of A and B equals the probability of A multiplied by the probability of B given A.
This simplifies to the special multiplication rule, where the probability of A and B equals the probability of A multiplied by the probability of B, when events A and B are independent.
The special multiplication rule is fairly simple and allows us to calculate probabilities fairly easily when events A and B are independent.
However, events A and B are not always independent. When this happens we have to use conditional probabilities, and we would usually use a probability tree diagram to calculate the probabilities.

Slide 16

Let's look at a scenario in which events A&B are not independent. Consider the sample space of 100 animals in the box in which each is female or male and each is a cat or dog.
Now let's think about the probability of choosing an animal and it is male and a cat.
Looking at the sample space directly we can see that there are 40 male cats out of the 100 animals so our result should be 40 / 100 is 0.4.
However we can't just multiply the probability of male times the probability of cat as shown because this would give us 60 / 100 times 50 / 100 equals 0.6 times 0.5 equals 0.3 for our answer. The answer is incorrect because the sex of the animal and the species of the animal are not independent.
Instead, if we only had the probabilities to work with we should construct a probability tree as shown.
We depict the set of possible outcomes for the first event on the left and draw two branches, one for each of the possible outcomes. Then for each of those outcomes we draw two branches for the possible outcomes of the second event.
It's here we can see the non-independence. If the animal is a female, the probability of it being a cat is 10 out of 40, which is 0.25, and the probability of it being a dog is 0.75. But if the animal is a male, the probability of it being a cat is 40 out of 60, which is about 0.67, and the probability of it being a dog is about 0.33. The probability of the chosen animal being a dog or a cat depends on its sex.
Using this diagram, we start on the left and take the bottom branch with the probability of 0.6 because 60 out of 100 of the animals are male. Then for the second set of branches we take the upper branch representing cat for the species, and that gives us a probability of 40 out of 60, or about 0.67. To get our final probability we multiply: 0.6 times 40/60 equals 0.4, which is the correct answer.
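The tree calculation can be sketched from the counts in this example. The female and male dog counts (30 and 20) are not stated outright; they are inferred here from the 40 females, 60 males, and the stated cat counts. The dictionary layout and variable names are assumptions made for the sketch.

```python
from fractions import Fraction

# Counts for the 100 animals: (sex, species) -> number of animals.
counts = {
    ("female", "cat"): 10,
    ("female", "dog"): 30,   # inferred: 40 females minus 10 female cats
    ("male", "cat"): 40,
    ("male", "dog"): 20,     # inferred: 60 males minus 40 male cats
}
total = sum(counts.values())

n_male = sum(n for (sex, _), n in counts.items() if sex == "male")
n_cat = sum(n for (_, species), n in counts.items() if species == "cat")

p_male = Fraction(n_male, total)                            # first branch: 0.6
p_cat = Fraction(n_cat, total)                              # 0.5
p_cat_given_male = Fraction(counts[("male", "cat")], n_male)  # second branch: 40/60

p_naive = p_male * p_cat                  # wrongly assumes independence
p_tree = p_male * p_cat_given_male        # follows the branches of the tree
p_direct = Fraction(counts[("male", "cat")], total)

print(float(p_naive), float(p_tree), float(p_direct))
```

Keeping 40/60 as an exact fraction avoids the rounding in 0.67, so the product along the branches comes out to exactly 0.4.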

Slide 17

To put it all together, we really have a single multiplication rule called the general multiplication rule, which states that the probability of A and B is equal to the probability of A multiplied by the probability of B given A.
But if A and B are independent, then the probability of B given A is equal to the probability of B, so the last part of that equation simplifies and we get the special multiplication rule that the probability of A and B is equal to the probability of A times the probability of B.
So how is this useful?
First, as we saw with the addition rules earlier, we can use these equations to solve for difficult probabilities when we have the others. If there is a particular probability we wish to know, we could get all the other ones and solve for the one we want.
Second, we can determine if events are independent by separately measuring and then comparing P(A and B), P(A), and P(B) to see if the special multiplication rule works or not.
If it works, then we know the events are independent, but if it does not work, then we know they are not independent.
This isn't useful for an example like playing cards where it's easy to understand the sample space and everything is obvious, but there are lots of other situations when we are studying new phenomena where we may be able to measure these probabilities before we understand how all the outcomes and events in the sample space really work.
For example what if our sample space was all humans and we wanted to know whether the probability of having bladder cancer was independent of the sex of the individual? We can't possibly measure the entire sample space, but we could measure those three probabilities separately. We could then see whether they work in the special multiplication rule, which would tell us that risk of bladder cancer does not depend on sex, or they wouldn't work, in which case we would know that risk of bladder cancer does depend on sex.
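No numbers are given for the bladder cancer question, so the sketch below applies the same comparison to the two examples that do have numbers in this video: jacks and hearts, and male cats. The function name is hypothetical, and the check is exact; with measured probabilities one would need to allow for sampling error, which this sketch ignores.

```python
from fractions import Fraction

def fits_special_multiplication_rule(p_a, p_b, p_a_and_b):
    """True when P(A and B) equals P(A) * P(B), i.e. A and B look independent."""
    return p_a_and_b == p_a * p_b

# Jack and heart: P(jack) = 4/52, P(heart) = 13/52, P(jack and heart) = 1/52.
jack_heart_independent = fits_special_multiplication_rule(
    Fraction(4, 52), Fraction(13, 52), Fraction(1, 52))

# Male and cat: P(male) = 60/100, P(cat) = 50/100, P(male and cat) = 40/100.
male_cat_independent = fits_special_multiplication_rule(
    Fraction(60, 100), Fraction(50, 100), Fraction(40, 100))

print(jack_heart_independent, male_cat_independent)
```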

Slide 18

A final relationship that is used in some probability calculations is called Bayes theorem, and we can see where it comes from with a little algebra.
First, the probability of A and B equals the probability of B and A.
Second, by the general multiplication rule, the probability of A and B equals the probability of A times the probability of B given A. Similarly, by the multiplication rule, the probability of B and A equals the probability of B times the probability of A given B.
Therefore, if we plug the equations from the second and third lines into the first line, we get probability of A times probability of B given A equals probability of B times probability of A given B.
Dividing both sides by the probability of B gives us Bayes theorem as shown.
The probability of A given B equals the probability of A times the probability of B given A, divided by the probability of B.
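Written out in symbols, the steps read as follows. This is a transcription of the spoken algebra into LaTeX, using the intersection symbol for "and" and a vertical bar for "given"; the last step assumes the probability of B is not zero.

```latex
\begin{align*}
P(A \cap B) &= P(B \cap A) \\
P(A \cap B) &= P(A)\,P(B \mid A) \\
P(B \cap A) &= P(B)\,P(A \mid B) \\
P(A)\,P(B \mid A) &= P(B)\,P(A \mid B) \\
P(A \mid B) &= \frac{P(A)\,P(B \mid A)}{P(B)}
\end{align*}
```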

Slide 19

What is Bayes theorem used for? We can use this equation to solve for conditional probabilities, which are often hard to calculate directly. But conditional probabilities are often very interesting and reveal surprising results.
For example imagine a sample space of people where some have a disease and some do not, and we give them a medical test and some return positive results and some return negative results.
For our disease event we will consider a rare disease where there is a 2% probability that a randomly chosen individual has the disease. Having the disease will be considered event A.
For the testing event we will consider a test that is mostly accurate, but 1% of the time gives the wrong result. A positive test result will be considered event B.
An interesting question to think about is: what is the probability that a random person who tests positive has the disease? In other words, the probability of A given B.
Using Bayes theorem, the probability of A given B is the probability of A times the probability of B given A, divided by the probability of B.
For the numerator, the probability of A was 2% and the probability of getting a positive test result if you have the disease is 0.99 because the test has a 1% false negative rate.
For the denominator we need to think about two groups of people that get positive tests. The first set is the 2% of the population that has the disease multiplied by the 99% accuracy of the test. The second set is the 98% of the population that does not have the disease multiplied by the 1% false positive rate.
Plugging all those numbers in and doing the calculations gives us a value of 0.67 for the probability of A given B.
Let's think about what that means. There is a disease that is only present in 2% of the population, but if we randomly test people, then the ones who get a positive test result only really have a 67% chance of having the disease. About one third of the people getting positive test results are not sick even though our test has 99% accuracy.
This is an unintuitive result, but it's easy to calculate using Bayes theorem.
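The calculation can be reproduced with the numbers from this slide: a 2% chance of having the disease, a 1% false negative rate, and a 1% false positive rate. This is a minimal sketch and the variable names are chosen just for illustration.

```python
from fractions import Fraction

p_disease = Fraction(2, 100)             # P(A): has the disease
p_pos_given_disease = Fraction(99, 100)  # P(B given A): 1% false negative rate
p_pos_given_healthy = Fraction(1, 100)   # 1% false positive rate

# Denominator P(B): positives from the sick group plus positives from the healthy group.
p_pos = (p_disease * p_pos_given_disease
         + (1 - p_disease) * p_pos_given_healthy)

# Bayes theorem: P(A given B) = P(A) * P(B given A) / P(B).
p_disease_given_pos = p_disease * p_pos_given_disease / p_pos

print(round(float(p_disease_given_pos), 2))
```

The healthy group is so much larger than the sick group that even a 1% false positive rate produces about half as many false positives as true positives, which is where the roughly one-in-three figure comes from.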

Slide 20

This slide pretty much sums up the basics of probability.
Probabilities are always between 0 for impossible and 1 for certain. We can use addition rules to calculate the probability of A or B, and which one we use depends on whether they are mutually exclusive. We can use multiplication rules to calculate the probability of A and B, and which one we use depends on whether they are independent.
Probabilities are generally fairly straightforward when the events are independent, but when they are not independent we have to use probability trees or Bayes theorem to take into account the conditional probabilities.

Zoom out

I hope you found this useful.
However, the best approach to really understanding probability is to do practice questions, not just memorize rules.
The channel and website have additional examples you can look at to get some practice so you can really increase your chances of getting probability questions correct.

End screen

Odds are, if you liked this video, you'll find the suggested channel and video equally useful.

This information is intended for the greater good; please use statistics responsibly.