TYPE I AND II ERRORS

The text below is a transcript of the video.


Connect with StatsExamples here


LINK TO SUMMARY SLIDES FOR VIDEO:


StatsExamples-type-I-and-II-errors.pdf

TRANSCRIPT OF VIDEO:


Slide 1

Type 1 and type 2 errors describe when we get the wrong conclusions from our statistical analysis, but they are not mistakes. To see why this is, let's look at what type 1 and type 2 errors are.

Slide 2

When we do statistics we typically use a sample to make an educated guess about the population. That's generally because we can't measure the entire population we want to know about, but we can measure a sample which we hope accurately represents the population we care about.
But sampling error, that is noise and randomness, means that our samples are sometimes misleading. Just because of chance, sometimes the sample data does not provide a good snapshot of the population.
This is statistical error due to randomness. The word "error" isn't saying we did a bad job, it's saying that the sample doesn't always match the population because of randomness.
Again, this is not a mistake on our part. This kind of error only becomes a mistake if we don't realize it might be happening and assume that our samples always give us exactly the right answer. That's a mistake.
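To see sampling error in action, here is a minimal sketch that draws several small samples from the same population and prints each sample mean. The population values (a normal distribution with mean 100 and standard deviation 15), the sample size, and the function name sample_means are all assumptions made up for illustration, they are not from the video.

```python
import random
import statistics


def sample_means(population_mean, population_sd, sample_size, n_samples, seed=0):
    """Draw repeated samples from a normal population and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(population_mean, population_sd) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


# Hypothetical population: mean 100, standard deviation 15.
means = sample_means(population_mean=100.0, population_sd=15.0, sample_size=10, n_samples=5)
for m in means:
    print(f"sample mean = {m:.1f} (population mean = 100.0)")
```

Every sample comes from the same population and the arithmetic is done correctly each time, yet the sample means differ from the population mean and from each other. That spread is the sampling error.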

Slide 3

Let's think about how this comes into play when we do statistical tests.
When we do tests we generally use a null and alternative hypothesis.
And the reality of the situation is that either the null hypothesis, H0, or the alternative hypothesis, HA, is true.
We don't know for sure which is true, that's why we're doing a statistical test.
We then do our statistical test and get a decision or a conclusion, not an absolute proof.
Our decision will be that either H0 or HA is true, and we always hope that our decision matches reality.
When we do statistics, we can avoid mistakes by following good statistical procedure.
We can avoid mistakes like a badly designed hypothesis about apples where we're not measuring anything quantitative and are instead making statements about opinions. That kind of mistake can be avoided by making sure we're working with objective metrics.
We can avoid mathematical mistakes by using a computer or checking our work carefully. We can also avoid these types of mistakes by looking at our final results and seeing if the numbers make sense.
Even when we avoid mistakes, however, sampling error may still result in samples that cause us to make the wrong decision.
Because of this it's worth it to think about what kinds of wrong decisions we can make even when we do all the math correctly.

Slide 4

Broadly speaking there are two types of errors we can make when we're analyzing our statistical data. These arise when our decision does not match up with reality.
In reality either the null hypothesis or the alternative hypothesis is true.
We then use our sample data to make a decision that either H0 or HA is true by accepting or rejecting the null hypothesis.
Strictly speaking we never accept the null hypothesis, we only fail to reject it. As long as we keep this in mind we can use the common shorthand of "accepting the null hypothesis".
The figure shows the four possible outcomes.
If the null hypothesis is true and we accept it that's the correct result and we haven't made an error.
If the null hypothesis is false and we reject it then we haven't made an error.
If the null hypothesis is true and we fail to accept it, that is, we reject it, then we term this a type 1 error. Our decision doesn't match reality because we think the alternative hypothesis is true when in fact the null hypothesis is true.
If the null hypothesis is false and we accept it then we term this a type 2 error. Our decision doesn't match reality because we think the null hypothesis is true when in fact the null hypothesis is false and it's the alternative hypothesis that's true.
Both of these errors describe how our sample may mislead us about the population because of sampling error.
Exactly what these errors mean in the real world depends on what our null and alternative hypotheses are.
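The four possible outcomes can be written out as a small lookup. This is just a sketch of the logic described above, and the function name classify_outcome is a made-up name for illustration.

```python
def classify_outcome(null_is_true: bool, reject_null: bool) -> str:
    """Compare reality (is H0 true?) with our decision (did we reject H0?)."""
    if null_is_true and reject_null:
        return "type 1 error"      # rejected a true null hypothesis
    if not null_is_true and not reject_null:
        return "type 2 error"      # accepted a false null hypothesis
    return "correct decision"      # decision matches reality


# Print all four combinations of reality and decision.
for null_is_true in (True, False):
    for reject_null in (False, True):
        reality = "H0 true " if null_is_true else "H0 false"
        decision = "reject H0" if reject_null else "accept H0"
        print(f"{reality} | {decision} | {classify_outcome(null_is_true, reject_null)}")
```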

Slide 5

The challenging thing about doing statistical tests is that we cannot eliminate both types of error. There will always be at least some risk that sampling error will result in a sample that does not accurately reflect the population.
We can shift our requirements for making our decision, however. We can specify how easy or difficult it should be to accept or reject a null hypothesis. Typically this is done by specifying which p value we use as a threshold to reject the null hypothesis.
But this is a tradeoff: reducing the risk of type 1 error, by requiring a smaller p value for example, increases the risk of type 2 error, and vice versa.
So far, this whole scenario may seem kind of foreign and abstract, but you have already been exposed to these exact same ideas in a different context.
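To make the tradeoff concrete, here is a minimal sketch that computes the type 2 error rate for several p value thresholds. It assumes a specific, simple setup that is not from the video: a one-sided z test with a known standard deviation, a true effect of 0.5 standard deviations, and a sample size of 20. The function name type_2_error_rate is hypothetical.

```python
from statistics import NormalDist


def type_2_error_rate(alpha, effect_size, sample_size):
    """Type 2 error rate for a one-sided z test with known standard deviation.

    alpha       -- p value threshold for rejecting H0 (the type 1 error rate)
    effect_size -- true shift of the mean away from H0, in standard deviations
    sample_size -- number of observations in the sample
    """
    standard_normal = NormalDist()
    critical_value = standard_normal.inv_cdf(1 - alpha)  # z score needed to reject H0
    shift = effect_size * sample_size ** 0.5              # where the z score is centered when HA is true
    # We fail to reject H0 whenever the z score lands below the critical value.
    return standard_normal.cdf(critical_value - shift)


# Assumed scenario: true effect of 0.5 standard deviations, 20 observations.
for alpha in (0.10, 0.05, 0.01, 0.001):
    beta = type_2_error_rate(alpha, effect_size=0.5, sample_size=20)
    print(f"type 1 error rate = {alpha:<5}  type 2 error rate = {beta:.3f}")
```

With the data and the true effect held fixed, each stricter threshold lowers the type 1 error rate and raises the type 2 error rate.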

Slide 6

The justice system in the United States uses the exact same logic we've been looking at for statistical tests.
The null hypothesis in a court case is that the person is innocent.
The alternative hypothesis is that the person is not innocent, that is, guilty.
And court cases are attempts to make a decision about which hypothesis we think is true but we know that sometimes the decision does not match the reality.
In this situation a type 1 error is convicting an innocent person, because the null hypothesis of innocence is true, but we reject it and decide to accept the alternative hypothesis that they are guilty.
A type 2 error is when we let a guilty person go unpunished, because even though the null hypothesis of innocence is false, we end up accepting it.
This kind of awkward logic of deciding if we have evidence to reject a null hypothesis, even when we're doing this whole procedure because we think the null hypothesis probably isn't true, is something we're very familiar with.

Slide 7

The analogy with the justice system even extends to the tradeoff between type 1 and type 2 error.
By making it harder or easier to convict people we are shifting the relative risk of making a type 1 or type 2 error.
And we can see how this trade off is considered in the real world based on the consequences of making a type 1 or type 2 error in the justice system.
In criminal cases the jury is instructed to only reject the null hypothesis if they can make their conclusion "beyond a reasonable doubt". That standard makes it very hard to reject the null hypothesis, which is equivalent to a statistical test requiring the p value to be extremely small in order to reject the null hypothesis.
This strict requirement is in place because in criminal cases people can have their freedom taken away or even be executed so the consequences of a type 1 error, punishing an innocent person, are severe.
On the other hand, in civil cases the jury is instructed that they may reject the null hypothesis if they can make their conclusion "according to a preponderance of the evidence". That standard makes it easier to reject the null hypothesis, which is equivalent to a statistical test requiring the p value to be small, but not super small.
This less strict requirement is in place because in civil cases people can only be fined monetary amounts. They cannot be jailed or executed, so the consequences of a type 1 error are not as irreversible. But by having a less strict requirement, civil cases allow fewer truly guilty people to get away without facing any consequences at all.

Slide 8

There is no single perfect threshold for accepting or rejecting the null hypothesis in all criminal and civil cases.
Both type 1 and type 2 errors are miscarriages of justice, but we cannot eliminate all of them.
In fact, the relative importance and degree we are concerned about each of these types of error is one of the differences between conservatives and liberals in terms of their political philosophy.
Pretty much everyone agrees that type 1 errors are much worse than type 2 errors. But the only way to completely eliminate type 1 errors would be to make it almost impossible to convict anyone without perfect evidence, which would result in tons of type 2 errors.
There are therefore valid differences of opinion when we think about how much type 2 error we are willing to tolerate in order to minimize type 1 error.
Again, while most people agree that type 1 errors should be as rare as possible, conservatives tend to be more concerned about type 2 error than liberals and are therefore willing to tolerate higher type 1 errors in order to reduce type 2 errors.
I'm not saying that either conservatives or liberals are right or wrong, what I'm saying is that statistical concepts penetrate deeper into society than just a few classes that people take at college.
OK, statistical error is inevitable but when we know what type of errors we can't help but make, then we can be more cautious in our interpretations.
Note that type 1 errors are sometimes also called false positives and type 2 errors are sometimes called false negatives, especially when we think about medical tests.
This is because when we think about test results, the null hypothesis is usually that a person is negative for the thing being tested.
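In the medical test setting, the two error rates can be counted directly when we know who really has the condition. Here is a small sketch of that bookkeeping. The counts are made up purely for illustration and the function name error_rates is hypothetical.

```python
def error_rates(true_positives, false_positives, true_negatives, false_negatives):
    """Return (false positive rate, false negative rate) from test result counts."""
    truly_negative = false_positives + true_negatives  # people for whom H0 is true
    truly_positive = true_positives + false_negatives  # people for whom H0 is false
    false_positive_rate = false_positives / truly_negative  # type 1 error rate
    false_negative_rate = false_negatives / truly_positive  # type 2 error rate
    return false_positive_rate, false_negative_rate


# Made-up counts for illustration only, not real test data.
fpr, fnr = error_rates(true_positives=90, false_positives=45,
                       true_negatives=855, false_negatives=10)
print(f"false positive (type 1) rate: {fpr:.2f}")
print(f"false negative (type 2) rate: {fnr:.2f}")
```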

Slide 9

All of this thinking about type 1 and type 2 errors is so that we can interpret test results the right way.
The wrong way to interpret a statistical test is that a small p value indicates the null hypothesis can be rejected, or that a large p value indicates that the null hypothesis should be accepted, and that's THE ANSWER.
Statistics is not proof, it is decision making based on probabilities. And sampling error means that sometimes the improbable occurs, but we have no way of recognizing that when it does.
The right way to interpret a statistical test is that a small p value indicates the null hypothesis is probably wrong, but we may be making a type 1 error, or that a large p value indicates that the null hypothesis is probably true, but we may be making a type 2 error.
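A simulation can show why both of those cautions matter. This is a rough sketch under assumed conditions that are not from the video: a one-sided z test with known standard deviation 1, samples of 20, a threshold of 0.05, and a true mean of either 0 (null hypothesis true) or 0.5 (null hypothesis false). The function names are made up for illustration.

```python
import random
from statistics import NormalDist, fmean


def one_sided_p_value(sample, null_mean, sd):
    """p value for a one-sided z test (HA: mean > null_mean) with known sd."""
    z_score = (fmean(sample) - null_mean) / (sd / len(sample) ** 0.5)
    return 1 - NormalDist().cdf(z_score)


def rejection_rate(true_mean, null_mean=0.0, sd=1.0, sample_size=20,
                   alpha=0.05, n_experiments=5000, seed=1):
    """Repeat the same experiment many times and return how often H0 is rejected."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    rejections = 0
    for _ in range(n_experiments):
        sample = [rng.gauss(true_mean, sd) for _ in range(sample_size)]
        if one_sided_p_value(sample, null_mean, sd) < alpha:
            rejections += 1
    return rejections / n_experiments


# H0 is true (true mean equals the null mean): every rejection is a type 1 error.
type_1_rate = rejection_rate(true_mean=0.0)
# H0 is false (true mean is 0.5): every failure to reject is a type 2 error.
type_2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"type 1 error rate when H0 is true:  {type_1_rate:.3f}")
print(f"type 2 error rate when H0 is false: {type_2_rate:.3f}")
```

In any single experiment we only see one p value and can't tell which kind of run we are in, which is why the conclusion should always carry the "but we may be making an error" caveat.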

Zoom out

We can still come to conclusions and move forward, but we should always keep in mind that weird things sometimes happen, and we therefore need to be aware of the possibility of type 1 or type 2 errors when we do our studies.

End screen

If you found this useful or interesting, click subscribe to stay connected for more videos or to find this channel again in the future.



This information is intended for the greater good; please use statistics responsibly.