The text below is a transcript of the video.

Connect with StatsExamples here




Slide 1.

The simplest way to test whether two variances are equal is to test their ratio using a variance ratio F test. Let's take a look at how exactly that works.

Slide 2.

Our basic question is whether two population variances differ from one another. We usually can't measure entire populations directly, so we take a random sample from each of the two populations and calculate their sample variances.
The sample variances are our estimates of the population variances, but sampling error makes them inexact. Even if we have good samples, the chance that their variances are exactly the same as the variances of the populations they came from is practically 0.
However, we can test whether our samples provide enough evidence to decide whether the population variances are probably different, or may well be the same.
We do this by asking the following question: if the population variances are the same, that's our null hypothesis, what are the chances of seeing sample variances that differ from one another as much as ours do?
As illustrated by the diagram, we take a sample from each population and calculate the sample variances to try to figure out if the population variances look the same or seem to differ.
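
As a rough illustration of the sampling error described above, the following Python sketch draws two samples from the same simulated normal population and computes their sample variances. The population mean and standard deviation, the sample size, and the seed are made-up values chosen for illustration, not figures from the video.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable (arbitrary choice)

POP_MEAN, POP_SD = 50.0, 10.0   # hypothetical population, variance = 100
SAMPLE_SIZE = 10                # hypothetical sample size

# Two independent random samples from the SAME population.
sample_1 = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
sample_2 = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]

# statistics.variance uses the n - 1 denominator (the sample variance).
var_1 = statistics.variance(sample_1)
var_2 = statistics.variance(sample_2)

print(f"population variance: {POP_SD ** 2:.1f}")
print(f"sample variance 1:   {var_1:.1f}")
print(f"sample variance 2:   {var_2:.1f}")
```

Even though both samples come from one population, the two sample variances will generally differ from each other and from the population variance.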

Slide 3.

To perform our test, we're going to use the F distribution.
On average, the variances of two samples from the same population, or a sample from each of two different populations with the same variance, should be equal.
As mentioned, sampling error causes them to differ even if the population variances are the same. We will measure that difference with a ratio of the sample variances and compare that to a probability distribution.
The probability distribution of the ratios of sample variances from populations with the same variance is the F distribution.
So what we will do is calculate 2 ratios, F values, using each sample variance divided by the other.
If the population variances are the same, we would expect these values to be close to one, but if the population variances are different one of our F values would be very small and the other very large.
Just like we do for other statistical tests, we compare the observed test statistic, that's our ratio, to the probability distribution for when the null hypothesis is true.
We do this to see if our data is what we would expect if the population variances were the same, or if it is something we would very rarely see if the population variances are the same.
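
The idea that variance ratios from equal-variance populations scatter around one can be checked with a small simulation. This is only a sketch: the number of simulated pairs, the sample size, and the seed are arbitrary assumptions, and the simulated cut-offs are estimates rather than exact F distribution values.

```python
import random
import statistics

random.seed(2)  # arbitrary seed for repeatability

N_PAIRS = 5000     # number of simulated pairs of samples (arbitrary)
SAMPLE_SIZE = 10   # both samples have the same size in this sketch


def sample_variance_ratios():
    """Draw two samples from equal-variance normal populations and
    return both ratios of their sample variances."""
    a = [random.gauss(0.0, 1.0) for _ in range(SAMPLE_SIZE)]
    b = [random.gauss(0.0, 1.0) for _ in range(SAMPLE_SIZE)]
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return var_a / var_b, var_b / var_a


ratios = [sample_variance_ratios() for _ in range(N_PAIRS)]
first_ratios = sorted(pair[0] for pair in ratios)

# Empirical summary of the ratios seen when the null hypothesis is true.
median_ratio = statistics.median(first_ratios)
low = first_ratios[int(0.025 * N_PAIRS)]    # about 2.5% of ratios fall below
high = first_ratios[int(0.975 * N_PAIRS)]   # about 2.5% of ratios fall above

print(f"median ratio: {median_ratio:.2f}")
print(f"middle 95% of simulated ratios: {low:.2f} to {high:.2f}")
```

The two ratios in each pair are reciprocals of one another, which is why a very small value of one always goes with a very large value of the other.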

Slide 4.

It turns out that there is a different F distribution for every combination of degrees of freedom. Degrees of freedom is a measure of our overall sample size. The two degrees of freedom values for our F test will be the sample size minus one for each sample we take.
You can see here a couple of different factors that influence the width of the F distribution. The width indicates how much deviation from one we would expect our ratio to have if the null hypothesis of equal population variances is true.
The two left-hand figures indicate the same overall sample size, as measured by degrees of freedom, but we can see that when the samples are unequal in size there is more variation in our F value due to sampling error.
This is because the smaller sample will have more sampling error and cause more variation in the F values we expect to see even when the population variances are the same.
The two right-hand figures show how the total sample size influences the expected F values. The relative degrees of freedom are the same for both, one sample has twice the degrees of freedom as the other, but the top figure has almost four times as much data.
We can see how the effects of sampling error are much more pronounced in the bottom figure where there is less data. For small samples, we expect to see a wide range of F values based on sample data even when the population variances are the same.
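
The effect of the degrees of freedom on the width of the F distribution can be sketched with SciPy, assuming it is installed. The degrees-of-freedom pairs below are hypothetical examples chosen to mirror the comparisons described, not necessarily the ones plotted in the video.

```python
from scipy.stats import f

# (numerator df, denominator df) pairs; hypothetical examples.
df_pairs = [(10, 10), (16, 4), (20, 10), (4, 2)]


def middle_95_range(dfn, dfd):
    """Return the F values that cut off 2.5% of the area in each tail."""
    return f.ppf(0.025, dfn, dfd), f.ppf(0.975, dfn, dfd)


ranges = {pair: middle_95_range(*pair) for pair in df_pairs}

for (dfn, dfd), (lower, upper) in ranges.items():
    print(f"df = ({dfn:2d}, {dfd:2d}): middle 95% from {lower:.3f} to {upper:.3f}")
```

In this sketch, (10, 10) and (16, 4) share the same total degrees of freedom but differ in balance, while (20, 10) and (4, 2) share the same two-to-one balance but differ in the amount of data.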

Slide 5.

Let's look in more detail at one example, the F distribution for 8 degrees of freedom in the numerator and 4 degrees of freedom in the denominator.
This figure shows the probability distribution and we can see that most of the F values will be close to 1.
► 95% of the time, the F value we would get from 2 samples from populations with the same variance will be in the middle 95% of this probability distribution.
► If we wanted to figure out critical F values, the threshold F values that are far enough away from one that they would be very unlikely, we can choose an alpha value for the area under the curve at each end.
If we chose an alpha value of 0.025, so that we would have 2.5% at each end and 95% in the middle, those F values would be 0.198 and 8.980.
In the same way that we can make critical value tables for our T distribution, or any other probability distribution used in a statistical test, we could make tables with these values at both ends.
► Then our task is just to see if the F calculated value is small enough or large enough to be in one of these alpha equals 0.025 regions.
► Or, for an overall alpha of 0.05, we use just the top alpha equals 0.025 region by using a test statistic where we always divide the larger variance by the smaller variance.
This calculation would invert all the very small F calculated values and essentially flip them over to the right side of this distribution.
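
The critical values quoted for this example can be reproduced with SciPy, assuming it is available and taking 8 as the numerator and 4 as the denominator degrees of freedom. The sketch also checks the "flip" described above: inverting a ratio swaps the degrees of freedom, so the lower critical value is the reciprocal of an upper critical value.

```python
from scipy.stats import f

DFN, DFD = 8, 4      # numerator and denominator degrees of freedom
ALPHA_TAIL = 0.025   # area in each tail

lower_crit = f.ppf(ALPHA_TAIL, DFN, DFD)       # 2.5% of the area lies below
upper_crit = f.ppf(1 - ALPHA_TAIL, DFN, DFD)   # 2.5% of the area lies above

# Swapping numerator and denominator turns a ratio F into 1 / F, so the
# lower critical value equals the reciprocal of the upper critical value
# of the F distribution with the degrees of freedom swapped.
flipped = 1 / f.ppf(1 - ALPHA_TAIL, DFD, DFN)

print(f"lower critical F: {lower_crit:.3f}")
print(f"upper critical F: {upper_crit:.3f}")
print(f"1 / upper critical F with df swapped: {flipped:.3f}")
```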

Slide 6.

OK, let's take a look at the formal procedure for doing a variance ratio F test.
First, as we always do for every statistical test, we create null and alternative hypotheses.
The null hypothesis is that the population variances are equal: the variance of population 1 equals the variance of population 2.
The alternative hypothesis is that the population variances are not equal: the variance of population 1 is not equal to the variance of population 2.
► The next step is to create our test statistic, the F calculated ratio.
For this we calculate our two sample variances and divide the larger variance by the smaller variance.
► Then we compare our F calculated value to various F critical values from the F distribution.
As mentioned, there is a different F distribution for each pair of degrees of freedom values, so the tables we use will typically show these.
In the F table shown, available on the StatsExamples website, the columns indicate degrees of freedom in the numerator and the rows indicate degrees of freedom in the denominator. The values within the table are the critical values that correspond to an alpha value of 0.025.
In other words, how far out on that X axis do we need to go so that 2.5% of the area under the F distribution curve is to the right?
This is equivalent to asking how far away from one we would need to go to be outside of the middle 95% of the F distribution if we calculated both ratios.
Keep in mind that, because the test is two-tailed, our actual alpha value is twice the alpha value of the table: a table for alpha equals 0.025 gives an overall alpha of 0.05.

► By looking at a variety of tables, each corresponding to a different alpha value, we can determine the smallest alpha value that we would be able to use and still reject the null hypothesis, that is, the smallest alpha for which our F calculated value is still larger than the F critical value.
That minimum alpha value, doubled for the two tails, is approximately our P-value: the probability of seeing an F calculated value as large as we do if the null hypothesis is true.
If we have access to a computer, the exact P-value can be calculated directly.
► We then decide whether to "reject the null hypothesis" or "fail to reject the null hypothesis" based on the P-value.
The null hypothesis, the variance of population one is equal to variance of population two, is consistent with most P-values except for very small ones.
The alternative hypothesis, that the variance of population one is not equal to the variance of population two, would give us small P-values.
Typically, when our P-value is less than 0.05 we reject the null hypothesis and conclude that the population variances are different.
When our P-value is not less than 0.05 we fail to reject the null hypothesis. This decision indicates that we lack the evidence to conclude the population variances are different, so we would generally assume that they are the same.
Keep in mind, as with all statistical tests, we are not proving the null or alternative hypothesis, we are making a decision about which one is likely based on the probability of seeing the test statistic we did. We should always keep in mind that we may be making a type one or type two error.
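
The whole procedure can be sketched in a few lines of Python, assuming SciPy is installed. The function name `variance_ratio_f_test` is a hypothetical helper written for this illustration, not a library function, and the two samples are made-up data.

```python
import statistics
from scipy.stats import f


def variance_ratio_f_test(sample_a, sample_b):
    """Two-tailed variance ratio F test (hypothetical helper).

    Returns (F calculated, numerator df, denominator df, two-tailed P-value).
    """
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    # Put the larger sample variance in the numerator and keep track of
    # which sample's degrees of freedom go with it.
    if var_a >= var_b:
        f_calc, dfn, dfd = var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    else:
        f_calc, dfn, dfd = var_b / var_a, len(sample_b) - 1, len(sample_a) - 1
    # Upper-tail area beyond F calculated, doubled because the test is
    # two-tailed, and capped at 1.
    p_value = min(1.0, 2 * f.sf(f_calc, dfn, dfd))
    return f_calc, dfn, dfd, p_value


# Made-up data for illustration.
sample_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_2 = [2.0, 4.0, 6.0, 8.0, 10.0]

f_calc, dfn, dfd, p_value = variance_ratio_f_test(sample_1, sample_2)
decision = "reject" if p_value < 0.05 else "fail to reject"

print(f"F = {f_calc:.3f} with df = ({dfn}, {dfd}), P = {p_value:.3f}")
print(f"Decision at alpha = 0.05: {decision} the null hypothesis")
```

Here the second sample has four times the variance of the first, yet with only five values per sample that is not enough evidence to reject the null hypothesis.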

Slide 7.

This is just a reminder that the variance ratio F test is a two-tailed test even though we are only looking at one tail of the probability distribution in our statistical tables.
That's because, instead of looking at both ends of the distribution, we rearranged the equation to force our F ratio into the higher end of the distribution.

Slide 8.

Sometimes the F test is one-tailed, for example in the ANOVA technique.
This is when our question is whether the variance of our focal population is larger than the variance of the other population.
In this case we would always put the sample variance for the focal population in the numerator and the sample variance for the other population in the denominator, regardless of which is bigger.
Then we compare the F calculated value to the right end of the F distribution, but we would use the table for alpha equals 0.05 directly for an overall alpha of 0.05.
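
A one-tailed version might look like the following sketch, again assuming SciPy is installed. The helper name `one_tailed_f_test` and the data are hypothetical; the key differences from the two-tailed test are that the focal sample always goes in the numerator and the tail area is not doubled.

```python
import statistics
from scipy.stats import f


def one_tailed_f_test(focal_sample, other_sample):
    """One-tailed F test of whether the focal population variance is larger
    (hypothetical helper). Returns (F calculated, one-tailed P-value)."""
    # Focal sample in the numerator, regardless of which variance is bigger.
    f_calc = statistics.variance(focal_sample) / statistics.variance(other_sample)
    dfn, dfd = len(focal_sample) - 1, len(other_sample) - 1
    # Area in the right tail only; no doubling for a one-tailed test.
    p_value = f.sf(f_calc, dfn, dfd)
    return f_calc, p_value


# Made-up data for illustration.
wide = [2.0, 4.0, 6.0, 8.0, 10.0]
narrow = [1.0, 2.0, 3.0, 4.0, 5.0]

f_wide_focal, p_wide_focal = one_tailed_f_test(wide, narrow)
f_narrow_focal, p_narrow_focal = one_tailed_f_test(narrow, wide)

print(f"focal = wide:   F = {f_wide_focal:.2f}, P = {p_wide_focal:.3f}")
print(f"focal = narrow: F = {f_narrow_focal:.2f}, P = {p_narrow_focal:.3f}")
```

When the focal sample has the smaller variance, the F calculated value is below one and the P-value is large, as we would expect for a question that only asks whether the focal variance is larger.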

Slide 9.

Keep in mind that the F test only tests the variances, it doesn't compare the means or shapes of the distributions.
The two left-hand figures would both give the exact same large P-value in an F test because, even though the means may or may not be very different, the variances are very similar and that's what's being tested.
The two right-hand figures would both give the exact same small P-value in an F test because, even though the means may be the same or very different, the variances being different is what matters.
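
A quick way to see that the means do not matter is to shift one sample by a constant and recompute the ratio. The data and the size of the shift in this sketch are made up for illustration.

```python
import statistics

# Made-up samples for illustration.
sample_1 = [4.1, 5.3, 6.0, 6.8, 7.9]
sample_2 = [3.0, 5.5, 6.2, 7.1, 9.4]

# Shift every value in sample 2 by a constant: the mean changes a lot,
# but the spread around the mean does not.
shifted_2 = [x + 100.0 for x in sample_2]

f_original = statistics.variance(sample_2) / statistics.variance(sample_1)
f_shifted = statistics.variance(shifted_2) / statistics.variance(sample_1)

print(f"mean of sample 2: {statistics.mean(sample_2):.2f}, "
      f"after shift: {statistics.mean(shifted_2):.2f}")
print(f"F before shift: {f_original:.4f}")
print(f"F after shift:  {f_shifted:.4f}")
```

Because the F calculated value is unchanged, apart from tiny floating-point rounding, the P-value of the test is unchanged as well.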

Slide 10.

Two last points.
First, a word of caution about the F test.
The F test requires normal population distributions. If the populations are not normally distributed, the test can easily give type one or type two errors.
For this reason, even though the F test is OK as a first check, if we really want to test for equality of variances, we should do a more complicated variance comparison test. Nevertheless, F tests are still widely used to compare variances.
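
The sensitivity to non-normal populations can be illustrated with a simulation, assuming SciPy is installed. In this sketch the null hypothesis is true in both scenarios, since every population has a variance of 1, so every rejection is a type one error. The exponential distribution, sample size, number of simulations, and seed are arbitrary choices for illustration.

```python
import random
import statistics
from scipy.stats import f

random.seed(3)  # arbitrary seed for repeatability

N_SIMS = 2000      # simulated pairs of samples per scenario (arbitrary)
SAMPLE_SIZE = 20   # size of each sample (arbitrary)
ALPHA = 0.05       # overall alpha for the two-tailed test

# Upper critical value for the larger-over-smaller variance ratio.
F_CRIT = f.ppf(1 - ALPHA / 2, SAMPLE_SIZE - 1, SAMPLE_SIZE - 1)


def rejection_rate(draw):
    """Fraction of simulated tests that reject equal variances when both
    samples come from the same distribution (draw returns one value)."""
    rejections = 0
    for _ in range(N_SIMS):
        var_a = statistics.variance([draw() for _ in range(SAMPLE_SIZE)])
        var_b = statistics.variance([draw() for _ in range(SAMPLE_SIZE)])
        f_calc = max(var_a, var_b) / min(var_a, var_b)
        if f_calc > F_CRIT:
            rejections += 1
    return rejections / N_SIMS


# Both distributions have variance 1, so the null hypothesis is true.
normal_rate = rejection_rate(lambda: random.gauss(0.0, 1.0))
skewed_rate = rejection_rate(lambda: random.expovariate(1.0))

print(f"type one error rate, normal populations:      {normal_rate:.3f}")
print(f"type one error rate, exponential populations: {skewed_rate:.3f}")
```

With normal populations the rejection rate should sit near the nominal 0.05, while with the skewed exponential populations it comes out considerably higher.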
Second, why do we tend to do F tests anyway?
The two-tailed F test, or a better variance test, is a pre-test for homoscedastic t-tests or other two-sample tests that require equality of variances.
The one-tailed F test is the basis of the ANOVA technique, which is itself used in regression and correlation analysis. Luckily, in the case of ANOVAs, due to the central limit theorem, as long as you have reasonable sample sizes the normality problem is not as severe.

Zoom out.

I hope you found this introduction to the F test useful.
This is a fairly straightforward test that, despite its weakness if the population values are not normally distributed, is quite commonly used in the scientific and statistical literature.

End screen.

Click to like or subscribe if you found this video useful.

This information is intended for the greater good; please use statistics responsibly.