ONE SAMPLE T-TEST (INTRODUCTION)
Watch this space: a detailed text explanation is coming soon. In the meantime, enjoy our video. The text below is a transcript of the video.
Connect with StatsExamples here
LINK TO SUMMARY SLIDE FOR VIDEO:
TRANSCRIPT OF VIDEO:
The one-sample t-test is used to figure out whether the mean of a population is what we think it is. Let's take a look at how it works and why.
We use the one-sample t-test when we are interested in testing a population mean.
The basic scenario is that we want to know whether the population mean is a certain value; let's call it \(\mu_0\).
We can't measure the entire population, because that's impractical, so we take a random sample from it instead. We then calculate the sample mean, which we can use as an estimate of the population mean, but sampling error makes it inaccurate.
The question the one-sample t-test allows us to answer is: what is the probability that the population mean is \(\mu_0\), based on the sample mean and observed variation?
Our approach will be to use a confidence interval to test a hypothesized value for the mean of the population.
If you don't remember what confidence intervals are, you can watch our confidence interval video, which covers calculating confidence intervals and what they represent.
We compare the confidence interval we get from our sample to the hypothesized population mean \(\mu_0\).
We're showing a 95% confidence interval for now, because that's the most common confidence interval used, but it's not the only one possible, as we'll see in a bit.
If \(\mu_0\) is inside the confidence interval, then there is a lack of evidence that the population mean is different from \(\mu_0\). That's the sort of result we would expect to see with reasonable probability if the population mean were equal to \(\mu_0\).
On the other hand, if \(\mu_0\) is outside of the confidence interval we calculate, that provides evidence that the population mean is different from \(\mu_0\). That's the sort of result we would rarely expect to see; the probability of the confidence interval not including \(\mu_0\), if that is the population mean, is very low.
As mentioned, the t-test is a comparison of the confidence interval to the hypothesized population mean \(\mu_0\). In practice this isn't exactly how we do a t-test; the test does it slightly indirectly.
We could calculate the confidence interval and see if it includes \(\mu_0\).
Instead, we calculate the width of half of the confidence interval and compare it to the distance between the sample mean and \(\mu_0\).
If that distance is larger than half the confidence interval width, then \(\mu_0\) lies outside of the confidence interval.
Comparing the distance between the sample mean and \(\mu_0\) to half the confidence interval width is therefore equivalent to seeing whether \(\mu_0\) would be inside the confidence interval.
As mentioned, however, we generally do this comparison indirectly.
Looking at the equations to the right, we can see that we are interested in when the distance is larger than half the confidence interval.
Calculating the distance is easy; it's just the sample mean minus \(\mu_0\).
Next up is the size of half of the confidence interval. That will be the value from our t distribution, corresponding to the alpha we desire and the degrees of freedom in our sample, multiplied by the standard error.
Again, if you don't remember how to do this, I recommend checking out the confidence interval video on this channel.
Then we compare these two values. We're interested in whether the sample mean minus \(\mu_0\) is larger than our t value times the standard error.
We can rearrange this equation slightly. Now the question is whether the fraction on the left, the sample mean minus \(\mu_0\) divided by the standard error, is larger than the t value corresponding to our alpha value and degrees of freedom.
We generally call the fraction on the left our t-calculated value, and we compare it to a t-critical value.
I've written this all out as if we're looking to see whether the t-calculated value is larger than the t-critical value. The other side of the confidence interval would be tested by seeing whether the t-calculated value is less than the negative of the t-critical value.
Diagramming it out, the t-test is an indirect comparison of the confidence interval to the distance between the sample mean and \(\mu_0\).
We get our t-calculated value from the sample mean minus \(\mu_0\), divided by the standard error.
Then we compare that value to a t-critical value corresponding to the alpha value for our confidence interval and the degrees of freedom for our sample.
If the t-calculated value is larger in magnitude than the t-critical value, then \(\mu_0\) is not within the confidence interval.
If the t-calculated value is smaller in magnitude than the t-critical value, then \(\mu_0\) is within the confidence interval.
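To make that equivalence concrete, here is a short Python sketch. The sample values and the hypothesized mean of 50 are made up for illustration; 2.131 is the tabled critical value for 15 degrees of freedom at 95% confidence. It checks \(\mu_0\) against the confidence interval directly, and also via the t-calculated value, and the two checks always agree.

```python
import math
import statistics

# Hypothetical sample of 16 values; we test whether the population mean is 50.
sample = [52, 48, 55, 51, 49, 53, 54, 50, 56, 47, 52, 53, 51, 55, 49, 54]
mu0 = 50
t_crit = 2.131                      # tabled value for df = 15, 95% confidence

mean = statistics.fmean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))   # standard error

# Direct check: does the 95% confidence interval exclude mu0?
half_width = t_crit * se
outside_ci = abs(mean - mu0) > half_width

# Indirect (t-test) check: is the t-calculated value larger in magnitude
# than the t-critical value?
t_calc = (mean - mu0) / se
reject = abs(t_calc) > t_crit

print(outside_ci == reject)   # always True; the two checks are equivalent
```

For this made-up sample the t-calculated value comes out larger than 2.131, so both checks say \(\mu_0 = 50\) falls outside the 95% confidence interval.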
This figure illustrates the scenario in a slightly different way. If we're doing a t-test using 16 values, we will use 15 degrees of freedom when we look at our t distribution.
Each of the columns in our table corresponds to a different confidence interval. In this case, to do the test with a 95% confidence interval we would go to our table and look for alpha equals 0.025, so that we have 2.5% on each side outside of the confidence interval.
Then our critical values become 2.131 and negative 2.131.
If our t-calculated value is larger than positive 2.131 or less than negative 2.131, then \(\mu_0\) would be outside of that confidence interval. When that happens, it's very unlikely that the sample comes from a population that has a mean of \(\mu_0\).
On the other hand, if our t-calculated value is between negative 2.131 and positive 2.131, that's exactly what we would expect to happen most of the time if the population mean really is \(\mu_0\).
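You can even check that tabled 2.131 yourself. As a from-scratch sketch (a statistics package would do this internally), numerically integrating the t density beyond 2.131 should give about 2.5% in the upper tail, and by symmetry another 2.5% in the lower tail:

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_area(t, df, upper=60.0, steps=20_000):
    """Area under the t density to the right of t, by Simpson's rule.
    The tail beyond `upper` is negligible for moderate df."""
    h = (upper - t) / steps
    total = t_pdf(t, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(t + i * h, df)
    return total * h / 3

tail = upper_tail_area(2.131, 15)
print(round(tail, 3))   # ≈ 0.025, i.e. 2.5% in each tail
```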
OK, here is the formal procedure for what's called a two-tailed t-test. The two tails refer to the fact that we're testing on both sides of our confidence interval.
First, since this is a statistical test, we will create a null hypothesis and an alternative hypothesis.
The null hypothesis will be that the population mean is equal to \(\mu_0\). Think of this as our baseline default assumption that we will tend to accept as probably true unless we reject it.
The alternative hypothesis will be that the population mean is not equal to \(\mu_0\). Think of this as the result we would get if we decided that the null hypothesis was not true and we rejected it.
This step of specifying a null and an alternative hypothesis is the first step in every statistical test.
Next, in order to figure out which of our two hypotheses has more support, we will calculate our t-calculated value using the equation shown.
Notice that the variance and sample size for our sample are here; they are what's used to figure out the standard error.
Our next step is to compare our t-calculated value to various t-critical values, which correspond to the widths of those confidence intervals.
When we look at a table of t values, each of the columns corresponds to a different alpha value representing the area outside that central confidence interval.
We usually try to identify the smallest alpha value, corresponding to the confidence interval with the highest degree of confidence, for which our calculated value is still larger than the critical value.
This tells us how low the probability is that sampling error would result in the sample mean and standard deviation that we see, if our null hypothesis were true.
This is the probability of seeing a t-calculated value as extreme as we do, if the null hypothesis is true, which is called the P value.
Finally, we decide to reject the null hypothesis or fail to reject the null hypothesis based on the P value we obtained.
If the P value is very small, we will usually reject the null hypothesis; if it is not very small, we will usually fail to reject the null hypothesis.
Remember that the null hypothesis was that the population mean is equal to \(\mu_0\), which is consistent with non-small P values. That's because the t-calculated value we got is the sort of value we would expect to see if the null hypothesis is true.
It's the alternative hypothesis, that the population mean is not equal to \(\mu_0\), which is what would give us small P values. That's because the t-calculated value we got is not what we would expect to see if the null hypothesis is true.
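The steps above can be sketched in a few lines of Python. The data here are hypothetical, and the critical values are one row of a t table for 15 degrees of freedom, with columns labeled by their one-tailed alpha values:

```python
import math
import statistics

# H0: the population mean equals mu0.  HA: it does not.
sample = [52, 48, 55, 51, 49, 53, 54, 50, 56, 47, 52, 53, 51, 55, 49, 54]
mu0 = 50

n = len(sample)
se = statistics.stdev(sample) / math.sqrt(n)          # standard error
t_calc = (statistics.fmean(sample) - mu0) / se        # t-calculated value

# t-table row for df = 15: one-tailed alpha -> critical value.
crit_row = {0.05: 1.753, 0.025: 2.131, 0.01: 2.602, 0.005: 2.947}

# Smallest two-tailed alpha (double the column alpha) at which the
# calculated value still exceeds the critical value; the P value lies
# just below this number.
passed = [2 * a for a, c in crit_row.items() if abs(t_calc) > c]
smallest_alpha = min(passed) if passed else None

reject_h0 = smallest_alpha is not None and smallest_alpha <= 0.05
```

For this made-up sample, the t-calculated value lands between the 0.01 and 0.005 columns, so the two-tailed P value is bracketed between 0.01 and 0.02 and we reject the null hypothesis at the usual 0.05 level.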
This somewhat awkward approach of deciding whether to reject the null hypothesis based on probabilities is the standard way that most statistical tests are done.
There are a couple of things to keep in mind.
First, we are not proving anything; we are making a decision about whether we think the null hypothesis or the alternative hypothesis is true based on probability.
Second, if we don't reject the null hypothesis, that is not the same thing as providing lots of support for it. What it means is that we looked for evidence against the null hypothesis and didn't find convincing evidence.
For this reason, statistical purists will always say that large P values cause you to "fail to reject a null hypothesis", never "accept a null hypothesis". Nevertheless, people use the phrase "accept the null hypothesis" all the time, but they shouldn't.
Let's look a little bit more at what a P value represents.
P values are always in the context of a null hypothesis, an alternative hypothesis, and some sort of calculation we have performed. In the case of the one-sample t-test, it's the probability of seeing a t-calculated value as extreme as we do if the null hypothesis is true.
Technically, the P value is the smallest alpha value you could choose and still reject the null hypothesis with your data.
Conceptually, the P value is the probability of seeing the sample data you do if the null hypothesis is correct. That's why, when the P value is very small, you would think seriously about rejecting your null hypothesis.
Let's look at this again because it bears repeating. When learning statistics, one of the biggest sources of confusion is what a P value represents.
The conceptual definition of a P value is the probability of seeing the sample data you do if the null hypothesis is correct.
In the scenario in this video, this is equivalent to: the P value is the probability of obtaining a t-calculated statistic as extreme as (or more extreme than) the one you did if the null hypothesis is correct.
Third time's the charm.
The P value of a test is the probability that the value you see could arise due to sampling error if the null hypothesis is true.
If the P value is small, usually less than 0.05, we reject the null hypothesis.
If the P value is not small, larger than 0.05, we fail to reject the null hypothesis.
What's written here applies to almost every statistical test and is the most useful concept in all of statistics.
OK, so we've thought about the concepts; what is the practical procedure for doing a one-sample t-test?
First, we create a null hypothesis and an alternative hypothesis.
For this test, the null hypothesis is that the population mean is equal to \(\mu_0\), and the alternative hypothesis is that it is not equal to \(\mu_0\).
Then we calculate our t-calculated value using the equation shown and compare it to various t-critical values.
Then we determine the P value.
For example, if we got a t-calculated value of 2.8 with 15 degrees of freedom, what would our P value be?
If we have access to a table of t-critical values, like the ones on the StatsExamples website, we would look in the row for 15 degrees of freedom and look across the columns to determine which critical values bracket 2.8.
In this case, 2.8 is larger than the critical value corresponding to an alpha value of 0.01 but less than the critical value for an alpha value of 0.005. Keeping in mind that we have to double these alpha values because we are looking at both sides of the confidence interval, this tells us that our P value is less than 0.02 but larger than 0.01.
If we were using a computer, it could provide us with the exact probability of getting a t-calculated value as large as 2.8 or as small as negative 2.8. That probability is 0.013, which we can see is larger than 0.01 and smaller than 0.02.
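A computer gets that 0.013 from the t distribution itself. As a from-scratch illustration (in practice you would use a statistics package rather than integrate by hand), the two-tailed P value is one minus the area between negative 2.8 and positive 2.8 under the t density:

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_calc, df, steps=20_000):
    """Two-tailed P value: one minus the area between -|t| and +|t|,
    with the central area found by Simpson's rule and symmetry."""
    b = abs(t_calc)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * (total * h / 3)        # area from -|t| to +|t|
    return 1 - central

p = two_tailed_p(2.8, 15)
print(round(p, 3))   # ≈ 0.013, between the bracketing values 0.01 and 0.02
```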
In this example, because we have a small P value, we would reject the null hypothesis.
If we think about the null hypothesis, the population mean being equal to \(\mu_0\) is not consistent with a P value of 0.013, because that is less than 5%, the usual threshold for how unlikely things have to be for us to decide to reject the null hypothesis.
If we think about the alternative hypothesis, the population mean not being equal to \(\mu_0\) is consistent with a P value of 0.013, because it's exactly the sort of thing that would result in a large t-calculated value.
In the example we just looked at, I used a probability of 5% as the threshold for making a decision about the null and alternative hypotheses.
The use of P equals 0.05, that is 5%, as a threshold for deciding to reject the null hypothesis is arbitrary, but it is the standard within statistics.
In fact, there is a specific technical term to indicate when this occurs.
We use the phrase statistically significant when a statistical test has returned a P value less than the threshold and the null hypothesis has been rejected. As mentioned, this threshold is almost always 0.05.
If our result from some test is that a sample mean of 18 is significantly different from a \(\mu_0\) of 20, we would reject the null hypothesis that the population mean is 20. We would conclude or decide, not prove, that the population mean is some other value.
If our result from some test is that a sample mean of 18 is NOT significantly different from a \(\mu_0\) of 20, we would fail to reject the null hypothesis that the population mean is 20. We would lack the evidence to conclude or decide that the population mean is some value other than 20. It's not that we have strong evidence that it is 20, but that we looked for evidence that it wasn't and didn't find any.
One last point about the one-sample t-test.
The one-sample t-test can also be one-tailed instead of two-tailed, as we've been looking at.
For example, the null hypothesis could be that the population mean is less than or equal to \(\mu_0\), and the alternative hypothesis would be that the population mean is larger than \(\mu_0\).
In this situation, when we calculate our t-calculated value, we would only be interested in the positive values and whether they are larger than the critical value corresponding to an alpha of 0.05.
Alternately, the null hypothesis could be that the population mean is larger than or equal to \(\mu_0\), and the alternative hypothesis would be that the population mean is less than \(\mu_0\).
In this situation, when we calculate our t-calculated value, we would only be interested in the negative values and whether they are less than the negative of the critical value corresponding to an alpha of 0.05.
We have to be careful when doing one-tailed tests because the critical values are not as large, so we are able to reject our null hypothesis more easily.
We should only do a one-tailed test under two conditions.
First, in circumstances in which we only care about one direction. There are some situations when we only care about whether a population mean is larger or smaller than some particular value, rather than just different from it.
Second, we should usually only do a one-tailed test when we have an a priori reason to test in only one direction. In other words, we have outside information that leads us to test in only one direction.
We cannot look at our data first and then choose one direction or the other to test, because that's essentially doing a two-tailed test while using the t-critical values for a one-tailed test, which would lead to increased type one errors. Those are the ones where we reject a true null hypothesis.
Check out our video about type one and type two errors if you want to know more about that terminology.
In general, my advice is that unless we really know what we're doing, we should always do two-tailed tests to make sure we don't reject null hypotheses when we shouldn't.
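When a one-tailed test really is justified in advance, the only changes are the direction of the hypotheses and the critical value. A sketch with hypothetical data (1.753 is the tabled one-tailed critical value for 15 degrees of freedom at alpha of 0.05, versus 2.131 for the two-tailed test):

```python
import math
import statistics

# H0: population mean <= mu0.  HA: population mean > mu0.
# Only use this form with an a priori reason to test in one direction.
sample = [52, 48, 55, 51, 49, 53, 54, 50, 56, 47, 52, 53, 51, 55, 49, 54]
mu0 = 50

se = statistics.stdev(sample) / math.sqrt(len(sample))
t_calc = (statistics.fmean(sample) - mu0) / se

t_crit_one = 1.753   # df = 15, one-tailed alpha = 0.05 (two-tailed uses 2.131)

# Only large positive t-calculated values count against this H0;
# a very negative value, however extreme, would not reject it.
reject = t_calc > t_crit_one
```

Notice that 1.753 is smaller than 2.131, which is exactly why an unjustified one-tailed test rejects too easily.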
The one-sample t-test is the preferred method for testing a hypothesized population mean. In the real world, however, this test is rarely done, because we are usually more interested in comparing two groups to each other than comparing one group to some hypothetical value.
Those tests are two-sample t-tests and are widely used, but to understand that method it really helps to understand the one-sample t-test first.
Check out the StatsExamples website for more examples of statistical tests and links to other videos.
This information is intended for the greater good; please use statistics responsibly.