DEFINITIONS OF STATISTICAL TERMS

A priori. This is a Latin phrase that translates as "from the earlier" and it is used to refer to something that a person knows based on information separate from the data being considered. This comes into play when deciding what test we should employ such as whether a one or two-tailed test can be justified. For example, we may perform a one-tailed procedure to test whether a new disease raises the temperature of the afflicted rather than just changes it because we know from how other diseases work that they raise temperature and never lower it. It would be incorrect to look at the elevated mean temperature of our sample and then test in only one direction without this a priori knowledge.

Accuracy. This is a measurement of how close to the true value a measured value is, when the deviation arises due to systematic error or bias. This is contrasted with "precision" which is when the deviation arises due to sampling error or noise. For example, a thermometer that reads the temperature as consistently 2 degrees higher than the true temperature has poor accuracy even though it may have good precision (i.e., always reads exactly two degrees higher, without much variation).

ANOVA (analysis of variance). This technique compares the variance of the means of various groups to the variance of the values within them to determine if the means differ by more than would be expected from sampling error. In other words, are the averages for the groups more different than expected just due to random chance? There are two main types of ANOVA, one-factor and two-factor, and they answer slightly different questions. For the one-factor ANOVA, the question is whether any of the means of the groups are different (e.g., does the mean of the first group differ from that of the second or third?). For the two-factor ANOVA, the question is whether there is a non-random relationship between labels that describe groups (e.g., low, medium, high) and the means of groups that are defined by those labels (i.e., do the means of the groups labeled "low" differ from those labeled "medium"?) in a case where there are two types of labels (e.g., low, medium, high versus hot or cold). The two-factor ANOVA also looks for any non-random interactions between the values of the labels. See the following definitions of those two ANOVA methods for more detail.

ANOVA, one-factor. This is a statistical technique to compare a set of different samples with the goal of determining whether the samples are drawn from populations with means that differ. Basically, this is a test to see if different groups have different means. The method works by comparing the differences between the means of the groups (estimated via the variance of the group means, MSA) to the differences between the values within the groups (estimated via the combined variances of the values within all the groups, MSW). If the means differ by more than sampling error alone would cause (MSA>MSW), then we conclude that the samples are from populations with genuinely different means. If the means don't differ by more than sampling error alone would cause (MSA is not larger than MSW), then there is no evidence that the samples are from populations with different means.
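The MSA versus MSW comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the usual textbook mean squares (sums of squares divided by their degrees of freedom), the function name one_factor_anova is hypothetical, and the data are made up. Turning the resulting ratio into a p value would additionally require the F distribution, which is not shown.

```python
from statistics import mean

def one_factor_anova(groups):
    """Return (MSA, MSW, F ratio) for a list of groups of numeric values."""
    k = len(groups)                            # number of groups
    n_total = sum(len(g) for g in groups)      # total number of values
    grand_mean = mean([v for g in groups for v in g])

    # Among-group sum of squares: how far each group mean is from the grand mean.
    ss_among = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: how far each value is from its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    msa = ss_among / (k - 1)
    msw = ss_within / (n_total - k)
    return msa, msw, msa / msw

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]     # made-up data
msa, msw, f_ratio = one_factor_anova(groups)
print(msa, msw, f_ratio)
```

For these made-up groups MSA is about 13 times MSW, the kind of ratio that suggests the group means differ by more than sampling error.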

ANOVA, one-way. See "ANOVA, one-factor."

Arithmetic mean. See "Mean, arithmetic."

Attribute variable. See "Variable, attribute."

Average. The average is a term for the typical or expected value in a data set. In practice the "average" is usually the arithmetic mean, but the term is also often used when the median is being described. The "average" is therefore an imprecise term and should ideally be avoided in technical or professional work. The ambiguity when describing an "average" can be used to mislead an audience.
For example, mean incomes are often much larger than typical incomes due to outlier values (i.e., billionaires), so reporting the "average" with either the mean or the median allows the incomes to be portrayed as either of two very different values depending on the agenda of the person talking.
For this reason, when considering an "average", care must be taken with regard to exactly which type of location statistic is being described.
More detail is available on the topic page for summary statistics.

Bartlett's test. This is a statistical test used to determine whether the variances of more than two populations are equal or not. It uses the sample sizes and variances of samples taken from the populations to make this determination. It is sensitive to departures from normality, however, so it may give incorrect results when used with data that are not normally distributed. Alternative variance tests for more than two populations include the F_max test, Levene's test, and the Brown-Forsythe test - the latter two being resistant to problems caused by non-normal distributions.

Bias. This is variation or a pattern in the data arising from non-random and unknown sources. Statistical techniques are not designed to handle bias. The mathematical approach used in statistics tends to compare observations to the expectations due to random factors and when they differ we usually attribute the pattern to what we know is different. Bias can lead us to conclude that a process we are testing for is happening, even when it isn't, if the bias creates the pattern that matches our expectation. Bias can therefore cause Type I errors.
For example, if we are comparing the sizes of frogs in two ponds that differ in algae content and the algae influences size, we hope our data will allow us to attribute size differences to differences in the algae content. However, if the two frog populations differ genetically from one another in a consistent and relevant manner (i.e., each pond has unique alleles that influence size), this bias can create the pattern we see, but we would interpret it incorrectly.
The key thing to remember is that this is a non-random factor, not noise or unusual individual events. Compare this description to the one for "noise" on this page.

Binomial distribution. See "Distribution, binomial."

Binomial probability distribution. See "Probability distribution, binomial."

Bivariate variables. See "Variable, bivariate."

Bonferroni correction. This is a correction to the α value used when performing multiple statistical tests. When conducting many tests, each test has a certain probability of type I error, so the overall probability of making one or more type I errors grows with the number of tests. A common and simple way to deal with this is to divide the overall α value by the number of tests being done and use this new smaller α* value for each test to limit the overall risk of type I error. This does have the negative effect of obscuring genuine, but barely significant, results, however, and this should be kept in mind.
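As a rough illustration in Python (the function names are hypothetical, and the overall error calculation assumes the tests are independent of one another), the correction and its effect on the overall type I error risk look like this:

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test threshold (alpha*) under the Bonferroni correction."""
    return alpha / n_tests

def familywise_error(alpha_per_test, n_tests):
    """Chance of at least one type I error, assuming independent tests."""
    return 1 - (1 - alpha_per_test) ** n_tests

alpha, n_tests = 0.05, 10
alpha_star = bonferroni_alpha(alpha, n_tests)

print(alpha_star)                              # 0.005 per test
print(familywise_error(alpha, n_tests))        # uncorrected: about 0.40
print(familywise_error(alpha_star, n_tests))   # corrected: just under 0.05
```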

Calculator formula for variance. See "Variance, calculator formula."

Categorical variable. See "Variable, categorical."

CD. See "Coefficient of dispersion (CD)."

Central limit theorem. This mathematical result states that if many samples of a sufficiently large size are taken from a population, the distribution of their means will be approximately normal, no matter what the distribution of the data in the population is (provided the population has a finite variance). This result allows the properties of the normal distribution to be used when calculating standard deviations of these means. The standard deviation of the means is then used to calculate standard errors and confidence intervals and perform statistical tests comparing means to one another.
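A small simulation can make this concrete. The sketch below is only an illustration: the exponential population (mean 1, standard deviation 1), the sample size of 50, the 2000 repeats, and the function name sample_means are all arbitrary choices made for this example.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

def sample_means(n_per_sample, n_samples):
    """Means of repeated samples from a right-skewed (exponential) population."""
    return [
        mean([random.expovariate(1.0) for _ in range(n_per_sample)])
        for _ in range(n_samples)
    ]

means = sample_means(50, 2000)
print(mean(means))    # close to the population mean of 1
print(stdev(means))   # close to 1 / sqrt(50), roughly 0.14
```

Even though the individual values are strongly right skewed, a histogram of these sample means would look roughly bell shaped, and their standard deviation is close to the population standard deviation divided by the square root of the sample size.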

Chi-squared distribution. See "Distribution, chi-squared."

CI. See "Confidence interval (CI)."

Coefficient of determination (R²). This value is used in regression and correlation analyses to describe the consistency of the linear relationship between two variables, X and Y. Technically, it is the proportion of variance in the Y values that is explained by the variance in X values. Conceptually, it shows how much the value of X determines the value of Y. It can range from 0 (no relationship between X and Y) to 1 (a perfect linear relationship between X and Y with no noise at all). Since this is based on a proportion of two variances it is never negative, regardless of the sign of the slope of the relationship. See "correlation coefficient" for a related statistic.

Coefficient of variation (CV). This is a commonly used "variation" statistic for a set of values. The CV is the standard deviation divided by the mean, usually multiplied by 100 to generate a percentage value. For example, if a data set has a standard deviation of 20 and a mean of 60, the coefficient of variation is equal to "33.33%". This puts the raw standard deviation into a context by comparing it to the mean. It's useful for comparing the variations in groups with different means because the variation of data values often scales with the overall magnitude even as the relative variation stays the same. For example, groups with larger values overall, but less relative variation (e.g., all within a more narrow percentage range), may vary more in raw terms (i.e., standard deviation) - the CV takes this into account.
More detail is available on the topic page for summary statistics.
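A short Python sketch of the CV calculation follows. The function name and the two small data sets are made up for illustration; the first call reproduces the standard deviation of 20 and mean of 60 example.

```python
from statistics import mean, stdev

def coefficient_of_variation(sd, mean_value):
    """Standard deviation as a percentage of the mean."""
    return sd / mean_value * 100

print(round(coefficient_of_variation(20, 60), 2))   # 33.33

# Two made-up groups whose raw spread differs tenfold
# but whose relative spread is the same.
small = [9, 10, 11]
large = [90, 100, 110]
cv_small = coefficient_of_variation(stdev(small), mean(small))
cv_large = coefficient_of_variation(stdev(large), mean(large))
print(stdev(small), stdev(large))   # 1.0 and 10.0
print(cv_small, cv_large)           # both 10 percent
```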

Continuous variable. See "Variable, continuous."

Correlation coefficient (r). This value is used in regression and correlation analyses to describe the consistency of the linear relationship between two variables, X and Y. Technically, it is the cross product of the X and Y values (relative to their means) divided by the square root of the product of the sums of squares of the X and Y values. Conceptually, it shows how the X and Y values are associated with one another. It can range from -1 (a perfect inverse linear relationship between X and Y with no noise at all) to 0 (no relationship between X and Y) to 1 (a perfect linear relationship between X and Y with no noise at all). The sign of r matches the sign of the slope of the data and the closer the absolute magnitude is to 1 the less noise there is in the data. See "coefficient of determination" for a related statistic.
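The calculation described above can be sketched in Python as follows. The function name correlation and the data values are made up for illustration; squaring the result gives the coefficient of determination for a simple linear relationship.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation coefficient r from sums of squares and cross products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)

xs = [1, 2, 3, 4, 5]
r_noisy = correlation(xs, [2, 4, 5, 4, 5])       # positive trend with noise
r_inverse = correlation(xs, [10, 8, 6, 4, 2])    # perfect inverse line

print(round(r_noisy, 4), round(r_noisy ** 2, 4))   # 0.7746 0.6
print(r_inverse)                                   # -1.0
```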

CV. See "Coefficient of variation (CV)."

Data transformation. This describes when all the values in a data set are changed according to a mathematical function in some way to generate a new set of values. For example, applying the data transformation of squaring to the data set {3, 2, 4, 6, 1} creates the new data set {9, 4, 16, 36, 1}. There are two common reasons to do this. First, to modify the range of values so that they are easier to plot in a figure. Second, to create a data set that fulfills the assumptions for a desired statistical test. For example, taking the square root of a data set that is right skewed and has a large range will generate a less skewed data set with a much smaller range of values, making it easier to plot and more likely to be considered normally distributed for the purposes of a t test or ANOVA analysis. As another example, taking the natural logs of the Y values in an XY plot that shows a curved exponential relationship will make the relationship linear and thereby appropriate for a linear regression analysis. Common data transformations include square roots, logarithms, reciprocals, and arcsines.
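Both examples can be reproduced with a few lines of Python. The exponential curve used here, Y = 2e^(0.5X), is an arbitrary choice for illustration.

```python
import math

# Squaring transformation from the example above.
squared = [v ** 2 for v in [3, 2, 4, 6, 1]]
print(squared)   # [9, 4, 16, 36, 1]

# Log transformation of a made-up exponential relationship.
xs = [0, 1, 2, 3, 4]
ys = [2 * math.exp(0.5 * x) for x in xs]
log_ys = [math.log(y) for y in ys]

# After the transformation, Y rises by the same amount for each unit of X,
# which is what a straight line looks like.
steps = [b - a for a, b in zip(log_ys, log_ys[1:])]
print([round(s, 6) for s in steps])   # [0.5, 0.5, 0.5, 0.5]
```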

Definition formula for variance. See "Variance, definition formula."

Dependent variable. See "Variable, dependent."

Descriptive statistics. These are single values that are used to describe some property of the overall distribution of values from a sample. Three main properties are typically described: location (i.e., what is the typical value?), spread (i.e., how similar or different are all the values?), and shape (e.g., is the distribution of values symmetric? What does the shape look like compared to a reference distribution like the normal distribution?). These values are calculated from sample data, usually with the goal of estimating what the corresponding values for the population are (i.e., its parameters). The term descriptive statistics may also refer to summary values for a population, however, since the more appropriate term "descriptive parameters" is not in widespread use.

Discrete variable. See "Variable, discrete."

Distribution, probability. This is a set of values that represents the relationship between the probability of seeing an outcome (plotted on the Y-axis) versus the different outcomes (plotted on the X-axis). Typically, the system is defined such that we look at outcomes defined as successes and failures from a set of trials, with the X-axis values corresponding to all possible numbers of successes for the trials and the Y-axis values showing the probability of each.
There are a large number of probability distributions, each of which corresponds to a different scenario. These include the binomial probability distribution, the normal probability distribution, and the Poisson probability distribution.
While these probability distributions can be used to answer questions such as what the probability of a particular number of outcomes is, more often they are used to determine the probability of ranges of outcomes (e.g., more than 5 successes, between 12 and 17 successes). Most statistical tests involve calculating a test statistic and determining the probability of obtaining a value that large (or one more extreme) by using a probability distribution.

Excess kurtosis. This value is obtained by calculating the kurtosis of a data distribution and subtracting 3, the kurtosis of a normal distribution. This is done to facilitate comparison to the normal distribution and identification of the distribution as leptokurtic or platykurtic. A leptokurtic distribution will have a positive excess kurtosis whereas a platykurtic distribution will have a negative excess kurtosis.
More detail is available on the topic page for summary statistics.

F distribution. See "Distribution, F."

Factorial. This is a mathematical operation on non-negative integers used in some probability calculations. The factorial of N is N multiplied by each smaller positive integer all the way down to one, and it is represented with the symbol N!. By definition, 1! = 1 and 0! = 1.
For example, 5! = 5x4x3x2x1 = 120.
For example, 8! = 8x7x6x5x4x3x2x1 = 40,320.
The values of the factorial function increase very rapidly; it is often not practical to calculate the factorial for large integers.
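A direct Python version of this definition is sketched below. The function is written out for illustration only; Python's standard library already provides math.factorial, which is used here as a cross-check.

```python
import math

def factorial(n):
    """N! for a non-negative integer N, with 0! = 1 and 1! = 1."""
    if n < 0:
        raise ValueError("factorial is only defined for non-negative integers")
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

print(factorial(5))   # 120
print(factorial(8))   # 40320
print(factorial(0))   # 1
print(factorial(20) == math.factorial(20))   # True

# The rapid growth is easy to see from the number of digits in 100!.
print(len(str(factorial(100))))   # 158
```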

Gaussian distribution. See "Distribution, normal."

Geometric mean. See "Mean, geometric."

Goodness of fit test. See "Chi-squared test of goodness of fit."

HA (also, H_A). See "Hypothesis, alternative."

Harmonic mean. See "Mean, harmonic."

Hartley's Fmax test. See "Variance test, Hartley's Fmax."

Heterogeneous. Literally, this means the non-uniformity of a system and contrasts with the term homogeneous. For example, in a broader context, this can apply to heterogeneous materials being lumpy or a heterogeneous musical piece having a variety of different tempos or melodies. In statistics this generally refers to the lack of uniformity in one or more sets of data values, most often measured via more than one variance. To avoid confusion, the term heteroscedastic is often used when discussing the lack of similarity of variances.

Heteroscedastic. This describes a situation in which the variances are different for different data sets or portions of a data set. This is in contrast to data that is homoscedastic in which case the variances are the same for different data sets or portions of a data set. For example, if there are two data sets being considered and one has very similar values and a variance of 2.1 whereas the second set has much more variable values and a variance of 38.7 we would say that these data sets are heteroscedastic. Similarly, if we are looking at an XY plot and the Y values are very similar for small X values, but very variable for larger X values, we would say this data is heteroscedastic. This is important because many statistical tests (e.g., ANOVA, linear regressions) require that the values being used in the analyses have equal variances and are not heteroscedastic. If the data set(s) in question are heteroscedastic, then only specific tests that can handle such data can be used or a "data transformation" may be applied to the data to equalize the variances and make the data set(s) homoscedastic.

Heteroscedastic t test. See "t test, heteroscedastic."

HO (also, H_O). See "Hypothesis, null."

Homogeneous. Literally, this means the uniformity of a system and contrasts with the term heterogeneous. For example, in a broader context, this can apply to homogeneous materials having an even consistency or a homogeneous musical piece having the same tempo and melody throughout its length. In statistics this generally refers to the uniformity in one or more sets of data values, most often measured via more than one variance. To avoid confusion, the term homoscedastic is often used when discussing the similarity of variances.

Homoscedastic. This describes a situation in which the variances are the same for different data sets or portions of a data set. This is in contrast to data that is heteroscedastic in which case the variances are different for different data sets or portions of a data set. This is important because many statistical tests (e.g., ANOVA, linear regressions) require that the data be homoscedastic and not heteroscedastic. If the data values in question are not homoscedastic, then only specific tests that can handle such data can be used or a "data transformation" may be applied to the data to equalize the variances and make the data values homoscedastic.

Homoscedastic t test. See "t test, homoscedastic."

Independent variable. See "Variable, independent."

Interquartile range (IQR). This is the difference between the first and third quartiles, IQR = Q3 - Q1. This statistic is used as a measure of variation or spread of the data. Because it is based on median-type statistics it is robust, that is, generally similar for different samples from the same population and not influenced by outliers.
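A minimal Python sketch of the IQR is given below. It relies on statistics.quantiles with its default settings, which is an assumption worth flagging: there are several conventions for locating quartiles, so other software may return slightly different values for the same data. The data sets are made up.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range, Q3 - Q1, using the library's default quartile rule."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1

data = [1, 2, 3, 4, 5, 6, 7]
with_outlier = [1, 2, 3, 4, 5, 6, 700]   # largest value replaced by an outlier

print(iqr(data))           # 4.0
print(iqr(with_outlier))   # 4.0, unchanged by the outlier
print(max(data) - min(data), max(with_outlier) - min(with_outlier))   # 6 699
```

The last line shows the contrast with the range, which jumps from 6 to 699 when the single outlier is introduced.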

IQR. See "interquartile range (IQR)."

Kurtosis. This statistic measures the shape of a data distribution via the "fourth moment" of the data distribution, a scaled sum of the fourth power of the differences between each data point and the mean. The scaling is usually by dividing the sum by the fourth power of the standard deviation to extract out the variation from the calculated value and leave just the shape aspect remaining. This value is often interpreted as measuring the "peakedness" of the distribution, how narrow or flattened the distribution is compared to the normal distribution which has a kurtosis of 3. Larger values indicate distributions that appear more pointy (leptokurtic), smaller values indicate distributions that appear flattened (platykurtic). Strictly speaking it is the values in the tails that influence the kurtosis, but most people find it easier to think about the shape of the peak and this suffices for most distributions.
More detail is available on the topic page for summary statistics.
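The calculation can be sketched in Python as follows. This version divides by the number of values throughout (the population form), which is an assumption made for simplicity; formulas with small-sample corrections give somewhat different results. The function name and data are made up.

```python
def kurtosis(values):
    """Fourth moment about the mean divided by the squared variance."""
    n = len(values)
    mean_value = sum(values) / n
    m2 = sum((v - mean_value) ** 2 for v in values) / n   # variance (divide by n)
    m4 = sum((v - mean_value) ** 4 for v in values) / n   # fourth moment
    return m4 / m2 ** 2   # m2 ** 2 equals the fourth power of the standard deviation

k = kurtosis([1, 2, 3, 4, 5])   # evenly spread values, no peak
excess = k - 3                  # excess kurtosis relative to the normal distribution

print(round(k, 4), round(excess, 4))   # 1.7 -1.3
```

The negative excess kurtosis marks this evenly spread data set as platykurtic.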

Leptokurtic. This describes a distribution that has a kurtosis value larger than that of the normal distribution, reflecting a shape that is more pointy - more values close to the mean and fewer values farther away, relative to a normal distribution with the same variance.

Levene's test. See "Variance test, Levene's."

Mean (arithmetic). This is the most commonly used "average" for a set of values. This is found by summing all the values and dividing by how many there are.
For example, the arithmetic mean of 3, 4, and 5 would be (3+4+5)/3 = 12/3 = 4, while the arithmetic mean of 3, 4, 5, and 6 would be (3+4+5+6)/4 = 18/4 = 4.5.
A large number of statistical tests work with this value since a result from probability theory (the Central Limit Theorem) makes the tests work. For sets of values that are all positive, the arithmetic mean is never smaller than the geometric or harmonic means; the three are equal only when all the values are identical.
More detail is available on the topic page for summary statistics.

Mean (geometric). This is a less commonly used "average" for a set of values. This is found by multiplying all the values together and then taking the nth root, where n is the number of values.
For example, the geometric mean of 3, 4, and 5 would be (3x4x5)^(1/3) = 60^(1/3) = 3.9149, while the geometric mean of 3, 4, 5, and 6 would be (3x4x5x6)^(1/4) = 360^(1/4) = 4.3559.
This type of average is used when combining values that have different ranges; for example, when comparing entities with three ratings each (one on a 1-5 scale, one on a 1-20 scale and one on a 1-100 scale), use of the geometric mean prevents the third scale from overwhelming the values on the other two and allows all three ratings to contribute to the average being compared. For sets of values that are all positive, the geometric mean is never larger than the arithmetic mean and never smaller than the harmonic mean.

Mean (harmonic). This is a less commonly used "average" for a set of values. This is found by taking the reciprocal of each value, calculating the arithmetic mean of those reciprocals, and then taking the reciprocal of that mean.
For example, the harmonic mean of 3, 4, and 5 would be 1/((1/3 + 1/4 + 1/5)/3) = 1/((20/60 + 15/60 + 12/60)/3) = 1/((47/60)/3) = 1/(0.26111) = 3.8298, while the harmonic mean of 3, 4, 5, and 6 would be 1/((1/3 + 1/4 + 1/5 + 1/6)/4) = 1/((20/60 + 15/60 + 12/60 + 10/60)/4) = 1/((57/60)/4) = 1/(0.2375) = 4.2105.
This type of average is often used when calculating average rates. For sets of values that are all positive, the harmonic mean is never larger than the geometric or arithmetic means.
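The worked examples for the arithmetic, geometric, and harmonic means can be checked with Python's statistics module. The helper name three_means is made up for this sketch.

```python
from statistics import mean, geometric_mean, harmonic_mean

def three_means(values):
    """Arithmetic, geometric, and harmonic means of positive values."""
    return mean(values), geometric_mean(values), harmonic_mean(values)

for data in ([3, 4, 5], [3, 4, 5, 6]):
    am, gm, hm = three_means(data)
    print(am, round(gm, 4), round(hm, 4))
# 4 3.9149 3.8298
# 4.5 4.3559 4.2105
```

In both cases the arithmetic mean is the largest and the harmonic mean is the smallest, as expected for positive values that are not all identical.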

Median. This is a commonly used "average" for a set of values. This is found by listing all the values in order from smallest to largest and identifying a value that represents the middle. The simplest method picks the middle value (if there are an odd number of values) or the arithmetic mean of the center pair (if there are an even number of values).
For example, the median of 3, 4, and 5 would be the 4 in the middle, while the median of 3, 4, 5, and 6 would be (4+5)/2 = 9/2 = 4.5, which is the arithmetic mean of the middle two.
There are other more complicated methods to estimate the median that take the shape of the data distribution into account. There is no single universally accepted way to calculate the median. Statistical tests for medians are less common than for means because of this and the lack of mathematical results from which to derive tests based on probabilities.
More detail is available on the topic page for summary statistics.
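The simplest method described above can be written in Python as follows. This is an illustrative sketch (Python's statistics.median does the same job); the last call uses a made-up data set to show that a single extreme value does not move the median.

```python
def median(values):
    """Middle value, or the mean of the middle pair for an even count."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 4, 5]))        # 4
print(median([3, 4, 5, 6]))     # 4.5
print(median([3, 4, 5, 600]))   # 4.5, unaffected by the extreme value
```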

Mesokurtic. This describes a distribution that has a kurtosis value very close to 3, that of the normal distribution, reflecting a shape that is similar to the normal distribution.

Mid-range. This value estimates the location of a data set as the center value between the minimum and maximum values.
For example, for the data set {3, 6, 8, 9, 11, 15} the mid-range is (3+15)/2 = 9 and for the data set {1, 2, 2, 4, 8, 17} the mid-range is also (1+17)/2 = 9.
From this we can see that the mid-range is highly influenced by the presence of outliers and gives no information about the variation or shape of the data distribution. This statistic is only used in superficial descriptions of data and rarely used in analyses because it is so variable.
More detail is available on the topic page for summary statistics.
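The two mid-range examples can be reproduced with a short Python sketch. The function name mid_range is made up; the medians are printed only to show how differently the two data sets are centered despite sharing a mid-range.

```python
from statistics import median

def mid_range(values):
    """Halfway point between the smallest and largest values."""
    return (min(values) + max(values)) / 2

first = [3, 6, 8, 9, 11, 15]
second = [1, 2, 2, 4, 8, 17]

print(mid_range(first), mid_range(second))   # 9.0 9.0
print(median(first), median(second))         # 8.5 3.0
```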

Mode. This is a commonly used "average" for a set of values; it is the most frequent/common value in a set of values. The mode represents the most likely result if you were to randomly choose a single value from the data set. Many data sets (e.g., normal distributions) have their values clustered around the typical value and this statistic therefore does a good job, but for some data sets (especially highly skewed or asymmetric ones) the mode may be quite far from the other measures of location.
For example, in the data set {2,3,5,6,6,6,7,7,9} the mode is equal to "6" and the arithmetic mean is 5.667, but for the data set {1,1,1,3,5,6,6,7,8,12} the mode is "1" while the arithmetic mean is "5".
More detail is available on the topic page for summary statistics.
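These two examples can be checked in Python. The sketch uses statistics.multimode, which returns a list because a data set can have more than one most frequent value.

```python
from statistics import mean, multimode

clustered = [2, 3, 5, 6, 6, 6, 7, 7, 9]
skewed = [1, 1, 1, 3, 5, 6, 6, 7, 8, 12]

print(multimode(clustered), round(mean(clustered), 3))   # [6] 5.667
print(multimode(skewed), mean(skewed))                   # [1] 5
```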

Noise. This is variation or a pattern in the data arising from random and unknown sources. Statistical techniques are generally designed to handle this sort of variation. The mathematical approach used in statistics takes noise into account by using the variation observed and the sample size in the comparison of any observed patterns; only when the pattern differs by more than would be expected from noise do we attribute the pattern to a non-random process. Noise can lead us to miss identifying the existence of a process that is present in the system under study, but is unlikely to cause us to conclude that non-random processes exist when they don't. Noise can therefore cause Type II errors.
For example, if we are comparing the sizes of frogs in two ponds that differ in algae content and the algae influences size, we hope our data will allow us to attribute size differences to differences in the algae content. However, if individuals in the two frog populations differ genetically from one another in a relevant manner (i.e., frogs in each pond have alleles that influence size), this noise can obscure the pattern caused by the algae.
The key thing to remember is that this is a random factor, not bias or unknown consistent processes. Compare this description to the one for "bias" on this page.

Normal distribution. See "Distribution, normal."

Null hypothesis. See "Hypothesis, null."

p value. When performing a statistical test, the p value is technically defined as the smallest alpha value you could select and still reject the null hypothesis. This probability is calculated using a specific probability distribution (e.g., t or F distributions). Conceptually, the p value is the probability that the test statistic calculated from the sample data would be at least as extreme as it is purely due to sampling error (i.e., statistical noise) if the null hypothesis were true. When this value is smaller than our predetermined probability threshold, we conclude that something other than sampling error likely accounts for the observed sample data and reject the null hypothesis. The threshold of 5% is most commonly used and this is the threshold at which results are often said to be statistically "significant." For example, in a test comparing the means of two groups, if a test statistic as extreme as the one calculated has less than a 5% chance of arising purely due to sampling error, we would conclude that the means are "significantly different" from one another.

Platykurtic. This describes a distribution that has a kurtosis value smaller than that of the normal distribution, reflecting a shape that is flattened - fewer values close to the mean and more values farther away, relative to a normal distribution with the same variance.

Poisson distribution. See "Distribution, Poisson."

Pooled within-group variance. See "Variance, within-group pooled."

Precision. This is a measurement of how close to the true value a measured value is, when the deviation arises due to sampling error or noise. This is contrasted with "accuracy" which is when the deviation arises due to systematic error or bias. For example, a thermometer that reads the temperature up to 2 degrees higher or lower than the true temperature, but where the average is the same as the true value has poor precision even though it may have good accuracy (i.e., the average value is exactly correct).

Probability distribution. See "Distribution, probability."

Qualitative variable. See "Variable, qualitative."

Quantitative variable. See "Variable, quantitative."

r. See "Correlation coefficient (r)."

R². See "Coefficient of determination (R²)."

Range. This is a "variation" statistic for a set of values. This value is the distance between the largest and smallest values, calculated as the difference between the maximum and minimum values in the data set.
For example, for the data set {8, 11, 13, 14, 16, 20} the range is 20 - 8 = 12 and for the data set {1, 2, 2, 4, 8, 13} the range is also 13 - 1 = 12.
From this we can see that the range is highly influenced by the presence of outliers and gives no information about the location or shape of the data distribution. This statistic is only used in superficial descriptions of data and rarely used in analyses because it is so variable.
More detail is available on the topic page for summary statistics.

Ranked variable. See "Variable, ranked."

Response variable. See "Variable, response."

SD. See "Standard deviation."

Significant digits. See "significant figures."

Standard deviation (SD). This is a commonly used "variation" statistic for a set of values. This value is calculated as the square root of the variance (see "variance"). When the data is normally distributed approximately 68% of the values are within one standard deviation of the mean, 95% of the values are within two standard deviations of the mean, and 99.7% of the values are within three standard deviations of the mean.
More detail is available on the topic page for summary statistics.
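The coverage figures for a normal distribution can be checked numerically. The sketch below uses NormalDist from Python's statistics module; the helper name coverage is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(coverage(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```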

Student's t distribution. See "Distribution, Student's t."

Variable. This is a value from a data set that is being analyzed. Statistical tests and procedures are designed to work with specific types of variables. Most often a variable is a number, but sometimes it is a property or other descriptor.

Variable, continuous. This is a variable that can take any quantitative value within a range; theoretically, an infinite number of such values would be possible. Examples of continuous variables include length, mass, and temperature. In practice the measurement method used will create a non-infinite set of possible values, but as long as the possible number of values is extremely high then the variable can be considered continuous and therefore appropriate for mathematical techniques that use assumptions about continuous functions (i.e., calculus based methods).

Variable, discrete. This is a numerical variable where the values consist exclusively of whole numbers and the values represent magnitude. Examples of discrete variables include counts of the occurrence of events, number of definable objects (e.g., scales, visible bands in a gel, individuals in a herd), or duration of time when rounded to whole minutes or hours.

Variable, qualitative. This is a variable with non-numerical values. Examples of qualitative variables can include colors, named categories, rough magnitudes (e.g., small, medium, large or cold, warm, hot). The key thing is that the values are categories, not numbers. Contrast this with quantitative variables.

Variable, quantitative. This is a variable with numerical values. Examples of quantitative variables include integers, fractions, percentages, decimals, etc. Contrast this with qualitative variables.

Variable, ranked. This type of variable gives the rank of the value compared to others. The order of the ranks has meaning, but the difference between the ranks does not need to be equal. For example, for a set of ranked values it may be the case that the difference between first and second is great, but the difference between nineteenth and twentieth may be small. This is an important distinction between ranked variables and other quantitative variables.

Variance. This is a commonly used "variation" statistic for a set of values. To calculate this value, two steps are required. First, each value in the data set is compared to the mean by squaring the difference and these values are all summed up. Second, this sum is divided by the number of data values if the data set is a population or divided by one less than the number of data values if the data set is a sample. In almost all cases, since the purpose of the vast majority of statistical calculations involves using a sample to estimate a population value, the second of these methods is used.
More detail is available on the topic page for summary statistics.
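The two steps, and the difference between the population and sample versions, can be sketched in Python. The function name sum_of_squares and the data set are made up for illustration; pvariance and variance from the statistics module are used as a cross-check.

```python
from statistics import pvariance, variance

def sum_of_squares(values):
    """Step one: sum of squared differences between each value and the mean."""
    mean_value = sum(values) / len(values)
    return sum((v - mean_value) ** 2 for v in values)

data = [3, 4, 5, 6]
ss = sum_of_squares(data)

# Step two: divide by N for a population, or by N - 1 for a sample.
population_var = ss / len(data)
sample_var = ss / (len(data) - 1)

print(ss, population_var, round(sample_var, 4))          # 5.0 1.25 1.6667
print(pvariance(data), round(variance(data), 4))         # 1.25 1.6667
```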

Variation. This term is not a technical term in statistics. A number of other terms describe how to calculate the "variation" or "spread" of values in a given data set; examples of these include: variance, standard deviation, range, and IQR.
