**Use of multiple comparison analysis tests**

Differences between specific pairs of groups are called *“pairwise”* differences. ANOVA does not provide tests of pairwise differences. When the researcher needs to test pairwise differences, follow-up tests called *post hoc* tests are required.

The t-test, however, was not designed for *post hoc* testing. Each pairwise t-test is evaluated at the nominal alpha level, so across many pairwise tests the overall chance of a false positive accumulates well beyond that level (2). As a result, performing multiple t-tests leads the researcher to a higher probability of making a Type I error; that is, the researcher is much more likely to report significant differences between some pairs that have no real difference (1).
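This alpha inflation can be illustrated with a short calculation (a standard probability result, sketched here rather than taken from the article): for m independent comparisons each run at a per-test alpha, the chance of at least one false positive is 1 − (1 − alpha)^m.

```python
# Familywise Type I error rate for m independent pairwise tests,
# each run at the same per-test alpha.
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m

for m in (1, 3, 6, 10):  # e.g. 4 groups yield 6 pairwise comparisons
    print(m, round(familywise_error_rate(0.05, m), 3))
```

With six pairwise tests at alpha = 0.05, the familywise error rate is already about 0.26, roughly five times the nominal level.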

The appropriate follow-up procedures are collectively called *multiple comparison analysis*. One of the multiple comparison analysis statistics should be used to examine pairwise and subgroup differences after the full ANOVA has found significance. The key tests of pairwise differences include the Bonferroni, Scheffé, Tukey, Newman-Keuls and Dunnett methods.

**Categories of contrasts**

A simple contrast is a test of the difference between two individual groups, for example between *Experimental Group 1* and *Control Group 2*. A complex contrast is a test of the difference between combinations of groups. An example of a complex contrast is a test of the difference between a subgroup created by combining *Experimental Groups 1, 2 and 4* and a subgroup created by combining *Control Groups 1 and 3*. The purpose of ANOVA is either to test theory or to generate theory, and multiple comparison analysis may be used to support either purpose.

**Tests for comparing pairs**

*The Tukey method*

The Tukey method uses the *“q”* statistic to determine whether group differences are statistically significant. The *q* statistic is obtained by subtracting the smallest group mean from the largest and dividing that difference by the standard error of the mean (4). The standard error is calculated from the Mean Square Within (MS_w), a statistic provided by the ANOVA output in virtually all statistical analysis programs (5): it equals the square root of MS_w divided by the group sample size, i.e. √(MS_w/n). The obtained *q* value can then be compared with a table of critical *q* values to determine whether the *q* value for a particular pair exceeds the critical value needed to achieve statistical significance. If the *q* value meets or exceeds the critical value, that pair’s difference is statistically significant.
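The computation just described can be sketched with the standard library alone; the three groups and their values below are hypothetical, invented for illustration.

```python
from statistics import mean

# Toy data: three groups of equal size n (hypothetical values, for illustration).
groups = [[4.1, 5.0, 4.6, 4.8], [5.9, 6.3, 5.7, 6.1], [4.9, 5.2, 5.0, 5.3]]
n = len(groups[0])
k = len(groups)

means = [mean(g) for g in groups]

# Mean Square Within: pooled within-group sum of squares over N - k degrees of freedom,
# the same quantity reported in the ANOVA table.
ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
ms_within = ss_within / (sum(len(g) for g in groups) - k)

# q = (largest mean - smallest mean) / standard error, where SE = sqrt(MS_w / n).
se = (ms_within / n) ** 0.5
q = (max(means) - min(means)) / se
print(round(q, 2))  # compare against the critical q for k groups and N - k df
```

The resulting *q* would then be checked against a table of critical values (or a studentized range distribution) for k groups and N − k degrees of freedom.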

Pairs are tested in order of decreasing mean difference, and testing stops as soon as a pair’s *q* value is not significant; no further pairs need be tested because they will not be significant. Tukey uses a fairly conservative estimate of alpha. It tests all the contrasts as a family and thus has somewhat less power to find differences between pairs. In this context, *family* refers to the *familywise error rate* (6), which addresses the likelihood of making a Type I error and thus a false discovery. *Family* tests reduce the possibility of making a false claim of significance (6), and should be used when the consequences of falsely reporting a significant difference are greater than the consequences of failing to find a difference. Family tests provide more confidence in the results because they make fewer Type I errors (5,7).

*The Newman-Keuls method*

**Tests for comparing multiple groups**

*The Scheffé method*

The Scheffé method is one of the most flexible of the *post hoc* tests. The Scheffé is a good exploratory statistic because it tests all possible comparisons. As a result, it allows the researcher to observe which groups or combinations of groups produced the significant difference found in the original ANOVA test. This is one method of exploratory data analysis, a strategy for discovering previously unknown differences among study groups, or for discovering whether hypotheses based on very limited theory can be supported.

*The Bonferroni (Dunn) method*

The Bonferroni is a *family* contrasts comparison method, so it does not inflate alpha to the extent that other types of multiple comparison analyses (such as the Newman-Keuls method) do. Additionally, like the Scheffé method, the Bonferroni method can test complex pairs. However, the Bonferroni statistic is not a tool for exploratory data analysis: it requires the researcher to specify all contrasts to be tested in advance, and the researcher must have sufficient theory about the phenomena of interest to know which contrasts to specify. As a result, it is a better test for confirming theory about the experimental groups’ results than exploratory methods such as the Scheffé. Because the Bonferroni method limits the number of tests to those specified in advance, it reduces alpha inflation, and this reduced probability of a Type I error is its great advantage. However, it cannot make serendipitous discoveries, and it provides less information on differences among the groups because not all differences are tested.
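A minimal sketch of the Bonferroni adjustment follows; the contrast names and p-values are invented for illustration. Each of the m pre-specified contrasts is tested at alpha/m, which holds the familywise error rate at or below alpha.

```python
# Bonferroni adjustment: each of m pre-specified contrasts is tested at
# alpha / m, keeping the familywise error rate at (or below) alpha.
alpha = 0.05

# Hypothetical p-values for four contrasts specified in advance.
p_values = {
    "E1 vs C1": 0.004,
    "E2 vs C1": 0.030,
    "E1+E2 vs C1": 0.011,
    "E1 vs E2": 0.200,
}

m = len(p_values)
per_test_alpha = alpha / m  # 0.0125 for m = 4

for contrast, p in p_values.items():
    verdict = "significant" if p < per_test_alpha else "not significant"
    print(f"{contrast}: p = {p} -> {verdict}")
```

Note that a contrast such as "E2 vs C1" (p = 0.030) would pass an unadjusted 0.05 threshold but fails the stricter per-test alpha, which is exactly how the method trades power for protection against Type I error.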

*The Dunnett method*

**Summary**

A variety of *post hoc* tests are available to further explicate the group differences that contribute to significance in an ANOVA test. Each test has specific applications, advantages and disadvantages (Table 1). It is therefore important to select the test that best matches the data, the kinds of information needed about group comparisons, and the necessary power of the analysis. It is also important to select a test that fits the research situation in terms of theory generation versus theory testing. The consequences of poor test selection are typically Type I errors, but may also involve failure to discover important differences among groups. Multiple comparison analysis tests are extremely important because, while the ANOVA provides much information, it does not provide detailed information about differences between specific study groups, nor can it provide information on complex comparisons. The secondary analysis with these *post hoc* tests may provide the researcher with the most important findings of the study.

**Potential conflict of interest**

*Table 1. Comparison of different multiple comparison analysis statistics.*