Basic statistics and the inconsistency of multiple comparison procedures

Canadian Journal of Experimental Psychology, Sep 2003 by Saville, David J

The main reason that I prefer the simplest of formal procedures is the "inconsistency" of the other multiple comparison procedures. In brief, I call a procedure "inconsistent" if the probability of judging two treatments to be different depends on either the number of treatments included in the statistical analysis, or on the values of the treatment means for the remaining treatments. More precisely, in Saville (1990) I call a procedure "consistent" if the "decision it generates as to whether two population means are different is dependent only on (a) the difference between the two sample means, (b) the standard error of this difference, (c) the number of error degrees of freedom, and (d) the significance level at which the procedure is operated" (p. 177). To illustrate the undesirability of inconsistency, I shall now present an example.

An Example

Suppose (fictitiously) that we have 32 treatment programs for problem gamblers that have been included in a trial involving 512 problem gamblers, with 16 gamblers randomly allocated to each treatment program. Each gambler is subjected to a battery of psychological tests prior to the treatment program, and again at the completion of the program. The data we analyze are the increases in a standardized score that is a total over all of the tests in the battery (with scales converted so that a low value corresponds to a poor psychological state, and a high value corresponds to a good psychological state).

Suppose that the (fictitious) mean increases in score for the 32 treatment programs, sorted into ascending order, are 158, 159, 161, 163, 164, 166, 167, 167, 169, 169, 170, 170, 171, 173, 173, 174, 175, 175, 176, 176, 179, 180, 182, 182, 183, 183, 185, 185, 186, 188, 189, and 190, with a pooled variance estimate of s^2 = 354.6 (pooled SD = s = 18.8) with 480 residual degrees of freedom. The common standard error of a mean (SEM) is 4.708, the common standard error of a difference (SED) is 6.658, and the least significant difference at the 5% level, LSD(5%), is 13.1.
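As a rough check on these summary figures, the following minimal Python sketch recomputes them from the pooled variance and the group size of 16. It assumes the usual equal-replication formulas (SEM = s/sqrt(n), SED = SEM x sqrt(2), LSD = t x SED). The helper t_quantile_approx is a hypothetical name introduced here, not part of Genstat or of the original analysis: it approximates the Student's t quantile with a two-term Cornish-Fisher expansion so that only the Python standard library is needed, and an exact t quantile from a statistics package could be substituted.

```python
from math import sqrt
from statistics import NormalDist


def t_quantile_approx(p, df):
    """Approximate Student's t quantile (assumption: a two-term
    Cornish-Fisher expansion in 1/df, adequate for moderate to large df)."""
    z = NormalDist().inv_cdf(p)
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    return z + g1 / df + g2 / df**2


pooled_variance = 354.6   # s^2, as given in the text
n_per_treatment = 16      # gamblers per treatment program
n_treatments = 32         # treatment programs in the full analysis

# Residual degrees of freedom: (n - 1) per treatment, summed over treatments
residual_df = n_treatments * (n_per_treatment - 1)

sd = sqrt(pooled_variance)                 # pooled standard deviation
sem = sd / sqrt(n_per_treatment)           # standard error of a mean
sed = sem * sqrt(2)                        # standard error of a difference
lsd_5pct = t_quantile_approx(0.975, residual_df) * sed  # two-sided 5% LSD

print(f"residual df = {residual_df}")
print(f"SD = {sd:.1f}, SEM = {sem:.3f}, SED = {sed:.3f}")
print(f"LSD(5%) = {lsd_5pct:.1f}")
```

Rounded to the precision used in the text, the sketch gives the same residual degrees of freedom, SD, SEM, SED, and LSD(5%) as quoted above.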

I shall use this fictional data set to illustrate the differences between the following multiple comparison procedures (MCPs): the Bonferroni procedure, Tukey's honest significant difference (HSD) procedure, the Student-Newman-Keuls multiple range test (MRT), Fisher's restricted LSD procedure, Duncan's multiple range test, and the unrestricted LSD procedure. I shall especially focus on the significance of the difference between two of the most popular treatment programs, referring to them as programs A and B (with mean increases in score of 161 and 186, respectively); the data for these two programs are displayed as histograms in Figure 3. For each MCP, I shall first analyze the full data set, then a subset of 13 treatments (including A and B), then a subset of four treatments (including A and B), and lastly a subset of just the two treatments A and B. I have artificially arranged that in all of these analyses, the pooled variance estimate is s^2 = 354.6, so the SEM is 4.708 and the SED is 6.658 in all analyses. The residual d.f. varies from 480 to 195 to 60 to 30 in the four analyses, and as a result, the LSD(5%) varies within the range 13.1 to 13.6. All analyses are carried out using the analysis of variance and "allpairwise" routines in the statistical package Genstat (Genstat Committee, 2002).
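The residual degrees of freedom quoted for the four analyses follow from the design, assuming each analysis is a one-way analysis of variance with 16 gamblers per treatment. The short Python sketch below reproduces them and also expresses the A versus B difference in units of the common SED. The function name residual_df is a hypothetical helper introduced for illustration only.

```python
N_PER_TREATMENT = 16   # gamblers per treatment program, from the text
SED = 6.658            # common standard error of a difference, from the text


def residual_df(n_treatments, n_per_treatment=N_PER_TREATMENT):
    """Residual d.f. for a one-way ANOVA with equal replication."""
    return n_treatments * (n_per_treatment - 1)


# The four analyses: full data set, then subsets of 13, 4, and 2 treatments
subset_sizes = (32, 13, 4, 2)
dfs = [residual_df(k) for k in subset_sizes]

mean_a, mean_b = 161, 186          # mean increases for programs A and B
difference = mean_b - mean_a       # the same in every analysis
difference_in_seds = difference / SED

for k, df in zip(subset_sizes, dfs):
    print(f"{k:2d} treatments -> residual d.f. = {df}")
print(f"A vs B difference = {difference} ({difference_in_seds:.2f} SEDs)")
```

The difference between A and B and its standard error are the same in all four analyses; only the residual degrees of freedom, and the number of other treatments present, change.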


 
