GLM: Unequal n Designs

 

Reading: SPSS Base 9.0: 20. GLM Univariate Analysis
Homework:
Download: glm_uneqn.sav       (Download Tips)

  1. Overview
  2. Rule of Thumb
  3. Weighted and Unweighted Means
  4. Which Means are Displayed by GLM?
  5. Sums of Squares Types in GLM
  6. Post Hoc Tests for Unequal ns Using GLM General Factorial
  7. Simple Main Effects for Unequal n Designs
  8. Running the Tukey HSD Test by Hand
  9. Interaction Effects: Creating a Oneway Effect from an Interaction

1. Overview

There are three different kinds of unequal n problems in analysis of variance. One is the case where you have unequal ns for different levels of a factor, but the unequal ns are proportional. For example, you may have twice as many female subjects as male subjects, and each cell of the design has twice as many females as males. A second case is where an entire cell of the design is missing for some reason. A third case is where you intended to have equal ns but some subjects did not show up, or you had an equipment failure, or for any number of other reasons you ended up with some people in every cell, but not an equal number of people in each cell. The purpose of this set of notes is to provide a guide for the third case, that is, the case where you happen to have unequal ns in the cells of your experiment because of random occurrences, not because you designed your study that way.



2. Rule of Thumb

RULE OF THUMB

If you have an unequal n design, then use GLM with sums of squares Type III (the default method in GLM).

Report the Estimated Marginal Means (the unweighted means).

Run post hoc tests and simple main effects analyses using the Compare Main Effects option under Estimated Marginal Means.

The remainder of this set of notes provides the rationale for this recommendation.



3. Weighted and Unweighted Means

Here is what a data set with missing values might look like. Notice the frequencies within each of the cells.

Table 1. Data with Unequal ns

                B1            B2            B3            B4            Total
A1     Valid n  n = 2         n = 5         n = 3         n = 7         nA1 = 17
       Sum      10            19            22            53            SumA1 = 104
       Mean     5.00          3.80          7.33          7.57          MA1 = 6.12
A2     Valid n  n = 5         n = 4         n = 4         n = 2         nA2 = 15
       Sum      8             12            22            20            SumA2 = 62
       Mean     1.60          3.00          5.50          10.00         MA2 = 4.13
Total  Valid n  nB1 = 7       nB2 = 9       nB3 = 7       nB4 = 9       N = 32
       Sum      SumB1 = 18    SumB2 = 31    SumB3 = 44    SumB4 = 73    Sum.. = 166
       Mean     MB1 = 2.57    MB2 = 3.44    MB3 = 6.29    MB4 = 8.11    GM = 5.19

The main effect means (MA1, MA2, MB1, etc.) are found by adding the sum of the scores for everyone in that level (e.g. everyone in level B1) and dividing by the total number of people in that level. For example, the B1 mean is

    MB1 = (SumA1B1 + SumA2B1) / (nA1B1 + nA2B1)
            = (10 + 8) / (2 + 5)
            = 18 / 7
            = 2.57

Rather than using the sums and ns to compute the marginal means, you can compute the marginal means from the individual cell means. There are two alternative strategies when using the cell means to create the marginal means: the means can be created as a weighted mean or as an unweighted mean.

The Weighted Mean

The weighted mean is found by weighting each of the cell means by its respective n (i.e., multiply each cell mean by its n) and then dividing by the total n for that marginal mean. For example, the mean for column B1 would be found by taking the mean for A1B1 (MA1B1); multiplying that mean by the n for that cell (nA1B1); taking the mean for cell A2B1, multiplying that mean by the n for that cell, then summing those weighted means and dividing by the total number of cases in B1.

 Weighted MB1 =

     = ((nA1B1 * MA1B1) + (nA2B1 * MA2B1)) /  (nA1B1 + nA2B1)      
     = ((2 * 5) + (5 * 1.6)) / (2 + 5)
     = (10 + 8) / (2 + 5)
     = 18 / 7
     = 2.57    

A "weighted" mean is a mean that is weighted by the frequencies in each cell. In computing the mean for B1, for example, the mean for the two people in cell A1B1 was given less weight than the mean for the five people in cell A2B1. This biases the mean for B1 towards the mean of cell A2B1.
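The weighting for column B1 can be checked with a few lines of Python. This is only a sketch of the arithmetic shown above; the variable and function names are made up for the illustration, and the cell means and ns are copied from Table 1.

```python
# Cell means and cell ns for column B1, copied from Table 1.
cell_means = {"A1B1": 5.00, "A2B1": 1.60}
cell_ns = {"A1B1": 2, "A2B1": 5}


def weighted_mean(means, ns):
    """Multiply each cell mean by its n, sum, and divide by the total n."""
    total_n = sum(ns.values())
    return sum(ns[cell] * means[cell] for cell in means) / total_n


weighted_b1 = weighted_mean(cell_means, cell_ns)

# Share of the marginal mean carried by each cell (its n over the total n).
total_n = sum(cell_ns.values())
shares = {cell: n / total_n for cell, n in cell_ns.items()}

print(f"Weighted MB1 = {weighted_b1:.2f}")
for cell, share in shares.items():
    print(f"{cell} carries {share:.0%} of the weight")
```

The shares make the bias visible: the cell with five cases carries most of the weight, so the weighted mean for B1 sits closer to the A2B1 mean than to the A1B1 mean.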

The weighted row and column means for this data are shown in Table 2. 

Table 2. Data with Unequal ns, Weighted Means

                B1       B2       B3       B4       Total    Weighted Mean
A1     Valid n  n = 2    n = 5    n = 3    n = 7    n = 17
       Sum      10       19       22       53       104
       Mean     5.00     3.80     7.33     7.57     6.12     6.12
A2     Valid n  n = 5    n = 4    n = 4    n = 2    n = 15
       Sum      8        12       22       20       62
       Mean     1.60     3.00     5.50     10.00    4.13     4.13
Total  Valid n  n = 7    n = 9    n = 7    n = 9    n = 32
       Sum      18       31       44       73       166
       Mean     2.57     3.44     6.29     8.11     5.19
Weighted Mean   2.57     3.44     6.29     8.11              5.19

As shown in Table 2, the weighted means are identical to the means presented in Table 1.  That is, the weighted means are identical to the means found by summing the scores in the relevant cells and then dividing by the number of cases in those cells.  This is the "normal" way of finding the mean; it gives more weight to those cells that happen, by chance, to have more people in them.

The Unweighted Mean

Another way of finding the main effect means would be to find the average of the cell means without weighting the means by the number of people in each cell. For example, the unweighted mean for B1 would be the average of the means for cells A1B1 (5.00) and A2B1 (1.60).

   Unweighted MeanB1 =  (MEANA1B1 + MEANA2B1) / 2
                        = (5.00 + 1.60) / 2
                        = (6.60) / 2
                        =  3.30   

The unweighted mean for B1 is not biased towards the cell with the largest n. The weighted and unweighted means for this data set are shown in Table 3.

Table 3. Data with Unequal ns, Weighted and Unweighted Means

                B1       B2       B3       B4       Weighted Mean  Unweighted Mean
A1     Valid n  n = 2    n = 5    n = 3    n = 7
       Sum      10       19       22       53
       Mean     5.00     3.80     7.33     7.57     6.12           5.93
A2     Valid n  n = 5    n = 4    n = 4    n = 2
       Sum      8        12       22       20
       Mean     1.60     3.00     5.50     10.00    4.13           5.03
Weighted Mean   2.57     3.44     6.29     8.11     5.19
Unweighted Mean 3.30     3.40     6.42     8.79                    5.48
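
The marginal means in Table 3 can be recomputed from the cell sums and ns in Table 1. The following Python sketch is one way to do that; the function and variable names are invented for the illustration, and the cell means are recomputed from the sums rather than typed in, so the results may differ from the table in the last decimal place.

```python
# Cell sums and cell ns from Table 1, indexed as [level of A][level of B].
sums = [[10, 19, 22, 53],
        [8, 12, 22, 20]]
ns = [[2, 5, 3, 7],
      [5, 4, 4, 2]]

# Cell means are recomputed from the sums so no rounding error is carried along.
means = [[s / n for s, n in zip(row_s, row_n)] for row_s, row_n in zip(sums, ns)]


def weighted(cell_sums, cell_ns):
    """Sum of all scores in the level divided by the number of cases in the level."""
    return sum(cell_sums) / sum(cell_ns)


def unweighted(cell_means):
    """Plain average of the cell means, ignoring the cell ns."""
    return sum(cell_means) / len(cell_means)


# A main effect: collapse across the columns of each row.
weighted_a = [weighted(s, n) for s, n in zip(sums, ns)]
unweighted_a = [unweighted(m) for m in means]

# B main effect: collapse across the rows of each column.
weighted_b = [weighted(s, n) for s, n in zip(zip(*sums), zip(*ns))]
unweighted_b = [unweighted(m) for m in zip(*means)]

# Grand means: all 32 scores versus the average of the 8 cell means.
weighted_gm = weighted([s for row in sums for s in row],
                       [n for row in ns for n in row])
unweighted_gm = unweighted([m for row in means for m in row])

labels = ["A1", "A2", "B1", "B2", "B3", "B4", "Grand"]
all_weighted = weighted_a + weighted_b + [weighted_gm]
all_unweighted = unweighted_a + unweighted_b + [unweighted_gm]
for label, w, u in zip(labels, all_weighted, all_unweighted):
    print(f"{label:<6} weighted = {w:.2f}   unweighted = {u:.2f}")
```

The two kinds of means differ most for B4 and A2, the levels whose cells have the most lopsided ns.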

Analysis of variance partitions each individual's score into two components: the predictable part, which is the distance from the grand mean to the mean of that person's group, and the part that is error, which is the distance between the group mean and the individual's score. Some methods of analysis of variance use the weighted mean as the group mean, while other methods use the unweighted mean as the group mean.



4. Which Means are Displayed by GLM?

You can display the means in the GLM - General Factorial procedure in one of two ways; both options are accessible from the Options... dialog box. You can check the Descriptive Statistics box in the Display section at the bottom left of the dialog box. Or, at the top of the dialog box, you can ask to display the Estimated Marginal Means by moving the appropriate main effects and interactions to the Display Means for... window. The output from the descriptive statistics option is shown in Table 4. The output for the estimated marginal means option is shown in Tables 5 (A main effect), 6 (B main effect), and 7 (A*B interaction).

Table 4. Descriptive Statistics
A      B      Mean   S.D.   N
1      1      5.00   1.41   2
       2      3.80   .84    5
       3      7.33   .58    3
       4      7.57   .98    7
       Total  6.12   1.93   17
2      1      1.60   .55    5
       2      3.00   .82    4
       3      5.50   .58    4
       4      10.00  1.41   2
       Total  4.13   2.92   15
Total  1      2.57   1.81   7
       2      3.44   .88    9
       3      6.29   1.11   7
       4      8.11   1.45   9
       Total  5.19   2.61   32

The eight means for the A*B interaction are the same as those shown in the earlier tables (cf. Table 3). Of interest here are the summary means indicated as Totals in Table 4.

The mean for Total in row A1 (M = 6.12) is the weighted mean for level 1 of the A main effect. The mean for Total in row A2 (M = 4.13) is the weighted mean for level 2 of the A main effect (cf. Table 3).

The B main effect means are displayed in the Total section at the bottom of Table 4. Total1 (M = 2.57) is the weighted mean for level 1 of the B main effect. Total2 (M = 3.44) is the weighted mean for level 2 of the B main effect. Total3 (M = 6.29) is the weighted mean for level 3 of the B main effect. Total4 (M = 8.11) is the weighted mean for level 4 of the B main effect. And Totaltotal (M = 5.19) is the weighted grand mean.

The summary statistics provided by the Descriptive Statistics option are the weighted means.

Table 5. Estimated Marginal Means for the A Main Effect
Independent Variable A   Mean    Std. Error
1                        5.93    .231
2                        5.03    .233

The estimated marginal means for the A and B main effects are the unweighted means (cf. Table 3).

The summary statistics provided by the Estimated Marginal Means option are the unweighted means.

Table 6. Estimated Marginal Means for the B Main Effect
Independent Variable B   Mean    Std. Error
1                        3.30    .356
2                        3.40    .285
3                        6.42    .325
4                        8.79    .341

 

Table 7. Estimated Marginal Means for the A*B Interaction
Independent Variable A   Independent Variable B   Mean     Std. Error
1                        1                        5.00     .602
                         2                        3.80     .381
                         3                        7.33     .491
                         4                        7.57     .322
2                        1                        1.60     .381
                         2                        3.00     .426
                         3                        5.50     .426
                         4                        10.00    .602

The estimated marginal means for the highest order interaction are the same as the descriptive means for the highest order interaction (cf. Table 3).

The highest order interaction means are not computed from other means, so there is no difference between weighted and unweighted means.

Note: This distinction between weighted and unweighted means only applies when the design is unbalanced, that is, when the cell ns are unequal. When the design is balanced, that is, when the cell ns are equal and there are no missing cells, there is no difference between the means displayed by the Descriptives output and those displayed by the Estimated Marginal Means output.
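
A quick way to convince yourself of this note is to recompute the B1 marginal mean with equal cell ns. In the Python sketch below the cell means come from Table 1, but the equal ns (4 and 4) are hypothetical and are used only to show that the weights cancel out in a balanced design.

```python
# Cell means for column B1 from Table 1 (cells A1B1 and A2B1).
cell_means_b1 = [5.00, 1.60]


def weighted_mean(means, ns):
    """Cell means weighted by their ns."""
    return sum(n * m for n, m in zip(ns, means)) / sum(ns)


def unweighted_mean(means):
    """Plain average of the cell means."""
    return sum(means) / len(means)


observed_ns = [2, 5]   # the unequal ns actually observed (Table 1)
balanced_ns = [4, 4]   # HYPOTHETICAL equal ns, for illustration only

unbalanced_result = weighted_mean(cell_means_b1, observed_ns)
balanced_result = weighted_mean(cell_means_b1, balanced_ns)
plain_average = unweighted_mean(cell_means_b1)

print(f"Unequal ns: weighted = {unbalanced_result:.2f}, unweighted = {plain_average:.2f}")
print(f"Equal ns:   weighted = {balanced_result:.2f}, unweighted = {plain_average:.2f}")
```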

In summary, the answer to the question posed in this section, "Which Means are Displayed by GLM?," is: both weighted and unweighted means are displayed. The Descriptives option displays the weighted means and Estimated Marginal Means option displays the unweighted means.



5. Sums of Squares Types in GLM

The next question is, so what? Why is this emphasis on unequal ns and weighted and unweighted means important? One answer is that if you have an unequal n design, then you need to be careful about the method you choose for computing the sums of squares and about which means you report.

GLM provides four different methods for computing sums of squares, Type I, Type II, Type III, and Type IV. SPSS Base 9.0 defines the four methods as follows (see pp. 264-265).

Type I (hierarchical decomposition). Each term is adjusted only for the terms that precede it on the DESIGN subcommand. If the design is balanced (if there are equal ns in each cell and there are no missing cells) then the sums of squares in the model add up to the total sums of squares.

Type II. Calculates the sums of squares of an effect in the model adjusted for all other "appropriate" effects where an appropriate effect is an effect that does not contain the effect being examined. For example, in a three way ANOVA, A x B x C, the main effect of A would be adjusted by the B and C main effects and by the B by C interaction.

Type III (the default). Calculates the sum of squares of an effect adjusted for all other effects that do not contain it, and orthogonal to any effects that contain it. "The Type III sums of squares have one major advantage--they are invariant with respect to the cell frequencies as long as the general form of estimability remains constant. Hence, this type of sums of squares is often used for an unbalanced model with no missing cells" (emphasis added, p. 265).

Type IV. Designed for the situation in which there are missing cells.

The results of 5 different analyses using different sums of squares types and different orders of the variables on the design subcommand are shown in Table 8. The order in which the variables are entered yields very different results in this unequal n design when SS Type I is used. The outcome depends upon which variable you happened to have entered first when you defined the factors and/or the model. The order in which the variables are entered into the design does not have an effect on either Type II or Type III sums of squares types.

Table 8. ANOVA Results for an Unequal N Design
Using Different SS Types and Different Orders of the Variables

                                                    F values
SS Type          Order of Variables in the Design   A       B       A x B
Type I           A, B, A x B                        43.32   61.88   12.74
                 B, A, B x A                        7.18    73.93   12.74
                 A x B, A, B                        0.00    0.00    38.17
Type II (a)      A, B, A x B                        7.18    61.88   12.74
Type III (a, b)  A, B, A x B                        7.55    64.00   12.74

(a) The order of the variables in the design subcommand does not influence the reported F values.
(b) Type III is the default sums of squares type in GLM.

Note that SS Type III is the preferred sums of squares type if you have unequal ns. Furthermore, the hypotheses tested with Type III sums of squares are hypotheses about the unweighted means. So, you should report unweighted means rather than weighted means when you have an unequal n design. That is, you should report the means displayed by the Estimated Marginal Means option.  You should not report the means displayed by the Descriptive Statistics option.
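
The link between the sums of squares type and the kind of mean being tested can be checked roughly from the summary statistics in these notes. The Python sketch below is an illustration under stated assumptions, not a reproduction of what GLM does internally: MS error is not reported in these notes, so the value 0.725 is back-calculated from the standard errors in Table 7 (std. error squared times the cell n) and is approximate. With that value, a factor entered first under Type I gets an F built from the weighted marginal means, while the Type III F for the two-level factor A equals the squared difference between the unweighted means divided by its error variance.

```python
# Cell sums and cell ns from Table 1, indexed as [level of A][level of B].
sums = [[10, 19, 22, 53], [8, 12, 22, 20]]
ns = [[2, 5, 3, 7], [5, 4, 4, 2]]

# ASSUMPTION: approximate MS error, back-calculated from Table 7
# (e.g. .602 squared times n = 2); it is not reported in the notes.
MS_ERROR = 0.725

grand_sum = sum(sum(row) for row in sums)
grand_n = sum(sum(row) for row in ns)
correction = grand_sum ** 2 / grand_n


def first_entered_f(level_sums, level_ns):
    """F for a factor entered first under Type I SS (based on weighted means)."""
    ss = sum(s ** 2 / n for s, n in zip(level_sums, level_ns)) - correction
    df = len(level_sums) - 1
    return (ss / df) / MS_ERROR


a_sums, a_ns = [sum(r) for r in sums], [sum(r) for r in ns]
b_sums, b_ns = [sum(c) for c in zip(*sums)], [sum(c) for c in zip(*ns)]
f_a_first = first_entered_f(a_sums, a_ns)
f_b_first = first_entered_f(b_sums, b_ns)

# Type III test of A (two levels): contrast between the unweighted means.
n_cols = len(sums[0])
unweighted_a = [sum(s / n for s, n in zip(row_s, row_n)) / n_cols
                for row_s, row_n in zip(sums, ns)]
# Each unweighted mean averages n_cols cell means, so its variance is
# MS_ERROR * sum(1 / n) / n_cols**2; the two rows are independent.
var_diff = MS_ERROR * sum(1 / n for row in ns for n in row) / n_cols ** 2
f_a_type3 = (unweighted_a[0] - unweighted_a[1]) ** 2 / var_diff

print(f"A entered first (Type I): F = {f_a_first:.2f}")
print(f"B entered first (Type I): F = {f_b_first:.2f}")
print(f"A, Type III:              F = {f_a_type3:.2f}")
```

Because MS error is approximate, the results should only be expected to land close to the corresponding Table 8 entries (43.32, 73.93, and 7.55), not to match them exactly.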



6. Post Hoc Tests for Unequal ns Using GLM General Factorial

There are two options in GLM General Factorial for running post hoc tests on the main effects.  One way to run post hoc tests is to use the Post Hoc... dialog box to set up the post hoc tests. The other way is found in the Options... dialog box as a check box in the Estimated Marginal Means section called Compare Main Effects. Here is a comparison of those two possibilities.

Table 10. A Comparison of Pairwise Tests Provided in GLM General Factorial
                                  from "Post Hoc..."             from "Compare Main Effects"
Means used in the analysis        "observed" means               estimated marginal means
                                  (weighted means)               (unweighted means)
Post hoc tests for main effects?  Yes                            Yes
Post hoc test options             numerous options: LSD, Tukey,  3 options: LSD (uncorrected),
                                  Scheffe, Duncan, etc.          Bonferroni, and Sidak
Post hoc tests for interactions?  No                             No
Simple main effects available?    No                             Yes

Recall that the hypotheses tested with Type III sums of squares are hypotheses about unweighted means. The tests provided in the Post Hoc... dialog box are inappropriate because they use weighted means rather than unweighted means.

WARNING!

Do not use the tests provided by the Post Hoc... dialog box if you have more than one factor in your design and if you have unequal ns.

If you have only one factor in your design, then Post Hoc... is appropriate for both equal and unequal n designs. 

The tests provided by the Compare Main Effects option correctly use the estimated marginal means (the unweighted means in this case).  You can run uncorrected tests on the main effects (LSD), or you can correct the confidence intervals by the Bonferroni method or the Sidak method.
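
To get a feel for how much the two corrections change an uncorrected (LSD) result, the Python sketch below applies the usual textbook forms of the Bonferroni and Sidak adjustments to a set of p values. Two assumptions are worth flagging: the formulas are the standard textbook versions rather than something quoted from the SPSS documentation, and the p values are hypothetical, not taken from any output in these notes.

```python
from itertools import combinations

# All pairwise comparisons among the four levels of B.
levels_b = [1, 2, 3, 4]
pairs = list(combinations(levels_b, 2))
k = len(pairs)  # number of comparisons being corrected for

# HYPOTHETICAL uncorrected (LSD) p values, one per pair.
lsd_p = [0.80, 0.001, 0.0005, 0.002, 0.0008, 0.01]


def bonferroni(p, k):
    """Textbook Bonferroni adjustment: multiply p by k, capped at 1."""
    return min(1.0, p * k)


def sidak(p, k):
    """Textbook Sidak adjustment: 1 - (1 - p) ** k."""
    return 1.0 - (1.0 - p) ** k


adjusted = [(pair, p, bonferroni(p, k), sidak(p, k)) for pair, p in zip(pairs, lsd_p)]
for pair, p, p_bonf, p_sidak in adjusted:
    print(f"B{pair[0]} vs B{pair[1]}: LSD p = {p:.4f}, "
          f"Bonferroni p = {p_bonf:.4f}, Sidak p = {p_sidak:.4f}")
```

The Sidak-adjusted value is never larger than the Bonferroni-adjusted value, which is why Sidak is the slightly less conservative of the two corrections.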

To run the main effect test go to the Estimated Marginal Means section of the Options... dialog box and move the factor name to the Display Means for: window. Check the Compare main effects box, select the correction for the confidence intervals, and run the analysis.



7. Simple Main Effects for Unequal n Designs

Simple main effects can be run using the /EMMEANS syntax as described in the Simple Main Effects notes.

If you want to analyze the simple main effects of a at each level of b the syntax is

/EMMEANS TABLES (a*b) COMPARE (a) ADJ(SIDAK)

If you want to analyze the simple main effects of b at each level of a the syntax is

/EMMEANS TABLES (a*b) COMPARE (b) ADJ(SIDAK)

To help in the interpretation you should also display the profile plots or plots with error bars for the significant effects.



8. Running the Tukey HSD Test by Hand

If you want to run post-hoc tests on main effects or interactions by hand using the Tukey HSD procedure you should follow these guidelines.

1. Use the estimated marginal means (the unweighted means).

2. Use the harmonic n, Nh, when calculating the critical difference for the Tukey test. The rationale for using the harmonic n is that the Type III sums of squares is based on the unweighted mean. By using the Type III sums of squares we have decided that the random differences in cell ns are unimportant. So we should not reinsert those ns back into the analysis when we run our post hoc tests.

For equal n designs the critical difference, ψ(HSD), is

ψ(HSD) = q(α,p,v) * sqrt(MSerror / n)

where MSerror is the MSerror from the analysis of variance; n is the number of cases in each cell; and q(α,p,v) is obtained from the Percentage Points for the Studentized Range Statistic table at a given significance level, α, with p means and v degrees of freedom for the MS error term.

If you have an unequal n design then substitute the harmonic n, Nh, for n in the critical difference formula.

The harmonic n is found by the following formula

Nh = p / (1/N(1) + 1/N(2) + ... + 1/N(p)) 

where p is the number of cells in the ONEWAY analysis.

For the data used in this set of notes, the harmonic n for testing the interaction effect would be

Nh = 8 / (1/2 + 1/5 + 1/3 + 1/7 + 1/5 + 1/4 + 1/4 + 1/2 )
     = 8 / (.5 + .2 + .333 + .143 + .2 + .25 + .25 + .5)
     = 8 / 2.376
     = 3.37

The harmonic n for testing the B main effect would be

Nh = 4 / (1/7 + 1/9 + 1/7 + 1/9 )
      = 4 / (.143 + .111 + .143 + .111)
      = 4 / .508
      = 7.88

The harmonic n for testing the A main effect would be

Nh = 2 / (1/17 + 1/15 )
      = 2 / (.0588 + .0667)
      = 2 / .1255
      = 15.94
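
The harmonic n calculations above, and the critical difference they feed into, can be sketched in Python as follows. The critical difference is the standard Tukey HSD form, q * sqrt(MSerror / n), with Nh substituted for n. Two inputs are assumptions rather than values from these notes: MS error (0.725) is back-calculated from the standard errors in Table 7, and q = 3.90 is the tabled studentized range value I would expect for alpha = .05, p = 4 means, and 24 error degrees of freedom; check it against your own table before relying on it.

```python
import math


def harmonic_n(ns):
    """Harmonic n: number of means divided by the sum of the reciprocal ns."""
    return len(ns) / sum(1 / n for n in ns)


def hsd_critical_difference(q, ms_error, n):
    """Tukey HSD critical difference; pass the harmonic n for unequal n designs."""
    return q * math.sqrt(ms_error / n)


# Harmonic ns for the three effects, using the ns from Table 1.
nh_interaction = harmonic_n([2, 5, 3, 7, 5, 4, 4, 2])
nh_b = harmonic_n([7, 9, 7, 9])
nh_a = harmonic_n([17, 15])

# ASSUMPTIONS (not reported in the notes): approximate MS error from Table 7,
# and a tabled q for alpha = .05, p = 4 means, v = 32 - 8 = 24 error df.
MS_ERROR = 0.725
Q_TABLE = 3.90

critical_b = hsd_critical_difference(Q_TABLE, MS_ERROR, nh_b)

print(f"Nh (A x B interaction) = {nh_interaction:.2f}")
print(f"Nh (B main effect)     = {nh_b:.2f}")
print(f"Nh (A main effect)     = {nh_a:.2f}")
print(f"Critical difference for the B main effect = {critical_b:.2f}")
```

Any pair of estimated marginal means for B that differs by more than the printed critical difference would be declared significantly different under these assumptions.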



9.  Interaction Effects: Creating a Oneway Effect from an Interaction

In an earlier set of notes (see GLM: 2way) we proposed that all pairwise comparisons among the means in an interaction effect could be made by first creating a main effect variable from that interaction and then running the GLM Post Hoc... option on that main effect variable.  Is that a reasonable procedure when you have an unequal n design?

That procedure will work for the highest order interaction because the means for the highest order interaction are the observed means.  The significant 2-way interaction found in this set of notes could be analyzed using that procedure.

When you use a set of IF statements to create a oneway variable for a lower order interaction (e.g., a 2-way interaction in a design with three factors), the means for that oneway variable will be the weighted means rather than unweighted means.  Recall that in an unequal n design GLM uses the unweighted means when computing the sums of squares.  Hence, you would be testing the wrong means if you used the oneway variable to run the Post Hoc tests for a lower-order interaction.  You would be testing for differences between the weighted means rather than differences between the unweighted means.

The appropriate post hoc procedures for lower order interactions in an unbalanced design include running the Tukey test by hand, or running simple main effects analyses.

If you have an equal n design, the observed means are the same as the estimated marginal means.  Therefore you could create a oneway variable for any of the interaction terms and use that variable to run all pairwise comparisons using the GLM Post Hoc... option. 



© Lee A. Becker, 1997-1999 - revised 11/30/99