| Let's reconsider our two-group (treatment vs. control) quasi-experimental design. Hypothetical scatterplots of the scores for the two groups are shown in Figure 1. The pretest and posttest means of each group are shown by lines that drop to each axis. The pretest scores are not equivalent for the two groups: the mean pretest score of the treatment group, MX.T, is higher than the mean pretest score of the control group, MX.C. Although the posttest mean for the treatment group, MY.T, is higher than the posttest mean for the control group, MY.C, we don't know to what extent that difference is due to the difference in the pretest scores. The analysis of covariance (ANCOVA) asks the question: if you hold the pretest scores constant, is there a significant difference between the posttest scores for the two groups? |
|
The question is answered by looking at the regression lines for the two groups and asking whether the regression line for one group is significantly elevated over the regression line of the other group. ANCOVA assumes that the slopes of the regression lines are equivalent. This assumption must be tested prior to running the ANCOVA. If the slopes are equivalent, then the test of whether one regression line is elevated over the other can be made by testing whether the difference between the intercepts (aT and aC) of the two regression lines is significantly different from zero. See the notes on correlation and regression for definitions of these terms.
In the example shown in Figure 1 the intercept for the treatment group, aT, is higher than the intercept for the control group, aC. Thus, after controlling for the pretest differences there was still a significant difference between the posttest scores.
It is important to notice the relative size of the error terms for the ANOVA of the posttest scores and the ANCOVA of the posttest scores using the pretest scores as the covariate. The ANOVA sums of squares for the error term is found as the sum of the squared distances of each score, Y, from the group mean for that score, MY.T or MY.C.
| ANOVA $SS_{error} = \sum (Y_T - M_{Y.T})^2 + \sum (Y_C - M_{Y.C})^2$ |
The ANCOVA sums of squares for the error term is found as the sum of the squared distances of each score, Y, from its predicted score, Y'.T or Y'.C.
| ANCOVA $SS_{error} = \sum (Y_T - Y'_{.T})^2 + \sum (Y_C - Y'_{.C})^2$ |
If the regression between the pretest and posttest scores is significant, then the error term for the ANCOVA will always be smaller than the error term for the ANOVA. The larger the correlation between the covariate (pretest scores in this case) and the dependent variable (posttest scores in this case) the smaller the ANCOVA SSerror relative to the ANOVA SSerror.
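The comparison of the two error terms can be sketched in a few lines of Python. This is only an illustration: the scores are made up (they are not the data behind the figures), the function names are hypothetical, and the predicted scores are assumed to come from a common within-group (pooled) slope.

```python
# Hypothetical scores for illustration only; not the data behind the figures.
pre = {"treatment": [52, 60, 47, 58, 55], "control": [44, 49, 40, 51, 46]}
post = {"treatment": [38, 45, 33, 44, 40], "control": [43, 48, 40, 50, 44]}


def mean(values):
    return sum(values) / len(values)


def anova_ss_error(post):
    """Sum of squared distances of each score from its own group mean."""
    total = 0.0
    for scores in post.values():
        m = mean(scores)
        total += sum((y - m) ** 2 for y in scores)
    return total


def ancova_ss_error(pre, post):
    """Sum of squared distances of each score from its predicted score.

    Assumption: predictions use one pooled within-group slope, with each
    group's line passing through that group's own (pretest, posttest) means.
    """
    cross = ss_x = 0.0
    for g in post:
        mx, my = mean(pre[g]), mean(post[g])
        cross += sum((x - mx) * (y - my) for x, y in zip(pre[g], post[g]))
        ss_x += sum((x - mx) ** 2 for x in pre[g])
    slope = cross / ss_x

    total = 0.0
    for g in post:
        mx, my = mean(pre[g]), mean(post[g])
        for x, y in zip(pre[g], post[g]):
            predicted = my + slope * (x - mx)
            total += (y - predicted) ** 2
    return total


ss_anova = anova_ss_error(post)
ss_ancova = ancova_ss_error(pre, post)
print("ANOVA SSerror:", round(ss_anova, 2))
print("ANCOVA SSerror:", round(ss_ancova, 2))
```

Because the made-up pretest and posttest scores are strongly correlated within each group, the ANCOVA error term comes out far smaller than the ANOVA error term.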
The data in Figure 1 represent the case where the pretest scores were higher for the treatment group than for the control group. After using ANCOVA to control for these pretest differences there was still a significant difference favoring the treatment group. The intercept of the treatment group was significantly greater than the intercept of the control group.
The data in Figures 2 and 3 represent cases where the mean pretest scores are lower for the experimental group than for the control group. In each instance the mean posttest scores are lower for the experimental group than for the control group. This might represent the case where the treatment was intended to lower, say, anxiety, and high scores on the outcome measure indicate high anxiety.
| In Figure 2 the ANCOVA indicates that there is no difference in posttest scores between the two groups. The regression lines overlap each other and the intercepts of the two regression lines, aT and aC, are equal. |
|
| In Figure 3 the ANCOVA indicates that the posttest differences between the two groups are still significant. The Y intercept for the treatment group, aT, is significantly lower than the Y intercept for the control group, aC. |
|
| In Figure 4 the ANCOVA indicates that the posttest differences between the two groups are significant, but that the treatment group's scores are higher than those of the control group. The intercept for the experimental group is significantly higher than the intercept for the control group. Notice the unusual result in this instance: the overall mean of the posttest scores was lower for the treatment group than for the control group, but the ANCOVA indicates that the experimental group scored higher than the control group once pretest scores are held constant. |
|
| The data in Figure 5 are from a hypothetical quasi-experiment where intact groups were either given immediate psychotherapy (treatment group) or no psychotherapy (control group) between the pretest and posttest observations. The dependent variable is the trait subscale of Spielberger's State-Trait Anxiety Inventory (STAI-T). A low score on the STAI-T indicates low anxiety. The green horizontal line is the posttest mean of the control group. The red horizontal line is the posttest mean of the treatment group. The posttest mean of the treatment group (M = 38.42, SD = 13.98) is not significantly different from the posttest mean of the control group (M = 44.23, SD = 12.43), F(1, 77) = 3.80, p = .055. |
|
There are significant pretest differences. The pretest mean of the treatment group (M = 54.35, SD = 12.27) is greater than the pretest mean of the control group (M = 45.87, SD = 12.38), F(1, 77) = 9.34, p = .003.
| The ANCOVA model assumes that the slopes of the regression lines are the same for each group. That is, the regression lines should be parallel. Figure 5 shows the regression lines for each group. Visual inspection of the regression lines suggests that they are, indeed, parallel. The regression equations are $Y'_{Tx} = -4.93 + .80X$ for the treatment group and $Y'_{Ctl} = 8.01 + .79X$ for the control group. The slopes of the two regression lines are nearly identical. |
|
| Figures 1 through 4 showed idealized data where the regression lines of the control and treatment groups were parallel. An example of unequal slopes is shown in Figure 7. When the slopes are parallel, you can answer the question of whether there are differences between the two groups by estimating the predicted posttest scores for the two groups at any given pretest score; the answer will be the same whichever pretest score you pick. When the slopes are not parallel, the difference between the predicted posttest scores for the two groups depends on which pretest score is selected. In Figure 7, if you choose the Y intercepts or some other low pretest score you might conclude there are no differences between the conditions, whereas if you choose a high pretest score you might conclude that there are significant differences between the groups. |
|
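Why the choice of pretest score matters only when the slopes differ can be seen with a small sketch. The intercepts and slopes below are hypothetical values chosen for illustration (they are not read from any figure), and the function names are made up.

```python
def predicted(intercept, slope, pretest):
    """Predicted posttest score on a regression line at a given pretest score."""
    return intercept + slope * pretest


def group_difference(line_a, line_b, pretest):
    """Predicted posttest difference (line_a minus line_b) at a pretest score."""
    return predicted(*line_a, pretest) - predicted(*line_b, pretest)


# (intercept, slope) pairs; hypothetical values for illustration only.
parallel_treatment, parallel_control = (5.0, 0.8), (15.0, 0.8)
unequal_treatment, unequal_control = (10.0, 1.0), (10.0, 0.5)

for pretest in (0, 20, 60):
    same = group_difference(parallel_treatment, parallel_control, pretest)
    differs = group_difference(unequal_treatment, unequal_control, pretest)
    print(f"pretest={pretest:>2}  parallel: {same:6.1f}  unequal: {differs:6.1f}")
```

With parallel lines the difference is the same at every pretest score, so it does not matter where you evaluate it. With unequal slopes the difference is zero at the intercept and grows as the pretest score increases, so no single number describes the group difference.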
In actual data it is unlikely that the slopes are going to be exactly equal. To solve this problem an ANCOVA finds a pooled slope. A pooled slope is similar to an average of the two slopes. ANCOVA then uses a least squares procedure to make the best fit of a line that has the pooled slope. It does this for each group in the study. If the slopes are not too different from each other then the pooled slope procedure is reasonable. But if the slopes for the different conditions are very different from each other then the pooled slope will be a poor estimate of the data for each of the conditions.
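A minimal sketch of the pooled slope, assuming the usual least-squares form: the within-group cross-products summed over groups, divided by the within-group sums of squares of the covariate summed over groups. The data and function names are hypothetical.

```python
def within_group_sums(xs, ys):
    """Return (sum of squares of x, sum of cross-products) about the group means."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    ss_x = sum((x - mx) ** 2 for x in xs)
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return ss_x, cross


def group_slope(xs, ys):
    """Least-squares slope for a single group."""
    ss_x, cross = within_group_sums(xs, ys)
    return cross / ss_x


def pooled_slope(groups):
    """Common slope across groups; groups is a list of (pretest, posttest) lists."""
    sums = [within_group_sums(xs, ys) for xs, ys in groups]
    return sum(cross for _, cross in sums) / sum(ss_x for ss_x, _ in sums)


# Hypothetical scores for illustration only.
treatment = ([52, 60, 47, 58, 55], [38, 45, 33, 44, 40])
control = ([44, 49, 40, 51, 46], [43, 48, 40, 50, 44])

b_treatment = group_slope(*treatment)
b_control = group_slope(*control)
b_pooled = pooled_slope([treatment, control])
print(round(b_treatment, 3), round(b_control, 3), round(b_pooled, 3))
```

Computed this way, the pooled slope is a weighted average of the separate group slopes (each weighted by that group's covariate sum of squares), so it always falls between the smallest and largest group slope.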
Eyeballing provides an expectation about whether or not the slopes are equal. In many cases the slopes may not be exactly equal. Whenever you run an ANCOVA you should run a statistical test for equality of the slopes of the regression lines.
| The SPSS syntax used for testing the equality of regression slopes is shown in Table 1. The design is set up as an analysis of covariance with the posttest (tanxpost) as the dependent variable, assignment to group (BY treatgrp) as the independent variable, and the pretest (WITH tanxpre) as the covariate. The key is the /DESIGN subcommand. The tanxpre term is the overall regression between the covariate (tanxpre) and the dependent variable (tanxpost). The interaction between the covariate and the treatment conditions (tanxpre*treatgrp) is the test of interest. It tests whether the slope of the regression line is the same for both treatment conditions. This interaction should be nonsignificant. If the interaction is significant then the slopes for the two groups are not the same and the ANCOVA assumption has been violated. This design can be selected from within the Model window of the GLM procedure. |
|
| The analysis of variance output is shown in Table 2. The only effect
of interest is the interaction between the independent variable (treatgrp) and the
covariate (tanxpre). The interaction is shown in the source row labeled TREATGR*TANXPRE. As we expected from looking at the regression lines, the interaction is not significant, F(1, 74) = 0.001, p > .05. The slope in the treatment group is not significantly different from the slope in the control group. This allows us to go ahead and run the ANCOVA. |
|
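The logic of the slope test can be sketched outside SPSS as a comparison of two models: one that fits a separate slope in each group and one that forces a common slope. This is an illustrative sketch on made-up data, not the SPSS computation itself; the function names are hypothetical, and the F ratio would still need to be compared against an F table or a statistics package to obtain a p value.

```python
def within_sums(xs, ys):
    """Return (SS of x, SS of y, cross-products) about the group means."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return ss_x, ss_y, cross


def slope_homogeneity_f(groups):
    """F ratio for the covariate-by-group interaction (equality of slopes).

    groups is a list of (pretest, posttest) lists, one pair per group.
    Returns (F, numerator df, denominator df).
    """
    sums = [within_sums(xs, ys) for xs, ys in groups]
    n_total = sum(len(xs) for xs, _ in groups)
    k = len(groups)

    # Error when every group has its own slope and intercept.
    sse_separate = sum(ss_y - cross ** 2 / ss_x for ss_x, ss_y, cross in sums)

    # Error when all groups share one pooled slope (separate intercepts).
    ss_x_w = sum(s[0] for s in sums)
    ss_y_w = sum(s[1] for s in sums)
    cross_w = sum(s[2] for s in sums)
    sse_common = ss_y_w - cross_w ** 2 / ss_x_w

    df_num = k - 1
    df_den = n_total - 2 * k
    f_ratio = ((sse_common - sse_separate) / df_num) / (sse_separate / df_den)
    return f_ratio, df_num, df_den


# Hypothetical scores for illustration only.
treatment = ([52, 60, 47, 58, 55], [38, 45, 33, 44, 40])
control = ([44, 49, 40, 51, 46], [43, 48, 40, 50, 44])

f_ratio, df_num, df_den = slope_homogeneity_f([treatment, control])
print(f"F({df_num}, {df_den}) = {f_ratio:.3f}")
```

With two groups the denominator degrees of freedom are N - 4, which agrees with the F(1, 74) reported above for 78 cases.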
The SPSS syntax for running a simple analysis of covariance is shown in Table 3.
The General Linear Model (GLM) procedure is used to run an ANCOVA. The second line specifies the dependent variable (tanxpost), the keyword BY, the name of the independent variable (treatgrp), the keyword WITH, and the name of the covariate (tanxpre). The /METHOD = SSTYPE(3) subcommand specifies the sums of squares type for the ANOVA. Type 3 is the default. The /INTERCEPT = INCLUDE subcommand is also the default. The intercept should be included whenever the overall intercept is not equal to zero. The /EMMEANS = TABLES(treatgrp) subcommand will print the adjusted means for the treatment group main effect. The main effect means are adjusted for the covariate. These are the means that should be reported when you run an ANCOVA. EMMEANS stands for Estimated Marginal Means. Estimated marginal means can be selected from the Options dialog box. The /CRITERIA = ALPHA(.05) subcommand sets the alpha level to .05. This is the default. The /PRINT = DESCRIPTIVE PARAMETER subcommand will print the unadjusted means and a test of homogeneity of variance (descriptives) and estimates of the pooled slope and intercept (parameter). |
|
The ANCOVA source table is shown in Table 4.
| The Corrected Model is not generally of interest. The intercept effect is a test of whether the overall intercept is different from zero. This is not generally of interest either. In this example the intercept test is not significant. The TANXPRE effect is a test of whether the pooled regression slope differs from zero. In this example the regression effect is significant, F(1, 75) = 91.56, p < .0005, so the pooled slope is not equal to zero. When the regression is significant you gain power, because variance that would have been included in the error term in an ANOVA is now included in the regression term. The TREATGRP effect is the main effect of treatment condition after removing the effects of the covariate. It is the effect due to treatment after holding constant any pretreatment differences between the two treatment groups. In this example it is significant, F(1, 75) = 35.28, p < .0005. |
|
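The treatment effect in the source table can be thought of as a model comparison: how much worse does the fit become if both groups are forced onto a single regression line instead of two parallel lines with separate intercepts? The sketch below illustrates that idea on made-up data. It is not the SPSS algorithm, and the function names are hypothetical.

```python
def sums_about_mean(xs, ys):
    """Return (SS of x, SS of y, cross-products) about the means of xs and ys."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return ss_x, ss_y, cross


def ancova_treatment_f(groups):
    """F ratio for the group effect adjusted for the covariate.

    groups is a list of (pretest, posttest) lists, one pair per group.
    Returns (F, numerator df, denominator df).
    """
    k = len(groups)
    n_total = sum(len(xs) for xs, _ in groups)

    # Full model: pooled slope, separate intercept for each group.
    within = [sums_about_mean(xs, ys) for xs, ys in groups]
    ss_x_w = sum(s[0] for s in within)
    ss_y_w = sum(s[1] for s in within)
    cross_w = sum(s[2] for s in within)
    sse_full = ss_y_w - cross_w ** 2 / ss_x_w

    # Reduced model: one regression line for all cases, ignoring group.
    all_x = [x for xs, _ in groups for x in xs]
    all_y = [y for _, ys in groups for y in ys]
    ss_x_t, ss_y_t, cross_t = sums_about_mean(all_x, all_y)
    sse_reduced = ss_y_t - cross_t ** 2 / ss_x_t

    df_num = k - 1
    df_den = n_total - k - 1
    f_ratio = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_ratio, df_num, df_den


# Hypothetical scores for illustration only.
treatment = ([52, 60, 47, 58, 55], [38, 45, 33, 44, 40])
control = ([44, 49, 40, 51, 46], [43, 48, 40, 50, 44])

f_ratio, df_num, df_den = ancova_treatment_f([treatment, control])
print(f"F({df_num}, {df_den}) = {f_ratio:.2f}")
```

The denominator degrees of freedom are N minus the number of groups minus one for the covariate, which gives 75 for 78 cases and two groups, matching the source table discussed above.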
| Condition | Mean | Std. Deviation | N |
|---|---|---|---|
| Treatment | 38.42 | 13.98 | 40 |
| Control | 44.61 | 12.37 | 38 |
| Total | 41.44 | 13.50 | 78 |
The unadjusted or raw means are shown in Table 5. These Descriptive Statistics are printed when you ask for descriptive statistics.
/PRINT = DESCRIPTIVE
| Condition | Mean | Std. Error |
|---|---|---|
| Treatment | 35.262 | 1.450 |
| Control | 47.935 | 1.490 |
The adjusted means are shown in Table 6. The effects of the independent variable in an ANCOVA are the effects attributed to the differences in the adjusted means, so you should report the adjusted means rather than the raw means when you run an ANCOVA. The adjusted mean for the treatment condition, M = 35.26, was lower than the adjusted mean for the control condition, M = 47.94, F(1, 75) = 35.28, p < .0005.
Adjusted means are computed according to the following formula:
| $\text{Adj Mean} = Y_i - B(X_i - X_{..})$ |
where Yi is the unadjusted dependent variable mean for group i (the posttest mean in this example), B is the pooled slope between the covariate and the dependent variable, Xi is the unadjusted covariate mean for group i (the pretest mean in this example) and X.. is the overall grand mean of the covariate (the overall mean of the pretest in this example).
| The pooled slope is given when you ask for parameter estimates, /PRINT = PARAMETER. Table 7 shows an abbreviated parameter estimate table. The pooled Y intercept is 7.885 and the pooled slope, B, is .795. If we wanted to compute the adjusted means by hand we would need the pretest means for each group. I haven't been able to find a way to print these means within the ANCOVA model. I computed the pretest means using a repeated measures ANOVA procedure with the pretest and posttest as the repeated measures. The pretest mean for the treatment group was 54.35, and the pretest mean for the control group was 46.18. The overall pretest mean was 50.37. |
|
The computation of the adjusted mean for the treatment group is:

$$
\begin{aligned}
\text{Adj Mean} &= Y_i - B(X_i - X_{..}) \\
&= 38.42 - .795(54.35 - 50.37) \\
&= 38.42 - 3.16 \\
&= 35.26
\end{aligned}
$$
The computation of the adjusted mean for the control group is:

$$
\begin{aligned}
\text{Adj Mean} &= Y_i - B(X_i - X_{..}) \\
&= 44.61 - .795(46.18 - 50.37) \\
&= 44.61 - (-3.33) \\
&= 47.94
\end{aligned}
$$
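The same hand calculation can be checked with a short Python sketch. The function name and its parameter names are hypothetical; the numbers are the posttest means, pretest means, and pooled slope reported above.

```python
def adjusted_mean(post_mean, pooled_slope, pre_mean, grand_pre_mean):
    """Adj Mean = Yi - B(Xi - X..), the formula given in the text."""
    return post_mean - pooled_slope * (pre_mean - grand_pre_mean)


POOLED_SLOPE = 0.795      # B, from the parameter estimates
GRAND_PRE_MEAN = 50.37    # X.., overall pretest mean

adj_treatment = adjusted_mean(38.42, POOLED_SLOPE, 54.35, GRAND_PRE_MEAN)
adj_control = adjusted_mean(44.61, POOLED_SLOPE, 46.18, GRAND_PRE_MEAN)

print(round(adj_treatment, 2))  # 35.26
print(round(adj_control, 2))    # 47.94
```

The treatment group started above the overall pretest mean, so its posttest mean is adjusted downward. The control group started below it, so its posttest mean is adjusted upward.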
What do we mean when we say that an ANCOVA holds constant differences in pretest scores (or differences in any covariate scores)? Well, if there are pretest differences between the groups then we can hold constant those differences by looking at the predicted posttest scores at any particular pretest score. If we know there are differences in pretest scores between the two groups, we can hold the pretest scores constant by looking at the predicted posttest scores at a particular pretest score of, say, 15, or at a pretest score of, say, 60.
As discussed earlier, you could look at group differences in posttest scores at any particular pretest score. The intercept (pretest score = 0) is convenient because it provides a common reference point across all types of designs and measures. But the intercept is not psychologically meaningful because it is hard to think of mean differences in the posttest scores when a person has a pretest score of zero. For example, suppose you use an intelligence score as a covariate for a study where learning was the dependent variable. How would you interpret learning scores based on intelligence scores of zero?
| The adjusted means are the predicted posttest scores at the overall mean of the pretest scores (X..). That is, the adjusted means are the posttest means you would have expected if all of the groups in the study had the same pretest mean. If the pretest means for all the groups were actually the same, then the pretest mean for each group would be equal to the overall pretest mean. The formula for the adjusted means takes the obtained posttest mean for a group, Yi, and makes an adjustment to it that is based on a proportion of the distance between that group's pretest mean and the overall mean of the pretest scores, B(Xi - X..). The proportion that it uses, B, is the pooled slope of the line. |
|
If the slope happened to be 1.00 then the adjustment would be the exact distance from that group's pretest mean to the overall mean of the pretest scores. If the slope was 0.8 then the adjustment would be 80% of the distance from that group's pretest mean to the overall mean of the pretest scores.
What are the adjusted means when there are no pretest differences between the groups?
If there are no pretest differences between the groups then is it ever worthwhile to run an ANCOVA?
-03/18/99