Chapter 12 Analysis of Covariance (ANCOVA)
In various cases, researchers are interested in comparing treatments after adjusting for a numeric predictor that is associated with an experimental unit. That is often, but not necessarily, a baseline pre-treatment score on the unit. The goal is to determine whether treatment effects are present after controlling for the covariate.
When the Analysis of Covariance was first developed, the computations used were very complicated. Now, it’s just as simple as fitting a regression with one or more predictors and dummy variables for the treatment conditions. Models can have one or more factors. We will consider the case with one treatment factor and one covariate.
12.1 Additive Model
In the additive model, the slope relating the covariate to the response is assumed to be the same for each treatment. As with the 1-Way ANOVA model, we will use \(r\) as the number of treatments with \(n_i\) replicates for treatment \(i\). We will use centered \(X\) values for the covariate for ease of interpretation when describing adjusted means. We will also include \(r-1\) indicators for treatments \(1,\ldots,r-1\) assuming again that treatment effects sum to 0.
\[ Y_{ij}= \mu_{\bullet} + \tau_1 I_{ij1} + \cdots + \tau_{r-1}I_{ij,r-1} + \gamma \left(X_{ij}-\overline{X}_{\bullet\bullet}\right) +\epsilon_{ijk} \]
\[ \epsilon_{ijk} \sim NID\left(0,\sigma^2\right) \qquad \sum_{i=1}^r\tau_i=0 \quad \Rightarrow \quad \tau_r=-\sum_{i=1}^{r-1}\tau_i \] \[ I_{ij1} = \left\{ \begin{array}{r@{\quad:\quad}l} 1 & i=1 \\ 0 & i=2,\ldots,r-1 \\ -1 & i=r\\ \end{array} \right. \qquad \cdots \qquad I_{ij,r-1} = \left\{ \begin{array}{r@{\quad:\quad}l} 1 & i=r-1 \\ 0 & i=1,\ldots,r-2 \\ -1 & i=r\\ \end{array} \right. \]
To test for treatment effects, controlling for the predictor, we test \(H_0:\tau_1=\cdots = \tau_r=0\). This easily conducted as a general linear test, where the full model contains the \(r-1\) treatment indicator variables and the centered covariate. The reduced model contains only the centered covariate. The test is as follows.
\[ TS: F^*=\frac{\left[\frac{SSE(R)-SSE(F)}{df_R-df_F}\right]}{ \left[\frac{SSE(F)}{df_F}\right] } \qquad RR:F^* \geq F_{1-\alpha;df_R-df_F,df_F} \] Note that for the current case, \(df_R=n_T-2\) and \(df_F=n_T-2-(r-1)\).
12.1.1 Example - Comparison of Skin Softeners
A study compared two treatments and a control \(r=3\) for skin softening (Ma’Or, Yehuda, and Voss 1997). The active treatments were as follows, and the covariate was the pre-treatment softness score for the subjects. There were 20 subjects per treatment, so \(n_T=3(20)=60\). The response is skin roughness (lower scores are better).
- Treatment 1 - Gel Formulation
- Treatment 2 - Gel Formulation + 1% Dead Sea Concentrate
- Treatment 3 - Placebo
The R code and output are given below.
## trt_Grp pre_x post_y gel gelDS
## 1 1 121.10 76.30 1 0
## 2 1 375.40 265.10 1 0
## 3 1 138.05 82.92 1 0
## 4 1 114.85 91.75 1 0
## 5 1 263.91 175.32 1 0
## 6 1 255.88 152.54 1 0
## trt_Grp pre_x post_y gel gelDS
## 55 3 127.57 112.80 0 0
## 56 3 138.97 116.75 0 0
## 57 3 269.69 255.30 0 0
## 58 3 138.39 132.16 0 0
## 59 3 138.59 107.22 0 0
## 60 3 134.89 125.20 0 0
dsm$I1 <- ifelse(dsm$trt_Grp==1,1,ifelse(dsm$trt_Grp==3,-1,0))
dsm$I2 <- ifelse(dsm$trt_Grp==2,1,ifelse(dsm$trt_Grp==3,-1,0))
dsm$Xc <- dsm$pre_x - mean(dsm$pre_x)
dsm.mod1 <- lm(post_y ~ I1 + I2 + Xc, data=dsm)
summary(dsm.mod1)
##
## Call:
## lm(formula = post_y ~ I1 + I2 + Xc, data = dsm)
##
## Residuals:
## Min 1Q Median 3Q Max
## -31.13 -10.35 -3.93 12.29 35.34
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 137.53667 2.08924 65.831 < 2e-16 ***
## I1 -2.66382 2.95473 -0.902 0.371
## I2 -26.91735 2.95595 -9.106 1.22e-12 ***
## Xc 0.70732 0.03257 21.716 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 16.18 on 56 degrees of freedom
## Multiple R-squared: 0.9121, Adjusted R-squared: 0.9073
## F-statistic: 193.6 on 3 and 56 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Response: post_y
## Df Sum Sq Mean Sq F value Pr(>F)
## I1 1 9853 9853 37.622 9.205e-08 ***
## I2 1 18744 18744 71.572 1.361e-11 ***
## Xc 1 123501 123501 471.566 < 2.2e-16 ***
## Residuals 56 14666 262
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Call:
## lm(formula = post_y ~ Xc, data = dsm)
##
## Residuals:
## Min 1Q Median 3Q Max
## -50.220 -19.668 -4.885 21.923 63.420
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 137.53667 3.66612 37.52 <2e-16 ***
## Xc 0.69686 0.05713 12.20 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 28.4 on 58 degrees of freedom
## Multiple R-squared: 0.7195, Adjusted R-squared: 0.7147
## F-statistic: 148.8 on 1 and 58 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Response: post_y
## Df Sum Sq Mean Sq F value Pr(>F)
## Xc 1 119991 119991 148.79 < 2.2e-16 ***
## Residuals 58 46773 806
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Analysis of Variance Table
##
## Model 1: post_y ~ Xc
## Model 2: post_y ~ I1 + I2 + Xc
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 58 46773
## 2 56 14666 2 32107 61.297 7.888e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -150.41 -48.35 -19.85 0.00 42.88 188.51
Xc.grid <- seq(-200,200,0.1)
plot(dsm$post_y ~ dsm$Xc, pch=16, col=dsm$trt_Grp, cex=0.6,
xlab="Pre-Treatment Score", ylab="Post-Treatment Score",
xlim=c(-200,200), ylim=c(0,320), main="Skin Roughness")
lines(Xc.grid,
dsm.mod1$coef[1]+dsm.mod1$coef[2]+Xc.grid*dsm.mod1$coef[4],
col=1)
lines(Xc.grid,
dsm.mod1$coef[1]+dsm.mod1$coef[3]+Xc.grid*dsm.mod1$coef[4],
col=2)
lines(Xc.grid,
dsm.mod1$coef[1]-dsm.mod1$coef[2]-dsm.mod1$coef[3]+Xc.grid*dsm.mod1$coef[4],
col=3)
legend("topleft", c("Gel", "Gel+DeadSea","Placebo"), col=1:3, lty=1)
The \(F^*\)-statistic is 61.297 with 2 and 56 degrees of freedm and \(P<.0001\). There is strong evidence of differences among the treatments, controlling for baseline score.
\[\nabla \]
12.1.2 Adjusted means
Where the lines cross at the centered X level of 0 are the adjusted meabs. That is, they are the predicted scores for the treatments at the “average” value of the uncentered covariate. Based on this additive model, we obtain the following adjusted means.
\[ i=1,\ldots , r-1:\quad \hat{\mu}_i = \hat{\mu}_{\bullet} +\hat{\tau}_i \qquad \hat{\mu}_r=\hat{\mu}_{\bullet} -\sum_{i=1}^{r-1}\hat{\tau}_i \]
The goal is to typically compare the adjusted means, as would be done for the 1-Way ANOVA. The standard errors depend on the variance-covariance matrix of the regression coefficients, which is obtained with the vcov function from the regression model fit.
\[ i,i' < r \quad s\left\{\hat{\tau}_i-\hat{\tau}_{i'}\right\} = \sqrt{s^2\left\{\hat{\tau}_i\right\}+s^2\left\{\hat{\tau}_{i'}\right\} -2s\left\{\hat{\tau}_i,\hat{\tau}_{i'}\right\}} \]
When comparing the first \(r-1\) treatments with the \(r^{th}\), we use the follwing result.
\[ \hat{\tau}_i-\hat{\tau}_r=\hat{\tau}_i-\left(-\sum_{i=1}^{r-1}\hat{\tau}_i\right)= 2\hat{\tau}_i + \sum_{\stackrel{i'=1}{i'\ne i}}^{r-1}\hat{\tau}_{i'} \]
Then the (messy) standard errors for these differences are as follow.
\[ s\left\{\hat{\tau}_i-\hat{\tau}_r\right\} = \sqrt{4s^2\left\{\hat{\tau}_i\right\} + \sum_{\stackrel{i'=1}{i'\ne i}}^{r-1}s^2\left\{\hat{\tau}_{i'}\right\} +4\sum_{\stackrel{i'=1}{i'\ne i}}^{r-1}s\left\{\hat{\tau}_i,\hat{\tau}_{i'}\right\} +2\sum_{\stackrel{i'=1}{i'\ne i}}^{r-2}\sum_{\stackrel{i''=i'+1}{i''\ne i}}^{r-1} } \]
12.1.3 Example - Comparison of Skin Softeners
We obtain the adjusted means and their differences along with multiple comparisons with the following R code.
## trt_Grp pre_x post_y gel gelDS
## 1 1 121.10 76.30 1 0
## 2 1 375.40 265.10 1 0
## 3 1 138.05 82.92 1 0
## 4 1 114.85 91.75 1 0
## 5 1 263.91 175.32 1 0
## 6 1 255.88 152.54 1 0
## trt_Grp pre_x post_y gel gelDS
## 55 3 127.57 112.80 0 0
## 56 3 138.97 116.75 0 0
## 57 3 269.69 255.30 0 0
## 58 3 138.39 132.16 0 0
## 59 3 138.59 107.22 0 0
## 60 3 134.89 125.20 0 0
dsm$I1 <- ifelse(dsm$trt_Grp==1,1,ifelse(dsm$trt_Grp==3,-1,0))
dsm$I2 <- ifelse(dsm$trt_Grp==2,1,ifelse(dsm$trt_Grp==3,-1,0))
dsm$Xc <- dsm$pre_x - mean(dsm$pre_x)
dsm.mod1 <- lm(post_y ~ I1 + I2 + Xc, data=dsm)
summary(dsm.mod1)
##
## Call:
## lm(formula = post_y ~ I1 + I2 + Xc, data = dsm)
##
## Residuals:
## Min 1Q Median 3Q Max
## -31.13 -10.35 -3.93 12.29 35.34
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 137.53667 2.08924 65.831 < 2e-16 ***
## I1 -2.66382 2.95473 -0.902 0.371
## I2 -26.91735 2.95595 -9.106 1.22e-12 ***
## Xc 0.70732 0.03257 21.716 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 16.18 on 56 degrees of freedom
## Multiple R-squared: 0.9121, Adjusted R-squared: 0.9073
## F-statistic: 193.6 on 3 and 56 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Response: post_y
## Df Sum Sq Mean Sq F value Pr(>F)
## I1 1 9853 9853 37.622 9.205e-08 ***
## I2 1 18744 18744 71.572 1.361e-11 ***
## Xc 1 123501 123501 471.566 < 2.2e-16 ***
## Residuals 56 14666 262
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
vcov1 <- vcov(dsm.mod1)
r <- length(unique(dsm$trt_Grp))
adjmean <- rep(0,r)
for (i1 in 1:(r-1)) adjmean[i1] <- dsm.mod1$coef[1] + dsm.mod1$coef[i1+1]
adjmean[r] <- dsm.mod1$coef[1]-sum(dsm.mod1$coef[2:r])
adjmean
## [1] 134.8728 110.6193 167.1178
se.diff <- rep(0,r*(r-1)/2)
se.diff[1] <- sqrt(vcov1[2,2]+vcov1[3,3]-2*vcov1[2,3])
se.diff[2] <- sqrt(4*vcov1[2,2]+vcov1[3,3]+4*vcov1[2,3])
se.diff[3] <- sqrt(4*vcov1[3,3]+vcov1[2,2]+4*vcov1[2,3])
HSD.q <- qtukey(.95,r,df.residual(dsm.mod1))
HSD.out <- matrix(rep(0,8*r*(r-1)/2), ncol=8)
row.count <- 0
for (i1 in 2:r) {
for (i2 in 1:(i1-1)) {
row.count <- row.count + 1
HSD.out[row.count,1] <- i1
HSD.out[row.count,2] <- i2
HSD.out[row.count,3] <- adjmean[i1]
HSD.out[row.count,4] <- adjmean[i2]
HSD.out[row.count,5] <- adjmean[i1] - adjmean[i2]
HSD.out[row.count,6] <-
(adjmean[i1] - adjmean[i2]) - HSD.q*se.diff[row.count]
HSD.out[row.count,7] <-
(adjmean[i1] - adjmean[i2]) + HSD.q*se.diff[row.count]
HSD.out[row.count,8] <-
1-ptukey(sqrt(2)*abs(adjmean[i1] - adjmean[i2])/se.diff[row.count],
r,df.residual(dsm.mod1))
}
}
colnames(HSD.out) <- c("i", "i'", "AdjMn_i", "AdjMn_i'", "Diff", "LB", "UB", "p adj")
round(HSD.out,4)
## i i' AdjMn_i AdjMn_i' Diff LB UB p adj
## [1,] 2 1 110.6193 134.8728 -24.2535 -41.6821 -6.8250 0
## [2,] 3 1 167.1178 134.8728 32.2450 14.8202 49.6698 0
## [3,] 3 2 167.1178 110.6193 56.4985 39.0665 73.9306 0
All three pairs of treatments are significantly different. Treatment 2 (Gel+DeadSea) has significantly lower roughness scores than the other treatments and Gel has significantly lower scores than Placebo.
\[\nabla \]
12.2 Interaction Model
When the slope relating the response to the covariate differs among treatments, the model contains interactions. This can be tested by including cross-product terms for the treatment indicator variables and the centered covariate. Then the full model containing these extra terms can be compared with the additive (reduced) model. The intraction model is given below.
\[ Y_{ij}= \mu_{\bullet} + \sum_{k=1}^{r-1}\tau_k I_{ijk} + \gamma \left(X_{ij}-\overline{X}_{\bullet\bullet}\right) + \sum_{k=1}^{r-1}\beta_kI_{ijk}\left(X_{ij}-\overline{X}_{\bullet\bullet}\right) + \epsilon_{ijk} \]
Comparing adjusted means is more difficult when there is an interaction, as the difference among treatment means differs at different \(X\) levels (unlike the additive model). A method was developed by Johnson and Neyman to find regions of \(X\) values where two treatments are significantly different and where they are not. We will not pursue that method here, but will run the general linear test for interactions.
12.2.1 Example - Comparison of Skin Softeners
We obtain the adjusted means and their differences along with multiple comparisons with the following R code.
## trt_Grp pre_x post_y gel gelDS
## 1 1 121.10 76.30 1 0
## 2 1 375.40 265.10 1 0
## 3 1 138.05 82.92 1 0
## 4 1 114.85 91.75 1 0
## 5 1 263.91 175.32 1 0
## 6 1 255.88 152.54 1 0
## trt_Grp pre_x post_y gel gelDS
## 55 3 127.57 112.80 0 0
## 56 3 138.97 116.75 0 0
## 57 3 269.69 255.30 0 0
## 58 3 138.39 132.16 0 0
## 59 3 138.59 107.22 0 0
## 60 3 134.89 125.20 0 0
dsm$I1 <- ifelse(dsm$trt_Grp==1,1,ifelse(dsm$trt_Grp==3,-1,0))
dsm$I2 <- ifelse(dsm$trt_Grp==2,1,ifelse(dsm$trt_Grp==3,-1,0))
dsm$Xc <- dsm$pre_x - mean(dsm$pre_x)
dsm.mod1 <- lm(post_y ~ I1 + I2 + Xc, data=dsm)
summary(dsm.mod1)
##
## Call:
## lm(formula = post_y ~ I1 + I2 + Xc, data = dsm)
##
## Residuals:
## Min 1Q Median 3Q Max
## -31.13 -10.35 -3.93 12.29 35.34
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 137.53667 2.08924 65.831 < 2e-16 ***
## I1 -2.66382 2.95473 -0.902 0.371
## I2 -26.91735 2.95595 -9.106 1.22e-12 ***
## Xc 0.70732 0.03257 21.716 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 16.18 on 56 degrees of freedom
## Multiple R-squared: 0.9121, Adjusted R-squared: 0.9073
## F-statistic: 193.6 on 3 and 56 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Response: post_y
## Df Sum Sq Mean Sq F value Pr(>F)
## I1 1 9853 9853 37.622 9.205e-08 ***
## I2 1 18744 18744 71.572 1.361e-11 ***
## Xc 1 123501 123501 471.566 < 2.2e-16 ***
## Residuals 56 14666 262
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Call:
## lm(formula = post_y ~ I1 + I2 + Xc + I(I1 * Xc) + I(I2 * Xc),
## data = dsm)
##
## Residuals:
## Min 1Q Median 3Q Max
## -28.385 -7.771 0.046 7.394 35.485
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 137.83352 1.64131 83.978 < 2e-16 ***
## I1 -2.99032 2.32058 -1.289 0.2030
## I2 -26.78674 2.32170 -11.538 3.41e-16 ***
## Xc 0.72542 0.02632 27.564 < 2e-16 ***
## I(I1 * Xc) -0.05744 0.03431 -1.674 0.0999 .
## I(I2 * Xc) -0.17549 0.03784 -4.638 2.28e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 12.71 on 54 degrees of freedom
## Multiple R-squared: 0.9477, Adjusted R-squared: 0.9429
## F-statistic: 195.8 on 5 and 54 DF, p-value: < 2.2e-16
## Analysis of Variance Table
##
## Response: post_y
## Df Sum Sq Mean Sq F value Pr(>F)
## I1 1 9853 9853 61.026 1.985e-10 ***
## I2 1 18744 18744 116.096 4.560e-15 ***
## Xc 1 123501 123501 764.925 < 2.2e-16 ***
## I(I1 * Xc) 1 2475 2475 15.330 0.0002553 ***
## I(I2 * Xc) 1 3472 3472 21.507 2.279e-05 ***
## Residuals 54 8719 161
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Analysis of Variance Table
##
## Model 1: post_y ~ I1 + I2 + Xc
## Model 2: post_y ~ I1 + I2 + Xc + I(I1 * Xc) + I(I2 * Xc)
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 56 14666.1
## 2 54 8718.5 2 5947.5 18.419 7.971e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Xc.grid <- seq(-200,200,0.1)
plot(dsm$post_y ~ dsm$Xc, pch=16, col=dsm$trt_Grp, cex=0.6,
xlab="Pre-Treatment Score", ylab="Post-Treatment Score",
xlim=c(-200,200), ylim=c(0,320), main="Skin Roughness")
lines(Xc.grid,
dsm.mod2$coef[1]+dsm.mod2$coef[2]+Xc.grid*dsm.mod2$coef[4]
+dsm.mod2$coef[5]*Xc.grid,
col=1)
lines(Xc.grid,
dsm.mod2$coef[1]+dsm.mod2$coef[3]+Xc.grid*dsm.mod2$coef[4]
+dsm.mod2$coef[6]*Xc.grid,
col=2)
lines(Xc.grid,
dsm.mod2$coef[1]-dsm.mod2$coef[2]-dsm.mod2$coef[3]+
Xc.grid*dsm.mod2$coef[4]-dsm.mod2$coef[5]*Xc.grid-
dsm.mod2$coef[6]*Xc.grid,
col=3)
legend("topleft", c("Gel", "Gel+DeadSea","Placebo"), col=1:3, lty=1)
The interaction is significant \((F^*=18.42, P<.0001)\). Based on the plot, there appears to be larger treatment differences among patients with higher baseline skin roughness.
\[\nabla \]