library(tidyverse)
library(lmSupport)
library(psych)
library(car)
library(kableExtra)
options(knitr.kable.NA = '')

1

Model C: predicted donation = overall mean

Model A: predicted donation = mean of group means + condition effect

\(H_0: \mu_{\text{legacy}} = \mu_{\text{control}}\) or \(\beta_1 = 0\)
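
Written as regression equations, the two models might look as follows. This is a sketch that assumes a contrast-coded predictor \(X_i\) (for example, \(-1/2\) for control and \(+1/2\) for legacy); \(\beta_0\) and \(X_i\) are notation introduced here for illustration.

\[
\text{Model C: } \widehat{\text{donation}}_i = \beta_0
\qquad
\text{Model A: } \widehat{\text{donation}}_i = \beta_0 + \beta_1 X_i
\]

Under that coding, \(\beta_1\) is the difference between the two group means, so testing \(\beta_1 = 0\) is the same as testing whether the group means are equal.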

n1 <- read.csv("https://whlevine.hosted.uark.edu/psyc5143/legacy.csv")
n1 %>% 
    group_by(group) %>% 
    summarise(M = mean(donation))
## # A tibble: 2 × 2
##   group       M
##   <chr>   <dbl>
## 1 control   2  
## 2 prime     3.6
  1. The slope should be the difference in means (i.e., 1.6) and the intercept should be the mean of the group means (i.e., 2.8).

n1 <- n1 %>% 
    mutate(con1 = ifelse(group == "control", -1/2, 1/2))

# Model C
coef(lm(donation ~ 1, n1))
## (Intercept) 
##         2.8
# Model A
coef(lm(donation ~ con1, n1))
## (Intercept)        con1 
##         2.8         1.6

Match! The estimated intercept (2.8) and slope (1.6) equal the values predicted from the group means.

  1. Match! Plugging each group's code into the Model A equation reproduces the two group means:

\(\widehat{\text{donation}}_{\text{control}} = 2.8 + 1.6\times(-0.5) = 2.0\)

\(\widehat{\text{donation}}_{\text{legacy}} = 2.8 + 1.6\times(0.5) = 3.6\)

  1. The intercept is the mean of the group means (and the grand mean as well, but only because the group sizes are equal). The slope is the difference between the group means.
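
A small sketch with made-up data may help show why these interpretations hold. The `toy` data frame and its values are hypothetical (they are not the legacy data); only the \(\pm 1/2\) coding is taken from the solution.

```r
# Hypothetical data: two groups of 3, with means 2 (control) and 4 (prime)
toy <- data.frame(
  group    = rep(c("control", "prime"), each = 3),
  donation = c(1, 2, 3, 3, 4, 5)
)

# Same contrast coding as in the solution: -1/2 for control, +1/2 for prime
toy$con1 <- ifelse(toy$group == "control", -1/2, 1/2)

fit <- lm(donation ~ con1, toy)
coef(fit)
# intercept = (2 + 4) / 2 = 3, the mean of the group means
# slope     = 4 - 2       = 2, the difference between the group means
```

Because the two codes are one unit apart and centered on zero, a one-unit change in `con1` moves the prediction from one group mean to the other, and the prediction at `con1 = 0` sits halfway between them.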

modelSummary(lm(donation ~ con1, n1)) -> n1summary
## lm(formula = donation ~ con1, data = n1)
## Observations: 20
## 
## Linear model fit by least squares
## 
## Coefficients:
##             Estimate     SE     t Pr(>|t|)    
## (Intercept)   2.8000 0.2809 9.969 9.37e-09 ***
## con1          1.6000 0.5617 2.848   0.0107 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Sum of squared errors (SSE): 28.4, Error df: 18
## R-squared:  0.3107
aggregate(donation ~ group, n1, mean) -> n1means

The legacy-primed group gave significantly greater donations (M = $3.60) than the control group (M = $2.00), t(18) = 2.85, p = .011.

confint(lm(donation ~ con1, n1)) -> n1ci

The 95% CI for the slope (i.e., the group mean difference) is [0.42, 2.78].
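
The interval can be reproduced by hand as estimate ± critical t × SE. A minimal sketch, plugging in the rounded estimate, standard error, and error df from the coefficient table above (so the result is approximate; the variable names are only illustrative):

```r
est <- 1.6     # slope estimate from the model summary
se  <- 0.5617  # its standard error
df  <- 18      # error degrees of freedom

# Two-sided 95% interval: estimate +/- t(.975, df) * SE
ci <- est + c(-1, 1) * qt(0.975, df) * se
round(ci, 2)
# lower bound 0.42, upper bound 2.78
```

The interval excludes zero, which agrees with the significant t-test for the slope.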

2

n2 <- read.csv("https://whlevine.hosted.uark.edu/psyc5143/ps3.csv")
n2means <- n2 %>% group_by(group) %>% summarise(M = mean(Y))
# difference between, and mean of, the two group means (prime = 22.4, control = 19.9)
n2meandiff <- 22.4 - 19.9
n2meanofmeans <- (22.4 + 19.9)/2

# five alternative codings of group (codes b through f)
n2 <- n2 %>% 
    mutate(b = ifelse(group == "prime", 1, -1),
           c = ifelse(group == "prime", 1/2, -1/2),
           d = ifelse(group == "prime", 1, 0),
           e = ifelse(group == "prime", -1, 0),
           f = ifelse(group == "prime", 5, 3))

n2b <- coef(lm(Y ~ b, n2))
n2c <- coef(lm(Y ~ c, n2))
n2d <- coef(lm(Y ~ d, n2))
n2e <- coef(lm(Y ~ e, n2))
n2f <- coef(lm(Y ~ f, n2))
  1. The group means are 22.4 and 19.9 for the “prime” and “control” groups, respectively. These differ by 2.5 and the mean of these two values is 21.15.

  2. The intercept and slope are 21.15 and 1.25, respectively. These are the mean of the group means and half the difference between the group means.

  3. The intercept and slope are 21.15 and 2.5, respectively. These are the mean of the group means and the difference between the group means.

  4. The intercept and slope are 19.9 and 2.5, respectively. These are the control group mean and the difference between the group means (prime - control).

  5. The intercept and slope are 19.9 and -2.5, respectively. These are the control group mean and the difference between the group means (control - prime).

  6. The intercept and slope are 16.15 and 1.25, respectively. The y-intercept isn’t especially interpretable here, because it is the predicted value at a code of 0, which lies outside the codes actually used (3 and 5). The slope is half the difference between the group means (because the difference between the codes is 2, just like \(\pm1\)).
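
All five answers follow one rule: with two groups, the fitted line passes through both group means, so the slope is the mean difference divided by the difference between the codes, and the intercept is the prediction at a code of 0. A sketch using the group means reported above; `line_from_codes` is a hypothetical helper written for this illustration, not part of the original solution.

```r
# Group means reported in the text
m_prime   <- 22.4
m_control <- 19.9

# Hypothetical helper: intercept and slope of the line through
# (code_control, m_control) and (code_prime, m_prime)
line_from_codes <- function(code_prime, code_control) {
  slope     <- (m_prime - m_control) / (code_prime - code_control)
  intercept <- m_control - slope * code_control
  c(intercept = intercept, slope = slope)
}

# Each entry is c(code for prime, code for control), as in codes b through f
codes <- list(b = c(1, -1), c = c(1/2, -1/2), d = c(1, 0),
              e = c(-1, 0), f = c(5, 3))

results <- t(sapply(codes, function(k) line_from_codes(k[1], k[2])))
round(results, 2)
# intercepts: 21.15, 21.15, 19.90, 19.90, 16.15
# slopes:      1.25,  2.50,  2.50, -2.50,  1.25
```

The intercept equals the mean of the group means only when the codes are centered on zero (codes b and c), and equals a group mean only when that group is coded 0 (codes d and e).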

3

n3 <- read.csv("https://whlevine.hosted.uark.edu/psyc5143/unequal.csv")

# a: group means, n, overall mean, mean of means
n3 %>% group_by(group) %>% 
    summarise(M = mean(Y),       # 20, 12
              n = length(Y)) %>% # 9, 3
    ungroup()

# mean of means = 16

# grand/overall mean 
mean(n3$Y) # 18

# b
coef(lm(Y ~ X, n3)) # b0 = 16 (the mean of the means)

# c
9 * (20 - 18)^2 + 3 * (12 - 18)^2 # SS1 = 144 (group means vs. the grand mean, 18)
9 * (20 - 16)^2 + 3 * (12 - 16)^2 # SS2 = 192 (group means vs. the mean of means, 16)
anova(lm(Y ~ X, n3))              # SSR = 144
  1. The group means are 20 and 12, based on 9 and 3 observations, respectively. The overall (grand) mean is 18. The mean of the group means (20 and 12) is 16.

  2. The intercept for this model is equal to the mean of the means and not the overall/grand mean.

  3. SSR for the model in part b is 144, which is equal to “SS1”. So, even though the intercept in the augmented model is the mean of the group means, the improvement of that model is measured relative to a compact model that uses the overall/grand mean to make predictions.
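
The same pattern can be checked on made-up data by computing SSR directly as SSE(C) - SSE(A). In this sketch the `toy` data frame is hypothetical: it is built to have the same group sizes (9 and 3) and group means (20 and 12) as above, and the \(\pm 1/2\) coding of `X` is an assumption, since the coding in unequal.csv is not shown.

```r
# Hypothetical data: 9 scores with mean 20 and 3 scores with mean 12
toy <- data.frame(
  Y = c(rep(c(18, 20, 22), 3), 10, 12, 14),
  X = rep(c(1/2, -1/2), times = c(9, 3))   # assumed contrast coding
)

model_c <- lm(Y ~ 1, toy)  # compact model: predicts the grand mean (18)
model_a <- lm(Y ~ X, toy)  # augmented model: predicts each group's mean

sse_c <- sum(resid(model_c)^2)  # 176
sse_a <- sum(resid(model_a)^2)  # 32
ssr   <- sse_c - sse_a          # 144, matching SS1

coef(model_a)[["(Intercept)"]]  # 16, the mean of the group means
ssr
```

The intercept of the augmented model is 16, yet SSR is 144 rather than 192, because the compact model it is compared against predicts the grand mean of 18.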