d <- read.csv("https://whlevine.hosted.uark.edu/psyc5143/avoidance.csv")
# better labels; this does some of the work for me for part f
d <- d %>%
mutate(areaF = ifelse(area == 1, "neutral", "A"),
delayF = case_when(delay == 1 ~ "50",
delay == 2 ~ "100",
delay == 3 ~ "150"))
# means
d %>% group_by(areaF, delayF) %>% summarize(M = mean(latency))
## # A tibble: 6 × 3
## # Groups: areaF [2]
## areaF delayF M
## <chr> <chr> <dbl>
## 1 A 100 25
## 2 A 150 26
## 3 A 50 14
## 4 neutral 100 25
## 5 neutral 150 25
## 6 neutral 50 26
d %>% group_by(areaF) %>% summarise(M = mean(latency))
## # A tibble: 2 × 2
## areaF M
## <chr> <dbl>
## 1 A 21.7
## 2 neutral 25.3
d %>% group_by(delayF) %>% summarise(M = mean(latency))
## # A tibble: 3 × 2
## delayF M
## <chr> <dbl>
## 1 100 25
## 2 150 25.5
## 3 50 20
# contrast codes
d <- d %>%
mutate(areaC = ifelse(areaF == "neutral", 1/2, -1/2),
delayC1 = ifelse(delayF == "50", -2/3, 1/3),
delayC2 = case_when(delayF == "100" ~ 1/2,
delayF == "150" ~ -1/2,
TRUE ~ 0),
int1 = areaC*delayC1,
int2 = areaC*delayC2)
model <- lm(latency ~ areaC + delayC1 + delayC2 + int1 + int2, d)
coef(model)
## (Intercept) areaC delayC1 delayC2 int1 int2
## 23.500 3.667 5.250 -0.500 -12.500 1.000
summary(model)
anova(model)
## Analysis of Variance Table
##
## Response: latency
## Df Sum Sq Mean Sq F value Pr(>F)
## areaC 1 101 100.8 3.92 0.059 .
## delayC1 1 184 183.8 7.14 0.013 *
## delayC2 1 1 1.2 0.05 0.827
## int1 1 260 260.4 10.11 0.004 **
## int2 1 1 1.2 0.05 0.827
## Residuals 24 618 25.8
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The SSRs for the predictors are above in the Sum Sq column.
d$areaF <- as.factor(d$areaF)
d$delayF <- as.factor(d$delayF)
summary(aov(latency ~ areaF*delayF, d))
For areaC, SSR = 100.83, which is the same as the Sum Sq value for the area factor in the ANOVA. The SSR values for delayC1 and delayC2 sum (183.75 + 1.25 = 185) to the Sum Sq value for the delay factor in the ANOVA. The SSR values for int1 and int2 sum (260.42 + 1.25 = 261.67) to the Sum Sq value for the interaction in the ANOVA.
The code below is overboard, but I want to avoid problems with rounding. What I’ve done is extracted SSR values and the effect-size measures and stashed them in a data frame. I also put SSE in a variable. The I calculated the effect size measures so that you can see them side-by-side. Yeah, it’s overboard, but it keeps me sharp.
# build a data frame by first pulling out the pieces of interest from the modelEffectSizes
d2 <- data.frame(modelEffectSizes(model)$Effects)
# and then get rid of the intercept row
d2 <- d2[-which(rownames(d2) == "(Intercept)"), ]
# I can't quit lmSupport !
# put SSE & SST in variables
SSE <- modelEffectSizes(model)$SSE
SST <- modelEffectSizes(model)$SST
# calculate effect-sizes "by hand" and add them to the data
d2 <- d2 %>%
mutate(partialEta = SSR / (SSR + SSE),
deltaR = SSR / SST)
# show off
d2
## SSR df pEta.sqr dR.sqr partialEta deltaR
## areaC 100.83 1 0.140274 0.086515 0.140274 0.086515
## delayC1 183.75 1 0.229186 0.157658 0.229186 0.157658
## delayC2 1.25 1 0.002019 0.001073 0.002019 0.001073
## int1 260.42 1 0.296461 0.223438 0.296461 0.223438
## int2 1.25 1 0.002019 0.001073 0.002019 0.001073
# dummy codes
d <- d %>%
mutate(areaD = ifelse(areaF == "neutral", 0, 1),
delayD1 = ifelse(delayF == "100", 1, 0),
delayD2 = ifelse(delayF == "150", 1, 0),
intD1 = areaD*delayD1,
intD2 = areaD*delayD2)
modelDummy <- lm(latency ~ areaD+ delayD1 + delayD2 + intD1 + intD2, d)
coef(modelDummy)
Now the parameter estimates are about cell means. The intercept is the mean of the double-reference group (neutral-50). The areaD slope (-12) is the difference between the means of neutral-50 (26) and Area A-50 (14). The delayD1 slope about the difference between the mean of neutral-50 (26) and neutral-100 (25); the delayD2 slope is about the difference between the mean of neutral-50 (26) and neutral-150 (25). The intD1 slope is the difference between difference in the neutral-50 mean (26) and the neutral-100 mean (25) - which is -1 - and the same difference in Area A (14 - 25 = -11); and the intD2 slope is the corresponding difference in the comparison of the 50 vs 150 msec conditions.
ps8 <- read.csv("https://whlevine.hosted.uark.edu/psyc5143/ps8.csv", stringsAsFactors = T)
# a
# cell means & B means
ps8 %>% group_by(A, B) %>% summarise(M = mean(DV))
ps8 %>% group_by(B) %>% summarise(M = mean(DV))
# i, j, k
lm(DV ~ dummyA1*conB, ps8) %>% coef()
lm(DV ~ dummyA2*conB, ps8) %>% coef()
lm(DV ~ conA*conB, ps8) %>% coef()
# A1 A2
# B1 10 14 12
# B2 12 20 16
The main effect of B is 16 - 12 = 4.
The simple effect of B at A1 is 12 - 10 = 2.
The simple effect of B at A2 is 20 - 14 = 6.
The average of the two simple effect is (6 + 2) / 2 = 4, which is the main effect.
Held constant at zero for dummyA1 means
group (or level) A1.
Held constant at zero for dummyA2 means
group (or level) A2.
Held constant at zero for conA means
neither group A1 or A2, or, in a sense, the average of the two.
The conB slope is 2, which matches the simple effect of B at A1 from part c.
The conB slope is 6, which matches the simple effect of B at A2 from part d.
The conB slope is 4, which matches the average of the two simple effects of B, or its main effect, from part b.
MAIN EFFECTS ARE AVERAGES OF SIMPLE EFFECTS! Equals zero for a contrast-coded factor is about the average for the factor, which is zero, or the main effect!