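library(tidyverse) # read_csv, %>%, mutate, case_when
library(lmSupport) # modelEffectSizes
library(pwr)       # pwr.f2.test

n1 <- read_csv("https://whlevine.hosted.uark.edu/psyc5143/prob.csv")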

# a: generating codes
n1 <- n1 %>%
  mutate(con1 = ifelse(Text == "S", -2/3, 1/3),        # S vs. the two explanation conditions
         con2 = case_when(Text == "S"  ~ 0,            # HE vs. LE
                          Text == "HE" ~ 1/2,
                          Text == "LE" ~ -1/2))

model.n1a <- lm(Y ~ con1 + con2, n1)

# b, c: effect sizes & power analyses
n1b <- modelEffectSizes(model.n1a)
n1b.sr2 <- n1b$Effects[2:3,4] # the effect-size estimates are in column 4; the parameters in rows 2 & 3
unbiased.sr2 <- 1 - (1 - n1b.sr2)*(46/45) # n = 48; PA = 3; PC = 2
f2 <- unbiased.sr2/(1 - unbiased.sr2)
sample1 <- ceiling(pwr.f2.test(u = 1, v = NULL, f2 = f2[1], power = .9)$v) + 1
sample2 <- ceiling(pwr.f2.test(u = 1, v = NULL, f2 = f2[2], power = .9)$v) + 1
# the ceiling function rounds up to the nearest whole number

# d: SSE for the model above
SSEd <- n1b$SSE

# e: intercept-only model and its SSE
model.n1e.int.only <- lm(Y ~ 1, n1)
SSEe <- modelEffectSizes(model.n1e.int.only)$SSE

# f: PRE & F
PRE <- (SSEe - SSEd)/SSEe
Fstat <- (PRE/2)/((1 - PRE)/45)
  1. See the code above. Using a Bonferroni-adjusted \(\alpha = \frac{.05}{2} = .025\), neither contrast is significant. Although the two explanation conditions score higher (\(M = .47\)) than the standard condition (\(M = .40\)) by \(.07\) or so, this advantage is not significant, \(t(45) = 1.36, p = .18\). The second contrast shows an advantage of roughly \(.12\) for the HE condition over the LE condition, but this too is not significant, \(t(45) = 2.02, p = .05\).

  2. The \(sr^2\) values are 0.0364 and 0.08 for the first and second contrasts, respectively. Adjusted for bias, these values are 0.015 and 0.0596. Converted to \(f^2\), they are 0.0152 and 0.0634.

  3. For the larger effect size, \(n\) = 167 is the sample size needed to get power = .9; for the smaller effect size, \(n\) = 692 is needed. To get both to power = .9, we’d need the larger of the two sample sizes.

  4. \(SSE\) = 1.3788

  5. \(SSE\) = 1.5604

  6. \(PRE\) = \(\frac{(1.5604 - 1.3788)}{1.5604} = 0.1164\); \(F = \frac{0.1164/2}{(1 - 0.1164)/45} = 2.9647\)

    NOTE: The SSE displayed in the modelEffectSizes table is rounded, but if you store the value in a variable, as I have done above, it will not be rounded!
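As a check on items 2 and 3, the arithmetic can be redone in base R from the rounded \(sr^2\) values reported above. This is only a sketch: `power_f2` and `smallest_v` are hypothetical helper names, and the power function reflects an assumption about the calculation behind `pwr.f2.test` (a noncentral \(F\) with noncentrality \(f^2(u + v + 1)\)), not a copy of it. Because the inputs are rounded, the results may differ slightly from the values above.

```r
# Rounded sr^2 values from item 2 (con1, con2)
sr2 <- c(0.0364, 0.08)

# Bias adjustment used above: 1 - (1 - sr2) * (n - PC)/(n - PA)
n <- 48; PA <- 3; PC <- 2
unbiased_sr2 <- 1 - (1 - sr2) * (n - PC) / (n - PA)

# Convert to Cohen's f^2
f2 <- unbiased_sr2 / (1 - unbiased_sr2)

# Power of a u-df F test with v error df
# (ASSUMPTION: noncentrality parameter is f2 * (u + v + 1))
power_f2 <- function(v, f2, u = 1, alpha = .05) {
  crit <- qf(1 - alpha, df1 = u, df2 = v)
  pf(crit, df1 = u, df2 = v, ncp = f2 * (u + v + 1), lower.tail = FALSE)
}

# Smallest whole-number v that reaches the target power (hypothetical helper)
smallest_v <- function(f2, target = .9, u = 1, alpha = .05) {
  v <- 2
  while (power_f2(v, f2, u, alpha) < target) v <- v + 1
  v
}

v_needed <- sapply(f2, smallest_v)

round(unbiased_sr2, 4)
round(f2, 4)
v_needed
```

The code at the top of this problem then adds 1 to the rounded-up `v` to arrive at a sample size. The smaller effect needs the much larger `v`, which is why the first contrast drives the required \(n\).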

# g
summary(aov(Y ~ Text, n1))
##             Df Sum Sq Mean Sq F value Pr(>F)  
## Text         2  0.182  0.0908    2.96  0.062 .
## Residuals   45  1.379  0.0306                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Woohoo! The ANOVA \(F\) matches the \(F\) computed from PRE above. But with 2 \(df\) in its numerator, the ANOVA \(F\)-ratio is an agglomerated answer to two (unspecified) research questions, so it bespeaks muddy thinking. I said what I said.
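To make the "agglomerated" point concrete: the omnibus \(F\) can be recovered from PRE, and, assuming equal group sizes (an assumption; the group \(n\)s are not shown here) together with the two orthogonal contrasts from part a, it is also roughly the average of the two squared contrast \(t\)s. A sketch using only the rounded values reported above:

```r
# Rounded values reported in the answers above
SSE_compact   <- 1.5604         # intercept-only model (part e)
SSE_augmented <- 1.3788         # two-contrast model (part d)
t_contrasts   <- c(1.36, 2.02)  # t statistics for con1 and con2 (part a)

# F from the proportional reduction in error, with 2 and 45 df
PRE   <- (SSE_compact - SSE_augmented) / SSE_compact
F_pre <- (PRE / 2) / ((1 - PRE) / 45)
p_val <- pf(F_pre, df1 = 2, df2 = 45, lower.tail = FALSE)

# ASSUMPTION: equal n per group, so the 2-df F is the mean of the two 1-df Fs (t^2)
F_from_t <- mean(t_contrasts^2)

round(c(F_pre = F_pre, p = p_val, F_from_t = F_from_t), 3)
```

Both routes land near the \(F\) in the ANOVA table, up to rounding. The omnibus test averages the evidence for the two contrasts rather than answering either question on its own.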


n2 <- read.csv("https://whlevine.hosted.uark.edu/psyc5143/reward.csv")

pairwise.t.test(x = n2$errors, g = n2$condition, p.adjust.method = "none")       # a: all pairwise comparisons, raw p-values
pairwise.t.test(x = n2$errors, g = n2$condition, p.adjust.method = "bonferroni") # b: Bonferroni-corrected p
pairwise.t.test(x = n2$errors, g = n2$condition, p.adjust.method = "BH")         # c: BH-adjusted p
TukeyHSD(aov(errors ~ condition, n2))                                            # d: Tukey's HSD
  1. I’ll use letter-pairs to indicate which comparisons are significant: AI, AN, FI, FN

  2. AI, AN, FI, FN

  3. AI, AN, FI, FN

  4. AI, AN, FI, FN

  5. In this case, the procedures do not differ in which comparisons are significant. Sometimes data are like that. But the p-values differ. From most-powerful (i.e., lowest p-values) to least-powerful, here are the rankings:

Sort of. Because the BH procedure uses a different alpha for each comparison, its p-values are lower than the Tukey HSD p-values in some cases and higher in others.
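The relationship among the raw, BH, and Bonferroni p-values is easiest to see on a toy example. The six p-values below are made up for illustration (they are not from reward.csv); the corrections use base R's `p.adjust`, which implements the same named methods.

```r
# Hypothetical raw p-values for six pairwise comparisons (NOT from reward.csv)
p_raw <- c(0.001, 0.004, 0.012, 0.030, 0.200, 0.650)

p_bonf <- p.adjust(p_raw, method = "bonferroni")  # multiply by 6, cap at 1
p_bh   <- p.adjust(p_raw, method = "BH")          # Benjamini-Hochberg step-up

# Side-by-side comparison, one row per comparison
round(cbind(raw = p_raw, BH = p_bh, bonferroni = p_bonf), 3)
```

For any set of p-values, each raw value is no larger than its BH-adjusted value, which in turn is no larger than its Bonferroni-adjusted value. Where Tukey's HSD falls relative to BH depends on the data.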


Where the Bonferroni (and BH) procedures will potentially be more useful than Tukey’s HSD is when the set of comparisons includes non-pairwise comparisons. Tukey’s HSD was developed for pairwise comparisons only, which limits its usefulness. Another limitation is that the HSD procedure corrects for every possible pairwise comparison, which may not be what is planned. When not all pairwise comparisons are of interest, the Bonferroni & BH procedures will give more power than Tukey’s HSD.
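A small sketch of that last point, with made-up numbers: suppose there are four groups, 36 error \(df\), and a comparison with \(t = 2.5\) that is one of only two planned comparisons. The Tukey p-value is computed here from the studentized range using \(q = |t|\sqrt{2}\). All values are hypothetical and not taken from reward.csv.

```r
t_stat   <- 2.5   # hypothetical t for one planned pairwise comparison
df_error <- 36    # hypothetical error df
k_groups <- 4     # four conditions, so choose(4, 2) = 6 possible pairs

# Unadjusted two-tailed p-value
p_raw <- 2 * pt(-abs(t_stat), df_error)

p_bonf_planned <- min(1, 2 * p_raw)                    # correct for 2 planned comparisons
p_bonf_all     <- min(1, choose(k_groups, 2) * p_raw)  # correct for all 6 pairs
p_tukey        <- ptukey(abs(t_stat) * sqrt(2), nmeans = k_groups,
                         df = df_error, lower.tail = FALSE)

round(c(bonf_planned = p_bonf_planned, tukey = p_tukey, bonf_all = p_bonf_all), 3)
```

With only two planned comparisons, the Bonferroni-adjusted p-value comes out below the Tukey one. Correcting for all six pairs reverses the order.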

One other drawback of Tukey’s HSD is that it is, strictly speaking, a post-hoc procedure, one that can only be done if the full ANOVA is significant. It may not be treated that way, but that’s its purpose in life.