---
title: "Problem Set 3 Answer Key"
output: html_document
---

```{r setup, include = FALSE}
options(digits = 4, scipen = 999)
knitr::opts_chunk$set(echo = TRUE, fig.width = 4, fig.height = 4)
```

```{r, results='hide', message = FALSE}
library(tidyverse)
library(lmSupport)
library(psych)
library(car)
library(kableExtra)
options(knitr.kable.NA = '')
```

# 1
```{r, results='hide'}
reward <- read.csv("https://whlevine.hosted.uark.edu/psyc5143/ps4.csv")
group_means <- reward %>% 
	group_by(COND) %>% 
	summarise(M = mean(TRIALS))
grand_mean <- mean(reward$TRIALS)
```

a) The groups means are `r t(group_means[2])` for the double-deprived, food-deprived, control, and water-deprived groups respectively, and the grand mean = `r grand_mean`; the latter is the same as the mean of the group means because the groups are of equal size.

b) The codes below are implemented in the script a bit more below.

contrast i: control = 3/4, others = -1/4

contrast ii: double-deprived = -2/3, single-deprived = 1/3

contrast iii: food-deprived = 1/2, water-deprived = -1/2

```{r, results='hide'}
reward <- reward %>% 
	mutate(x1 = ifelse(group == "b", 3/4, -1/4),
				 x2 = case_when(group == "b" ~ 0,
				 								 group == "c" ~ 1/3,
				 								 group == "d" ~ 1/3,
				 								 group == "e" ~ -2/3),
				 x3 = case_when(group == "b" ~ 0,
				 								 group == "c" ~ 1/2,
				 								 group == "d" ~ -1/2,
				 								 group == "e" ~ 0))
```

c) The contrasts all have a correlation of 0 with one another; so they're orthogonal. I'll also find tolerance for these after fitting the model. It will be 1 for each predictor.

```{r}
cor(reward[4:6])
```

d)
```{r}
model1.contrasts <- lm(TRIALS ~ x1 + x2 + x3, reward)
params1 <- coef(model1.contrasts)
```

The intercept of `r params1[1]` is the mean of the group means. The slope of the first contrast is `r params1[2]`, which equals the difference between the mean of the control group (24) and the mean of the three reward groups (i.e., (8 + 12 + 12) / 3 = 10.67). The slope of the second contrast is `r params1[2]`, which equals the difference between the mean of the two singly-deprived groups (12) and the doubly-deprived group (8). And the slope of the last contrast is `r params1[3]`, which is the difference in the means of the two singly-deprived groups.

And look at the tolerances below!

```{r}
1/car::vif(model1.contrasts) # 1/vif is tolerance
```


e)
```{r, results='hide'}
R2_model1.contrasts <- modelSummary(model1.contrasts)$r.squared
```

$R^2$ = `r R2_model1.contrasts`

f)
```{r}
modelEffectSizes(model1.contrasts)
```

The three $sr^2$ values do sum to $R^2$. This is because the three contrasts fully code the differences among the three groups *and* they are orthogonal.

g) See below.

```{r}
reward <- reward %>% 
	mutate(dummy1 = ifelse(group == "c", 1, 0),
				 dummy2 = ifelse(group == "d", 1, 0),
				 dummy3 = ifelse(group == "e", 1, 0))
```

h)
```{r}
model1.dummy <- lm(TRIALS ~ dummy1 + dummy2 + dummy3, reward)
params1dummy <- coef(model1.dummy)
```

The intercept is `r params1dummy[1]`, which is the control group mean. Each of the other slopes is the deviation of the other group means from this value.

i)
```{r, results='hide'}
R2_model1.dummy <- modelSummary(model1.dummy)$r.squared
```

$R^2$ = `r R2_model1.dummy`

j)
```{r}
modelEffectSizes(model1.dummy)
```

The three $sr^2$ values do not sum to $R^2$. This is because dummy codes are not contrasts (i.e., their $\lambda$s do not sum to 0). This characteristic of $sr^2$ summing to $R^2$ will happen only for *contrasts* that are *orthogonal*; both features have to be present. I have to relearn this occasionally. 

k)
```{r}
coef(lm(TRIALS ~ COND, reward))
```

The intercept is the mean of the food and water deprived group and the slopes are deviations of other groups means from that value.

l&m)
```{r, results='hide'}
reward <- reward %>% 
	mutate(con4 = ifelse(group == "b", 3, -1),
				 con5 = case_when(group == "b" ~ 0,
				 								 group == "c" ~ 1,
				 								 group == "d" ~ 1,
				 								 group == "e" ~ -2),
				 con6 = case_when(group == "b" ~ 0,
				 								 group == "c" ~ 1,
				 								 group == "d" ~ -1,
				 								 group == "e" ~ 0))

modelSummary(lm(TRIALS ~ con4 + con5 + con6, reward))
```

The intercept is the same, but the slopes are fractions of what they were previously, 1/4 for contrast1, 1/3 for contrast2, and 1/2 for contrast4, although it's hard to see that the latter is true because the slope is 0.

# 2

a)
```{r}
n2 <- read.csv("https://whlevine.hosted.uark.edu/psyc5143/threeGroups.csv")

n2groupMs <- n2 %>% group_by(group) %>% summarise(M = mean(memory))
n2grandM <- colMeans(n2groupMs[2])
```

The group means are `r t(n2groupMs[2])` and the grand mean is `r n2grandM`.

b) I'll go with control vs the two treatment groups and the treatment groups vs one another.

```{r}
n2 <- n2 %>% 
	mutate(con1 = ifelse(group == "control", -2/3, 1/3),
				 con2 = case_when(group == "control" ~ 0,
				 								 group == "rhyme" ~ 1/2,
				 								 group == "image" ~ -1/2))

model2 <- lm(memory ~ con1 + con2, n2)
params2 <- coef(model2)
SEs <- summary(model2)$coefficients[,2]
n2R2 <- summary(model2)$r.squared
```

The intercept and slopes are `r params2`, which are, respectively, the grand mean, the advantage for the two treatment group over the control, and the difference between the two treatment groups.

The *SE*s for the slopes are `r SEs[2:3]`. And $R^2$ for the model = `r n2R2`.

c)
```{r}
model2weird <- lm(memory ~ con1, n2)
params2c <- coef(model2)
n2cSEs <- summary(model2weird)$coefficients[,2]
n2cR2 <- summary(model2weird)$r.squared
```

The intercept and slope are the same as they were for part b, although there is no slope for the difference in the two treatment groups.

The *SE* for the slope is `r n2cSEs[2]`. And $R^2$ for the model = `r n2cR2`.

d) The *SE* went up a bit and $R^2$ went down. The reason is that what is considered residual error in the model in part c includes what was part of the explained variability of con2 in part b. More residual error means bigger *SE* and less explained variability. Look at the tables below and you'll see that SSR for the model with only one contrast is exactly lower than the one with two contrast by the amount the SSR for the missing contrast. In this instance, SSR for the model with one contrast is 402; for the model with two contrasts it's 422. The SSR for con2 in the model with two contrast is 20. This is not a coincidence.

```{r}
modelEffectSizes(model2)
modelEffectSizes(model2weird)
```

# 3

```{r, results='hide'}
n3 <- read.csv("https://whlevine.hosted.uark.edu/psyc5143/p4n3.csv")
is.factor(n3$groupF) # check!

# a, b, c
model3a <- lm(dv ~ groupF, n3); anova(model3a)
model3b <- lm(dv ~ group1, n3); anova(model3b)
model3c <- lm(dv ~ group2, n3); anova(model3c)
```

a) Here, *F*(2, 12) = 10.7 and *SSE* = 120.

b) Here, *F*(1, 13) = 0 and *SSE* = 333.

c) Here, *F*(1, 13) = 12 and *SSE* = 173.

d) The lowest SSE goes with the model in part a. One reason that this is the case is that this model has more parameters; numerator *df*s provide critical information about the number of parameters the differ between Model C and Model A. Less obvious is that two linear relationships are being fit in this model, one for each of the underlying dummy codes, revealed by the R code below. In parts b and c, only one linear relationship is being evaluated. The consequence of the the latter is that this forces the use of the same slope to be used to accommodate the difference between two pairs of groups, which is very limiting.

Maybe the simplest way to say why group1 and group2 lead to higher values of SSE is to say that if you want to model differences between more than two groups, you'll need at least two slopes to do so. Using one slope to do so is going to lead to suboptimal predictions.

e) The models in parts b and c are treating group numbers as if the numbers having quantitative meaning. They don’t. They may as well be letters rather than numbers. **THIS IS THE MORAL OF THE FABLE: IF YOU GET DATA WITH NUMBERS IDENTIFYING GROUPS, BE SURE TO TREAT THOSE NUMBERS AS LABELS RATHER THAN AS QUANTITATIVE VALUES.**