library(tidyverse)
library(lmSupport)
library(psych)
library(car)
options(knitr.kable.NA = '')

1

The relationship depicted below is non-monotonic. Darts performance gets better (scores get lower) with increasing BAC until BAC \(\approx\) .06, at which point it gets worse with increasing BAC.

darts <- read_csv("https://whlevine.hosted.uark.edu/psyc5143/darts.csv")

ggplot(darts, aes(x = BAC, y = Darts)) +
    geom_point() +
    geom_smooth(method = 'loess', se = F)

2

darts <- darts %>% 
    mutate(BAC.c = BAC - mean(BAC))

BAC.m <- mean(darts$BAC)

linear.model <- lm(Darts ~ BAC.c, darts)

coef(linear.model) -> linear.parameters # storing the parameter estimates to
                                        # "pipe" them into the answer

The y-intercept is 36.46, which is the predicted Cricket score for someone with a mean BAC of 0.079. The slope of the linear-only model is 49.51, which is the value by which Cricket scores are expected to increase per 1 percentage-point increase in BAC. To make this easier to interpret, we can divide the slope by 100, making it 0.495, which is the predicted increase in Cricket scores per .01 increase in BAC.

3

deviance(linear.model)

## [1] 6265

SSE = 6265

4

darts <- darts %>% 
    mutate(BAC.c.sq = BAC.c^2)

model.quad1 <- lm(Darts ~ BAC.c + BAC.c.sq, darts)
model.quad2 <- lm(Darts ~ BAC.c + I(BAC.c^2), darts)
coef(model.quad1)

## (Intercept)       BAC.c    BAC.c.sq 
##        30.7        10.3      1987.9

coef(model.quad2) # same results! w00t!

## (Intercept)       BAC.c  I(BAC.c^2) 
##        30.7        10.3      1987.9

coef(model.quad1) -> quadratic.parameters

The intercept of the quadratic model is 30.736, which is the predicted Cricket score for someone with a mean level of BAC.

Why is the y-intercept notably different from the corresponding answer to #2. It’s because the nature of the model - linear vs quadratic - make a big difference in what’s expected to happen for various BAC levels. The graph below illustrates this nicely.

# visualizing why the y-intercept is different for the linear and quadratic
# models
ggplot(darts, aes(x = BAC, y = Darts)) +
    geom_point() +
    geom_smooth(method = 'lm', formula = y ~ x, se = F) +
    geom_smooth(method = 'lm', formula = y ~ poly(x, 2), se = F) +
    geom_vline(xintercept = BAC.m)

The linear slope for the quadratic model is 10.253, which is the simple or point slope of the BAC-Darts relationship at the mean BAC value, illustrated by the graph below.

ggplot(darts, aes(x = BAC, y = Darts)) +
    geom_point() +
    geom_smooth(method = 'lm', formula = y ~ poly(x, 2), se = F, col = "black") +
    geom_vline(xintercept = BAC.m, col = "black") +
    geom_abline(slope = quadratic.parameters[2], intercept = quadratic.parameters[1] + quadratic.parameters[2]*(-BAC.m), col = "red")

The quadratic slope for the quadratic model is 1987.89, which is (half) the rate at which the linear slope increases for every 1 percentage point increase in BAC. It’s a very-large number because the linear slope changes rapidly (from very negative until BAC \(\approx\) .05 to very positive when BAC \(\gt\) .10 or so) and because BAC is typically measured in hundredths of a percentage point.

5

deviance(model.quad1, t = F)

## [1] 3454

SSE = 3454

6

PRE <- (6265 - 3454) / 6265
Fquad <- PRE / ((1 - PRE)/97)

Based on the SSE values above, F = 78.942, which reasonably matches the F-ratio (or \(t^2\)) for the quadratic term in the quadratic model.

7

The graph in #4 does this!

8

The linear model shows a significant positive linear relationship between BAC and Cricket score; as BAC goes up, so does predicted Cricket score (which means that performance gets worse) This is complicated by the nonlinear relationship. The quadratic model, which accommodates the nonlinear relationship, shows that the relationship between BAC and Cricket score gets more positive as BAC increases, although an inspection of the graph of this relationship shows that the linear relationship starts out negative (which is good if one wants to succeed at Cricket), but becomes positive after a certain BAC is reached (which is bad if one wants to succeed at Cricket).

9

The y-intercept of 5 is the predicted performance of someone with a mean level of stress.

The -0.4 linear slope of \(stress_c\) indicates that for someone with a mean level of stress, performance is predicted to decline 0.4 points per unit of stress.

The -0.2 quadratic slope of \(stress_c^2\) indicates that the linear slope gets 2 \(\times\) 0.2 = 0.4 more negative per unit of increased stress.

5.2 (see below)

\(\hat{performance} = 5 - 0.4 \times stress_c - 0.2 \times stress_c^2\)

With stress = 3 (i.e., centered stress = -1)

\(\hat{performance} = 5 - 0.4 \times (-1) - 0.2 \times (-1^2)\)

\(\hat{performance} = 5 + 0.4 - 0.2 \times 1\)

\(\hat{performance} = 5 + 0.4 - 0.2 = 5.2\)

-0.8 (see below)

The linear slope at any given point is generated by

\(slope = -0.4 - 0.4 \times stress_c\)

So if stress = 5 (i.e., centered stress = +1)

\(slope = -0.4 - 0.4 \times 1\)

\(slope = -0.4 - 0.4 = -0.8\)

Bonus?

To help visualize the answers to #9, see the graph below

x <- -5:5  # some x values
dat <- data.frame(x, y = 5 - 0.4*x - 0.2*x^2) # a data frame
f <- function(x) 5 - 0.4*x - 0.2*x^2  # make the equation a function

# the plot
ggplot(dat, aes(x, y)) +
    xlab("centered stress") +
    stat_function(fun = f, col = "red") +
    geom_vline(xintercept = -1)              # predicted performance for stress.c = -1

ggplot(dat, aes(x, y)) +
    xlab("centered stress") +
    stat_function(fun = f, col = "red") +
    geom_vline(xintercept = 1) +
    geom_vline(xintercept = 0) +
    geom_abline(slope = -0.8, intercept = 5.2, col = "blue") # the tangent line at stress.c = +1 with a slope of -0.8

Problem Set 2 Answer Key

1

2

3

4

5

6

7

8

9

Bonus?