Tristan Mahr 2017-11-07
- Setup
- Growth curve models prep
- Evaluate the condition effect
- Include child-level predictors
- Maternal education
- Findings
library(dplyr)
library(lme4)
library(ggplot2)
library(hrbrthemes)
# Work relative to RStudio project
wd <- rprojroot::find_rstudio_root_file()
d_m <- readr::read_csv(file.path(wd, "data", "modeling.csv"))
Based on the visual exploratory data analysis, there is no average effect of hearing a non-native dialect on familiar word recognition. We need to confirm this observation through growth curve analysis.
First, we set some options for the analysis, namely the start and end time of the analysis window.
opts_gca <- list(
min_time = 250,
max_time = 1500
)
We model looks from 250 to 1500 ms.
We represent time in our growth curve models as a cubic orthogonal polynomial. This means that we convert our Time variable into three uncorrelated trend curves. These are the predictors we use to represent time. I wrote an R package for dealing with orthogonal polynomials, and I have a blog post discussing their use in growth curve models.
# Apply analysis window and create orthogonal polynomial of time
d_m <- d_m %>%
filter(opts_gca$min < Time, Time < opts_gca$max_time) %>%
polypoly::poly_add_columns(Time, 3, prefix = "ot", scale_width = 1)
We are going to use a generalized linear mixed effects model to perform the growth curve analysis. I'll try to summarize what this means very briefly, but Mirman's 2014 book is the standard reference on the topic.
We assume that the data is generated by a binomial process, where the number of sucesses (looks to target) depends on the total number of attempts (the number of looks) and the probability of sucess (fixating on the target). This process assumption shows up in the glmer()
call as family = binomial
. The number of successes and number of attempts show up in the glmer()
formula with response variable cbind(Primary, Others)
: The number of looks to the primary AOI (target) versus the number of looks to the others.
Our linear model estimates the probability part of the binomial process: At each time step, what is probability of fixating on the target image? If the linear trend is positive, then this probability of success increases over the course of the trial. If the condition effect is negative, then the average probability in one condition is smaller than in the other. But probabilities have a range from 0 to 1, but linear models run on the range of plus-minus infinity. We need a transformation to get from probability land to linear land and back again. This is the log-odds or logit transformation (from probability to linear) and the logistic transformation (from linear back to probability). These two are the basis for logistic regression.
Instead of modeling the probability of looking to target, we model the log-odds of looking to target. This transformation lets us generalize our linear model into probability land and to model data from a binomial process.
I have an extensive blog post illustrating what mixed effects models do, so I'll address their specific role here. We are going to assume that each child's growth curve is drawn from a distribution of related growth curves. We are also going to assume that each child's growth curve in each condition is also drawn from a distribution of related curves. Each child's growth curve is a combination of:
- An average growth curve:
... ~ ot1 + ot2 + ot3 + ...
- Plus a child's general growth curve features:
... ~ ... + (ot1 + ot2 + ot3 | ResearchID)
- Plus a child's specific growth curve features in each condition:
... ~ ... + (ot1 + ot2 + ot3 | HearsOwnDialect:ResearchID)
The first recognizes that there is an average kind of performance across all children. The second captures how a child will perform similarly on both conditions, so it's like a child's latent growth curve shape. The third captures specific condition level variability, so it's like a child's observed growth curve shape. In the formulas below, the shorthand (ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect)
does both (2) and (3) with the x-has-y-nested-in-it operator x / y
.
First, we need to assess the condition effect. We try four models:
- Entirely omit condition.
- Add condition as a fixed effect, so it shapes the average growth curve.
- Add condition as a random effect but not as a fixed, so it captures each child's condition-level variability.
- Add condition as both, so that we capture condition-level variability within children but also estimate the effect of condition on the average growth curve shape.
glmer_controls <- glmerControl(
optimizer = "bobyqa",
optCtrl = list(maxfun = 2e5))
cond_omitted <- glmer(
cbind(Primary, Others) ~
(ot1 + ot2 + ot3) +
(ot1 + ot2 + ot3 | ResearchID),
data = d_m,
control = glmer_controls,
family = binomial)
summary(cond_omitted)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (ot1 + ot2 + ot3) + (ot1 + ot2 + ot3 | ResearchID)
#> Data: d_m
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 16465.0 16547.6 -8218.5 16437.0 2674
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -3.8096 -0.8099 -0.0042 0.8468 4.3706
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> ResearchID (Intercept) 0.15521 0.3940
#> ot1 0.60292 0.7765 0.64
#> ot2 0.18814 0.4337 -0.16 -0.19
#> ot3 0.06591 0.2567 -0.20 -0.39 0.22
#> Number of obs: 2688, groups: ResearchID, 56
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.02296 0.05295 0.434 0.665
#> ot1 2.19384 0.10560 20.776 < 2e-16 ***
#> ot2 -0.12877 0.06106 -2.109 0.035 *
#> ot3 -0.23904 0.03925 -6.090 1.13e-09 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Correlation of Fixed Effects:
#> (Intr) ot1 ot2
#> ot1 0.626
#> ot2 -0.144 -0.182
#> ot3 -0.172 -0.326 0.183
cond_as_fixed_eff <- glmer(
cbind(Primary, Others) ~
(ot1 + ot2 + ot3) * HearsNativeDialect +
(ot1 + ot2 + ot3 | ResearchID),
data = d_m,
control = glmer_controls,
family = binomial)
summary(cond_as_fixed_eff)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (ot1 + ot2 + ot3) * HearsNativeDialect +
#> (ot1 + ot2 + ot3 | ResearchID)
#> Data: d_m
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 16444.8 16550.9 -8204.4 16408.8 2670
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -3.7655 -0.8270 -0.0084 0.8512 4.3647
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> ResearchID (Intercept) 0.15533 0.3941
#> ot1 0.60334 0.7768 0.64
#> ot2 0.18760 0.4331 -0.16 -0.19
#> ot3 0.06466 0.2543 -0.20 -0.39 0.21
#> Number of obs: 2688, groups: ResearchID, 56
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.01757 0.05326 0.330 0.741430
#> ot1 2.14710 0.10733 20.004 < 2e-16 ***
#> ot2 -0.09527 0.06383 -1.493 0.135546
#> ot3 -0.16126 0.04326 -3.728 0.000193 ***
#> HearsNativeDialectTRUE 0.01050 0.01112 0.945 0.344741
#> ot1:HearsNativeDialectTRUE 0.09393 0.03807 2.467 0.013615 *
#> ot2:HearsNativeDialectTRUE -0.06652 0.03747 -1.775 0.075887 .
#> ot3:HearsNativeDialectTRUE -0.15483 0.03745 -4.134 3.56e-05 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Correlation of Fixed Effects:
#> (Intr) ot1 ot2 ot3 HNDTRU o1:HND o2:HND
#> ot1 0.614
#> ot2 -0.135 -0.169
#> ot3 -0.157 -0.285 0.155
#> HrsNtvDTRUE -0.104 -0.004 -0.021 0.004
#> ot1:HNDTRUE -0.003 -0.177 -0.004 -0.023 0.017
#> ot2:HNDTRUE -0.007 -0.003 -0.295 -0.006 0.070 0.007
#> ot3:HNDTRUE 0.001 -0.010 -0.004 -0.434 -0.010 0.048 0.010
cond_as_random_eff <- m2b <- glmer(
cbind(Primary, Others) ~
(ot1 + ot2 + ot3) +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_m,
control = glmer_controls,
family = binomial)
summary(cond_as_random_eff)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (ot1 + ot2 + ot3) + (ot1 + ot2 + ot3 |
#> ResearchID/HearsNativeDialect)
#> Data: d_m
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 14322.3 14463.8 -7137.1 14274.3 2664
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.51691 -0.39788 -0.00601 0.42764 2.92233
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.10412 0.3227
#> ot1 0.56565 0.7521 0.11
#> ot2 0.18255 0.4273 -0.08 0.01
#> ot3 0.12055 0.3472 0.02 -0.63 0.23
#> ResearchID (Intercept) 0.11524 0.3395
#> ot1 0.41396 0.6434 0.89
#> ot2 0.12233 0.3498 -0.12 -0.30
#> ot3 0.01721 0.1312 -0.46 -0.12 0.37
#> Number of obs: 2688, groups: HearsNativeDialect:ResearchID, 112; ResearchID, 56
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.02654 0.05497 0.483 0.6292
#> ot1 2.25601 0.11341 19.893 < 2e-16 ***
#> ot2 -0.12149 0.06488 -1.873 0.0611 .
#> ot3 -0.24048 0.04201 -5.724 1.04e-08 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Correlation of Fixed Effects:
#> (Intr) ot1 ot2
#> ot1 0.595
#> ot2 -0.093 -0.158
#> ot3 -0.151 -0.342 0.222
cond_as_both <- glmer(
cbind(Primary, Others) ~
(ot1 + ot2 + ot3) * HearsNativeDialect +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_m,
control = glmer_controls,
family = binomial)
summary(cond_as_both)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (ot1 + ot2 + ot3) * HearsNativeDialect +
#> (ot1 + ot2 + ot3 | ResearchID/HearsNativeDialect)
#> Data: d_m
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 14326.1 14491.2 -7135.1 14270.1 2660
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.50109 -0.40113 -0.00743 0.43700 2.93517
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.10418 0.3228
#> ot1 0.54881 0.7408 0.10
#> ot2 0.18155 0.4261 -0.08 0.03
#> ot3 0.11319 0.3364 0.03 -0.62 0.21
#> ResearchID (Intercept) 0.11517 0.3394
#> ot1 0.41812 0.6466 0.89
#> ot2 0.12250 0.3500 -0.12 -0.31
#> ot3 0.01885 0.1373 -0.47 -0.15 0.37
#> Number of obs: 2688, groups: HearsNativeDialect:ResearchID, 112; ResearchID, 56
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.02139 0.06312 0.339 0.73467
#> ot1 2.14375 0.13436 15.955 < 2e-16 ***
#> ot2 -0.10447 0.07866 -1.328 0.18411
#> ot3 -0.16798 0.05568 -3.017 0.00255 **
#> HearsNativeDialectTRUE 0.01022 0.06212 0.165 0.86927
#> ot1:HearsNativeDialectTRUE 0.22416 0.14578 1.538 0.12414
#> ot2:HearsNativeDialectTRUE -0.03417 0.08965 -0.381 0.70310
#> ot3:HearsNativeDialectTRUE -0.14514 0.07452 -1.948 0.05145 .
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Correlation of Fixed Effects:
#> (Intr) ot1 ot2 ot3 HNDTRU o1:HND o2:HND
#> ot1 0.465
#> ot2 -0.085 -0.102
#> ot3 -0.094 -0.396 0.201
#> HrsNtvDTRUE -0.492 -0.054 0.036 -0.017
#> ot1:HNDTRUE -0.049 -0.540 -0.014 0.336 0.100
#> ot2:HNDTRUE 0.031 -0.013 -0.566 -0.113 -0.060 0.025
#> ot3:HNDTRUE -0.012 0.273 -0.096 -0.665 0.026 -0.500 0.169
Use a model comparison on the four variants.
model | Df | AIC | BIC | logLik | Chisq | Chi Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|
cond_omitted | 14 | 16465 | 16548 | -8219 | NA | NA | NA |
cond_as_fixed_eff | 18 | 16445 | 16551 | -8204 | 28.25 | 4 | 0.0000 |
cond_as_random_eff | 24 | 14322 | 14464 | -7137 | 2134.51 | 6 | 0.0000 |
cond_as_both | 28 | 14326 | 14491 | -7135 | 4.14 | 4 | 0.3868 |
This is kind of unusual. All of the model comparison metrics favor the model that includes Child x Native Dialect random effects but omits the Native Dialect fixed effects. My interpretation:
- For each child, we have two growth curves: The data from hearing their native dialect and the data from hearing the non-native dialect. Their data is nested in these two kinds of blocks.
- This variability is really important for the model, so we include it in the random effects.
- But on average, hearing one's native dialect or not does not influence the growth curve features in a systematic way. So we can ignore using dialect condition as a predictor.
- When we include that native dialect in the model's random effects, we are incorporating information about the nesting of the data. Instead of capturing Child x Native-Dialect effects, it seems like the random effects are capturing Child x Block-of-Testing variability.
First, we remove the few participants without EVT or Maternal Education data.
to_remove <- d_m %>%
filter(is.na(EVT_Standard) | is.na(Maternal_Education_Group)) %>%
select(ResearchID, Dialect, Maternal_Education_Group, EVT_Standard) %>%
distinct()
to_remove
#> # A tibble: 3 x 4
#> ResearchID Dialect Maternal_Education_Group EVT_Standard
#> <chr> <chr> <chr> <int>
#> 1 552M MAE Mid NA
#> 2 459D AAE <NA> 95
#> 3 440D MAE <NA> 103
For these models, we are going to have the predictor variables interact with the intercept term and the linear term, so that the model captures how expressive vocabulary affects overall accuracy and the slope of the growth curve. We also scale EVT so that it has a mean of 0 and a standard deviation of 1, so that a unit change in the EVT predictor represents a 1 standard deviation change in vocabulary size. We also set maternal education so the mid group is the reference group.
d_narrow <- anti_join(d_m, to_remove) %>%
mutate(EVTc = scale(EVT_Standard) %>% as.vector(),
Medu = factor(Maternal_Education_Group, c("Mid", "Low", "High")))
To summarise these measures
d_narrow %>%
distinct(ResearchID, Medu, EVT_Standard, EVTc, Dialect) %>%
group_by(Dialect) %>%
summarise(n = n(),
EVT = mean(EVT_Standard) %>% round(),
EVT_sd = sd(EVT_Standard) %>% round(),
EVT_scale = mean(EVTc) %>% round(2),
EVT_scale_sd = sd(EVTc) %>% round(2),
Min_EVT = min(EVT_Standard),
Max_EVT = max(EVT_Standard)) %>%
ungroup() %>%
knitr::kable()
Dialect | n | EVT | EVT_sd | EVT_scale | EVT_scale_sd | Min_EVT | Max_EVT |
---|---|---|---|---|---|---|---|
AAE | 20 | 96 | 12 | -0.87 | 0.64 | 67 | 121 |
MAE | 33 | 121 | 15 | 0.53 | 0.81 | 92 | 147 |
d_narrow %>%
distinct(ResearchID, Medu, EVT_Standard, EVTc, Dialect) %>%
group_by(Dialect, Medu) %>%
summarise(n = n(),
EVT = mean(EVT_Standard) %>% round(),
EVT_sd = sd(EVT_Standard) %>% round(),
EVT_scale = mean(EVTc) %>% round(2),
EVT_scale_sd = sd(EVTc) %>% round(2),
Min_EVT = min(EVT_Standard),
Max_EVT = max(EVT_Standard)) %>%
ungroup() %>%
knitr::kable()
Dialect | Medu | n | EVT | EVT_sd | EVT_scale | EVT_scale_sd | Min_EVT | Max_EVT |
---|---|---|---|---|---|---|---|---|
AAE | Mid | 2 | 98 | 2 | -0.71 | 0.12 | 97 | 100 |
AAE | Low | 17 | 95 | 13 | -0.88 | 0.70 | 67 | 121 |
AAE | High | 1 | 93 | NaN | -1.01 | NaN | 93 | 93 |
MAE | Mid | 5 | 119 | 14 | 0.42 | 0.78 | 107 | 137 |
MAE | Low | 2 | 109 | 11 | -0.13 | 0.62 | 101 | 117 |
MAE | High | 26 | 122 | 15 | 0.60 | 0.82 | 92 | 147 |
First we model the effect of EVT.
m_evt <- glmer(
cbind(Primary, Others) ~
(1 + ot1) * EVTc + ot2 + ot3 +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_narrow,
control = glmer_controls,
family = binomial)
summary(m_evt)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (1 + ot1) * EVTc + ot2 + ot3 + (ot1 +
#> ot2 + ot3 | ResearchID/HearsNativeDialect)
#> Data: d_narrow
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 13574.2 13726.1 -6761.1 13522.2 2518
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.52685 -0.39297 -0.00563 0.42878 2.90126
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.10123 0.3182
#> ot1 0.57164 0.7561 0.15
#> ot2 0.17790 0.4218 -0.07 -0.04
#> ot3 0.12826 0.3581 0.02 -0.62 0.28
#> ResearchID (Intercept) 0.08611 0.2934
#> ot1 0.29907 0.5469 0.88
#> ot2 0.12784 0.3575 -0.21 -0.28
#> ot3 0.01348 0.1161 -0.08 0.28 0.50
#> Number of obs: 2544, groups: HearsNativeDialect:ResearchID, 106; ResearchID, 53
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.04940 0.05115 0.966 0.33413
#> ot1 2.26364 0.10710 21.136 < 2e-16 ***
#> EVTc 0.14806 0.05223 2.835 0.00459 **
#> ot2 -0.09941 0.06709 -1.482 0.13845
#> ot3 -0.25403 0.04316 -5.886 3.96e-09 ***
#> ot1:EVTc 0.29493 0.10581 2.787 0.00531 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Correlation of Fixed Effects:
#> (Intr) ot1 EVTc ot2 ot3
#> ot1 0.548
#> EVTc 0.000 0.002
#> ot2 -0.143 -0.162 0.005
#> ot3 -0.016 -0.264 0.000 0.277
#> ot1:EVTc 0.001 0.003 0.564 0.006 0.009
This model tells us that increasing expressive vocabulary by 1 SD improves overall accuracy and processing efficiency. This is entirely as expected.
Now we include the child's native dialect as a predictor of accuracy and processing efficieny. We also allow dialect to interact with vocabulary size. Because the two groups of children differ in the vocabulary levels, we expect one of the features to be redundant.
m_dialect <- glmer(
cbind(Primary, Others) ~
(1 + ot1) * Dialect + ot2 + ot3 +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_narrow,
control = glmer_controls,
family = binomial)
summary(m_dialect)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (1 + ot1) * Dialect + ot2 + ot3 + (ot1 +
#> ot2 + ot3 | ResearchID/HearsNativeDialect)
#> Data: d_narrow
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 13571.0 13722.8 -6759.5 13519.0 2518
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.51967 -0.39568 -0.00677 0.42617 2.91237
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.10109 0.3179
#> ot1 0.57229 0.7565 0.15
#> ot2 0.17731 0.4211 -0.07 -0.05
#> ot3 0.12843 0.3584 0.01 -0.62 0.28
#> ResearchID (Intercept) 0.08621 0.2936
#> ot1 0.27777 0.5270 0.88
#> ot2 0.12831 0.3582 -0.20 -0.28
#> ot3 0.01295 0.1138 -0.36 0.01 0.51
#> Number of obs: 2544, groups: HearsNativeDialect:ResearchID, 106; ResearchID, 53
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -0.13647 0.08268 -1.651 0.098838 .
#> ot1 1.80707 0.16424 11.003 < 2e-16 ***
#> DialectMAE 0.29843 0.10438 2.859 0.004249 **
#> ot2 -0.10022 0.06712 -1.493 0.135398
#> ot3 -0.25430 0.04306 -5.906 3.51e-09 ***
#> ot1:DialectMAE 0.73277 0.20313 3.607 0.000309 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Correlation of Fixed Effects:
#> (Intr) ot1 DlcMAE ot2 ot3
#> ot1 0.542
#> DialectMAE -0.786 -0.417
#> ot2 -0.089 -0.106 0.002
#> ot3 -0.062 -0.225 0.001 0.277
#> ot1:DlctMAE -0.426 -0.768 0.542 0.004 0.006
m_dialect_evt <- glmer(
cbind(Primary, Others) ~
(1 + ot1) * Dialect + (1 + ot1) * EVTc + ot2 + ot3 +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_narrow,
control = glmer_controls,
family = binomial)
summary(m_dialect_evt)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (1 + ot1) * Dialect + (1 + ot1) * EVTc +
#> ot2 + ot3 + (ot1 + ot2 + ot3 | ResearchID/HearsNativeDialect)
#> Data: d_narrow
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 13573.5 13737.1 -6758.8 13517.5 2516
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.52309 -0.39604 -0.00578 0.42908 2.90916
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.10135 0.3184
#> ot1 0.57585 0.7588 0.14
#> ot2 0.17743 0.4212 -0.07 -0.05
#> ot3 0.12966 0.3601 0.02 -0.63 0.28
#> ResearchID (Intercept) 0.08182 0.2860
#> ot1 0.26313 0.5130 0.88
#> ot2 0.12843 0.3584 -0.20 -0.29
#> ot3 0.01190 0.1091 -0.22 0.14 0.53
#> Number of obs: 2544, groups: HearsNativeDialect:ResearchID, 106; ResearchID, 53
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -0.06313 0.10171 -0.621 0.535
#> ot1 1.87620 0.20259 9.261 < 2e-16 ***
#> DialectMAE 0.18062 0.14198 1.272 0.203
#> EVTc 0.08552 0.07098 1.205 0.228
#> ot2 -0.09986 0.06714 -1.487 0.137
#> ot3 -0.25440 0.04297 -5.921 3.2e-09 ***
#> ot1:DialectMAE 0.62196 0.27944 2.226 0.026 *
#> ot1:EVTc 0.08061 0.13953 0.578 0.563
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Correlation of Fixed Effects:
#> (Intr) ot1 DlcMAE EVTc ot2 ot3 o1:DMA
#> ot1 0.538
#> DialectMAE -0.869 -0.465
#> EVTc 0.598 0.323 -0.689
#> ot2 -0.069 -0.082 -0.001 0.004
#> ot3 -0.026 -0.165 0.001 -0.001 0.278
#> ot1:DlctMAE -0.470 -0.858 0.541 -0.375 -0.001 0.000
#> ot1:EVTc 0.327 0.593 -0.375 0.546 0.006 0.006 -0.689
m_dialect_x_evt <- glmer(
cbind(Primary, Others) ~
(1 + ot1) * Dialect * EVTc + ot2 + ot3 +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_narrow,
control = glmer_controls,
family = binomial)
summary(m_dialect_x_evt)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (1 + ot1) * Dialect * EVTc + ot2 + ot3 +
#> (ot1 + ot2 + ot3 | ResearchID/HearsNativeDialect)
#> Data: d_narrow
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 13569.2 13744.4 -6754.6 13509.2 2514
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.52786 -0.39620 -0.00382 0.43208 2.90854
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.10124 0.3182
#> ot1 0.56579 0.7522 0.15
#> ot2 0.17744 0.4212 -0.07 -0.05
#> ot3 0.12844 0.3584 0.01 -0.62 0.28
#> ResearchID (Intercept) 0.07589 0.2755
#> ot1 0.18630 0.4316 0.89
#> ot2 0.12844 0.3584 -0.18 -0.26
#> ot3 0.01328 0.1152 -0.15 0.20 0.50
#> Number of obs: 2544, groups: HearsNativeDialect:ResearchID, 106; ResearchID, 53
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.07725 0.13750 0.562 0.57423
#> ot1 2.40025 0.25700 9.340 < 2e-16 ***
#> DialectMAE 0.07168 0.15735 0.456 0.64873
#> EVTc 0.24776 0.12976 1.909 0.05622 .
#> ot2 -0.10004 0.06715 -1.490 0.13625
#> ot3 -0.25454 0.04313 -5.901 3.61e-09 ***
#> ot1:DialectMAE 0.21686 0.29224 0.742 0.45806
#> ot1:EVTc 0.68541 0.24021 2.853 0.00433 **
#> DialectMAE:EVTc -0.22158 0.14953 -1.482 0.13837
#> ot1:DialectMAE:EVTc -0.83039 0.27726 -2.995 0.00274 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Correlation of Fixed Effects:
#> (Intr) ot1 DlcMAE EVTc ot2 ot3 ot1:DMAE o1:EVT DMAE:E
#> ot1 0.508
#> DialectMAE -0.879 -0.446
#> EVTc 0.814 0.414 -0.721
#> ot2 -0.042 -0.049 -0.002 0.004
#> ot3 -0.012 -0.120 0.000 0.001 0.278
#> ot1:DlctMAE -0.448 -0.872 0.510 -0.368 -0.002 -0.002
#> ot1:EVTc 0.417 0.809 -0.370 0.512 0.005 0.007 -0.720
#> DlctMAE:EVT -0.689 -0.349 0.467 -0.844 -0.002 -0.002 0.238 -0.431
#> o1:DMAE:EVT -0.352 -0.684 0.239 -0.430 -0.002 -0.004 0.465 -0.843 0.509
It's hard to tell what is going on without plotting. First, look at the dialect only effects.
This final plot would indicate that the MAE group performs very similarly, so that vocabulary size does not adequately differentiate these children, whereas a 1 SD change in the AAE group predicts a large improvement in word recognition. However, we need to take these predictions with a grain of salt because the -1 z-EVT MAE speakers and the +1 z-EVT AAE speakers are not represented in the data.
Model comparison indicates the dialect is a better child level predictor than expressive vocabulary, and that including simple effects of dialect and expressive vocabulary is redundant (which we expected).
model | Df | AIC | BIC | logLik | Chisq | Chi Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|
m_evt | 26 | 13574 | 13726 | -6761 | NA | NA | NA |
m_dialect | 26 | 13571 | 13723 | -6759 | 3.29 | 0 | 0.0000 |
m_dialect_evt | 28 | 13574 | 13737 | -6759 | 1.44 | 2 | 0.4868 |
m_dialect_x_evt | 30 | 13569 | 13744 | -6755 | 8.32 | 2 | 0.0156 |
Two of the three model comparison metrics, however, prefer the model with the interaction effect.
model | Df | AIC | BIC | logLik | Chisq | Chi Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|
m_evt | 26 | 13574 | 13726 | -6761 | NA | NA | NA |
m_dialect | 26 | 13571 | 13723 | -6759 | 3.29 | 0 | 0.0000 |
m_dialect_x_evt | 30 | 13569 | 13744 | -6755 | 9.76 | 4 | 0.0447 |
Now we include maternal education as a predictor and compare it against dialect.
m_medu <- glmer(
cbind(Primary, Others) ~
(1 + ot1) * Medu + ot2 + ot3 +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_narrow,
control = glmer_controls,
family = binomial)
summary(m_medu)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (1 + ot1) * Medu + ot2 + ot3 + (ot1 +
#> ot2 + ot3 | ResearchID/HearsNativeDialect)
#> Data: d_narrow
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 13567.0 13730.5 -6755.5 13511.0 2516
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.51366 -0.39537 -0.00772 0.42647 2.91558
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.09981 0.3159
#> ot1 0.56136 0.7492 0.17
#> ot2 0.17716 0.4209 -0.08 -0.04
#> ot3 0.12464 0.3530 -0.01 -0.62 0.28
#> ResearchID (Intercept) 0.07322 0.2706
#> ot1 0.29984 0.5476 0.91
#> ot2 0.12777 0.3574 -0.25 -0.29
#> ot3 0.01651 0.1285 -0.54 -0.20 0.46
#> Number of obs: 2544, groups: HearsNativeDialect:ResearchID, 106; ResearchID, 53
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.26468 0.13217 2.003 0.04522 *
#> ot1 2.14375 0.27533 7.786 6.91e-15 ***
#> MeduLow -0.47335 0.15453 -3.063 0.00219 **
#> MeduHigh -0.08940 0.14773 -0.605 0.54509
#> ot2 -0.10024 0.06703 -1.496 0.13478
#> ot3 -0.25430 0.04342 -5.857 4.72e-09 ***
#> ot1:MeduLow -0.31606 0.31949 -0.989 0.32255
#> ot1:MeduHigh 0.45656 0.30544 1.495 0.13498
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Correlation of Fixed Effects:
#> (Intr) ot1 MeduLw MedHgh ot2 ot3 ot1:ML
#> ot1 0.553
#> MeduLow -0.855 -0.467
#> MeduHigh -0.890 -0.486 0.766
#> ot2 -0.064 -0.065 0.002 0.003
#> ot3 -0.064 -0.154 0.000 0.002 0.276
#> ot1:MeduLow -0.471 -0.847 0.549 0.422 -0.002 0.005
#> ot1:MeduHgh -0.490 -0.882 0.422 0.550 0.002 0.008 0.765
m_medu_dialect <- glmer(
cbind(Primary, Others) ~
(1 + ot1) * Medu + (1 + ot1) * Dialect +
ot2 + ot3 +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_narrow,
control = glmer_controls,
family = binomial)
summary(m_medu_dialect)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (1 + ot1) * Medu + (1 + ot1) * Dialect +
#> ot2 + ot3 + (ot1 + ot2 + ot3 | ResearchID/HearsNativeDialect)
#> Data: d_narrow
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 13568.3 13743.5 -6754.1 13508.3 2514
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.51507 -0.39294 -0.00857 0.42891 2.91172
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.09950 0.3154
#> ot1 0.55296 0.7436 0.18
#> ot2 0.17730 0.4211 -0.08 -0.05
#> ot3 0.12270 0.3503 -0.02 -0.61 0.29
#> ResearchID (Intercept) 0.07370 0.2715
#> ot1 0.28375 0.5327 0.92
#> ot2 0.12882 0.3589 -0.25 -0.29
#> ot3 0.01833 0.1354 -0.49 -0.16 0.43
#> Number of obs: 2544, groups: HearsNativeDialect:ResearchID, 106; ResearchID, 53
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.28972 0.18236 1.589 0.11213
#> ot1 1.82183 0.37414 4.869 1.12e-06 ***
#> MeduLow -0.49502 0.18893 -2.620 0.00879 **
#> MeduHigh -0.08060 0.15316 -0.526 0.59872
#> DialectMAE -0.03513 0.17152 -0.205 0.83771
#> ot2 -0.10066 0.06719 -1.498 0.13407
#> ot3 -0.25393 0.04361 -5.823 5.77e-09 ***
#> ot1:MeduLow -0.03697 0.38655 -0.096 0.92381
#> ot1:MeduHigh 0.35221 0.31285 1.126 0.26024
#> ot1:DialectMAE 0.44263 0.35169 1.259 0.20818
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Correlation of Fixed Effects:
#> (Intr) ot1 MeduLw MedHgh DlcMAE ot2 ot3 ot1:ML ot1:MH
#> ot1 0.555
#> MeduLow -0.903 -0.498
#> MeduHigh -0.446 -0.248 0.458
#> DialectMAE -0.688 -0.377 0.574 -0.258
#> ot2 -0.049 -0.045 0.003 0.002 0.003
#> ot3 -0.047 -0.113 0.001 0.001 0.001 0.274
#> ot1:MeduLow -0.499 -0.900 0.552 0.256 0.315 -0.003 0.008
#> ot1:MeduHgh -0.250 -0.441 0.256 0.554 -0.140 0.003 0.005 0.453
#> ot1:DlctMAE -0.377 -0.687 0.314 -0.140 0.547 -0.002 0.008 0.576 -0.260
m_medu_x_dialect <- glmer(
cbind(Primary, Others) ~
(1 + ot1) * Medu * Dialect +
ot2 + ot3 +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_narrow,
control = glmer_controls,
family = binomial)
summary(m_medu_x_dialect)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (1 + ot1) * Medu * Dialect + ot2 + ot3 +
#> (ot1 + ot2 + ot3 | ResearchID/HearsNativeDialect)
#> Data: d_narrow
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 13571.9 13770.5 -6752.0 13503.9 2510
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.51615 -0.39173 -0.00872 0.42899 2.91104
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.09919 0.3149
#> ot1 0.54772 0.7401 0.19
#> ot2 0.17726 0.4210 -0.07 -0.05
#> ot3 0.12106 0.3479 -0.03 -0.61 0.29
#> ResearchID (Intercept) 0.06802 0.2608
#> ot1 0.26435 0.5142 0.92
#> ot2 0.12968 0.3601 -0.22 -0.25
#> ot3 0.01972 0.1404 -0.47 -0.13 0.41
#> Number of obs: 2544, groups: HearsNativeDialect:ResearchID, 106; ResearchID, 53
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.03069 0.24244 0.127 0.89925
#> ot1 1.59104 0.49721 3.200 0.00137 **
#> MeduLow -0.21931 0.25571 -0.858 0.39108
#> MeduHigh 0.41059 0.42043 0.977 0.32877
#> DialectMAE 0.32656 0.28491 1.146 0.25172
#> ot2 -0.10096 0.06731 -1.500 0.13360
#> ot3 -0.25361 0.04373 -5.800 6.65e-09 ***
#> ot1:MeduLow 0.17091 0.52353 0.326 0.74407
#> ot1:MeduHigh 1.41934 0.86681 1.637 0.10154
#> ot1:DialectMAE 0.75937 0.58370 1.301 0.19327
#> MeduLow:DialectMAE -0.52497 0.38158 -1.376 0.16889
#> MeduHigh:DialectMAE -0.60248 0.45159 -1.334 0.18216
#> ot1:MeduLow:DialectMAE -0.07245 0.78713 -0.092 0.92667
#> ot1:MeduHigh:DialectMAE -1.18460 0.93069 -1.273 0.20308
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model comparison favors the model without the dialect by maternal education effect.
model | Df | AIC | BIC | logLik | Chisq | Chi Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|
m_dialect | 26 | 13571 | 13723 | -6759 | NA | NA | NA |
m_medu | 28 | 13567 | 13731 | -6755 | 8.01 | 2 | 0.0183 |
m_medu_dialect | 30 | 13568 | 13743 | -6754 | 2.70 | 2 | 0.2591 |
m_medu_x_dialect | 34 | 13572 | 13771 | -6752 | 4.32 | 4 | 0.3639 |
model | Df | AIC | BIC | logLik | Chisq | Chi Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|
m_dialect | 26 | 13571 | 13723 | -6759 | NA | NA | NA |
m_medu | 28 | 13567 | 13731 | -6755 | 8.01 | 2 | 0.0183 |
m_medu_x_dialect | 34 | 13572 | 13771 | -6752 | 7.03 | 6 | 0.3185 |
As before, the models are best understood through plotting.
Some implications from these models:
- There is no difference between mid and high maternal education.
- The dialect effect disappears when we account for maternal education.
Finally, we pit expressive vocabulary against maternal education.
m_medu_evt <- glmer(
cbind(Primary, Others) ~
(1 + ot1) * Medu + (1 + ot1) * EVTc +
ot2 + ot3 +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_narrow,
control = glmer_controls,
family = binomial)
summary(m_medu_evt)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (1 + ot1) * Medu + (1 + ot1) * EVTc +
#> ot2 + ot3 + (ot1 + ot2 + ot3 | ResearchID/HearsNativeDialect)
#> Data: d_narrow
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 13570.3 13745.5 -6755.1 13510.3 2514
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.51713 -0.39302 -0.00913 0.42774 2.91014
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.09993 0.3161
#> ot1 0.55921 0.7478 0.17
#> ot2 0.17747 0.4213 -0.08 -0.04
#> ot3 0.12460 0.3530 -0.01 -0.61 0.28
#> ResearchID (Intercept) 0.07101 0.2665
#> ot1 0.27978 0.5289 0.91
#> ot2 0.12783 0.3575 -0.25 -0.29
#> ot3 0.01672 0.1293 -0.43 -0.07 0.46
#> Number of obs: 2544, groups: HearsNativeDialect:ResearchID, 106; ResearchID, 53
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.25566 0.13220 1.934 0.0531 .
#> ot1 2.12197 0.27486 7.720 1.16e-14 ***
#> MeduLow -0.42883 0.16768 -2.557 0.0105 *
#> MeduHigh -0.10301 0.14863 -0.693 0.4883
#> EVTc 0.04238 0.06413 0.661 0.5087
#> ot2 -0.10002 0.06706 -1.491 0.1358
#> ot3 -0.25406 0.04346 -5.845 5.06e-09 ***
#> ot1:MeduLow -0.20659 0.34700 -0.595 0.5516
#> ot1:MeduHigh 0.42266 0.30714 1.376 0.1688
#> ot1:EVTc 0.10431 0.13296 0.785 0.4327
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Correlation of Fixed Effects:
#> (Intr) ot1 MeduLw MedHgh EVTc ot2 ot3 ot1:ML ot1:MH
#> ot1 0.549
#> MeduLow -0.821 -0.448
#> MeduHigh -0.863 -0.468 0.639
#> EVTc -0.099 -0.059 0.397 -0.142
#> ot2 -0.063 -0.064 0.004 0.002 0.006
#> ot3 -0.051 -0.140 0.000 0.001 0.000 0.276
#> ot1:MeduLow -0.450 -0.814 0.550 0.349 0.226 0.000 0.009
#> ot1:MeduHgh -0.471 -0.856 0.349 0.544 -0.075 0.002 0.006 0.637
#> ot1:EVTc -0.060 -0.099 0.226 -0.075 0.556 0.004 0.010 0.400 -0.142
m_medu_x_evt <- glmer(
cbind(Primary, Others) ~
(1 + ot1) * Medu * EVTc +
ot2 + ot3 +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_narrow,
control = glmer_controls,
family = binomial)
summary(m_medu_x_evt)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (1 + ot1) * Medu * EVTc + ot2 + ot3 +
#> (ot1 + ot2 + ot3 | ResearchID/HearsNativeDialect)
#> Data: d_narrow
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 13570.7 13769.3 -6751.4 13502.7 2510
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.52185 -0.39298 -0.00741 0.43114 2.90626
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.09980 0.3159
#> ot1 0.54725 0.7398 0.18
#> ot2 0.17734 0.4211 -0.08 -0.05
#> ot3 0.12345 0.3514 -0.01 -0.61 0.28
#> ResearchID (Intercept) 0.06596 0.2568
#> ot1 0.20877 0.4569 0.91
#> ot2 0.12766 0.3573 -0.25 -0.30
#> ot3 0.01790 0.1338 -0.36 0.00 0.44
#> Number of obs: 2544, groups: HearsNativeDialect:ResearchID, 106; ResearchID, 53
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) 0.24926 0.13046 1.911 0.0560 .
#> ot1 2.11059 0.25777 8.188 2.66e-16 ***
#> MeduLow -0.31849 0.18242 -1.746 0.0808 .
#> MeduHigh -0.06288 0.15025 -0.419 0.6756
#> EVTc 0.07504 0.16415 0.457 0.6476
#> ot2 -0.10015 0.06703 -1.494 0.1351
#> ot3 -0.25420 0.04360 -5.831 5.52e-09 ***
#> ot1:MeduLow 0.22273 0.35864 0.621 0.5346
#> ot1:MeduHigh 0.56030 0.29527 1.898 0.0577 .
#> ot1:EVTc 0.12159 0.32229 0.377 0.7060
#> MeduLow:EVTc 0.09587 0.19778 0.485 0.6279
#> MeduHigh:EVTc -0.09522 0.18011 -0.529 0.5970
#> ot1:MeduLow:EVTc 0.50160 0.38816 1.292 0.1963
#> ot1:MeduHigh:EVTc -0.25203 0.35392 -0.712 0.4764
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
None of the effects of EVT appear to model from the model summary. Model comparison provides no support for the model with EVT and maternal education. Only one of the three model comparison metrics support the model with EVT and maternal education.
model | Df | AIC | BIC | logLik | Chisq | Chi Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|
m_medu | 28 | 13567 | 13731 | -6755 | NA | NA | NA |
m_medu_evt | 30 | 13570 | 13746 | -6755 | 0.68 | 2 | 0.7102 |
m_medu_x_evt | 34 | 13571 | 13769 | -6751 | 7.53 | 4 | 0.1104 |
model | Df | AIC | BIC | logLik | Chisq | Chi Df | Pr(>Chisq) |
---|---|---|---|---|---|---|---|
m_dialect | 26 | 13571 | 13723 | -6759 | NA | NA | NA |
m_medu_x_evt | 34 | 13571 | 13769 | -6751 | 16.22 | 8 | 0.0394 |
It might be worth "refitting" the last model with low group as the reference condition to see what happens. (The results or predictions don't change... just the way the values are reported changed.) This model shows that the expressive vocabulary x Time effect is significant in the low maternal education group.
d_narrow2 <- d_narrow %>%
mutate(Medu = factor(Medu, c("Low", "Mid", "High")))
m_medu_x_evt2 <- glmer(
cbind(Primary, Others) ~
(1 + ot1) * Medu * EVTc +
ot2 + ot3 +
(ot1 + ot2 + ot3 | ResearchID / HearsNativeDialect),
data = d_narrow2,
control = glmer_controls,
family = binomial)
summary(m_medu_x_evt2)
#> Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#> Family: binomial ( logit )
#> Formula: cbind(Primary, Others) ~ (1 + ot1) * Medu * EVTc + ot2 + ot3 +
#> (ot1 + ot2 + ot3 | ResearchID/HearsNativeDialect)
#> Data: d_narrow2
#> Control: glmer_controls
#>
#> AIC BIC logLik deviance df.resid
#> 13570.7 13769.3 -6751.4 13502.7 2510
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.52185 -0.39298 -0.00741 0.43114 2.90626
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> HearsNativeDialect:ResearchID (Intercept) 0.09980 0.3159
#> ot1 0.54725 0.7398 0.18
#> ot2 0.17734 0.4211 -0.08 -0.05
#> ot3 0.12345 0.3514 -0.01 -0.61 0.28
#> ResearchID (Intercept) 0.06596 0.2568
#> ot1 0.20877 0.4569 0.91
#> ot2 0.12766 0.3573 -0.25 -0.31
#> ot3 0.01790 0.1338 -0.36 0.00 0.44
#> Number of obs: 2544, groups: HearsNativeDialect:ResearchID, 106; ResearchID, 53
#>
#> Fixed effects:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -0.06923 0.12257 -0.565 0.57220
#> ot1 2.33333 0.24332 9.590 < 2e-16 ***
#> MeduMid 0.31849 0.18243 1.746 0.08084 .
#> MeduHigh 0.25561 0.14680 1.741 0.08164 .
#> EVTc 0.17091 0.11591 1.475 0.14034
#> ot2 -0.10015 0.06703 -1.494 0.13512
#> ot3 -0.25420 0.04360 -5.831 5.53e-09 ***
#> ot1:MeduMid -0.22273 0.35867 -0.621 0.53460
#> ot1:MeduHigh 0.33756 0.28939 1.166 0.24343
#> ot1:EVTc 0.62321 0.22736 2.741 0.00612 **
#> MeduMid:EVTc -0.09587 0.19780 -0.485 0.62790
#> MeduHigh:EVTc -0.19109 0.13582 -1.407 0.15945
#> ot1:MeduMid:EVTc -0.50163 0.38821 -1.292 0.19631
#> ot1:MeduHigh:EVTc -0.75365 0.26663 -2.827 0.00470 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Again, we plot the model fits.
predictions <- new_data_template %>%
tidyr::crossing(EVTc = -1:1) %>%
mutate(fitted = predict(m_medu_evt, newdata = ., re.form = ~ 0, type = "resp"))
ggplot(predictions) +
aes(x = Time, y = fitted, color = MeduPlot, linetype = factor(EVTc)) +
hline_chance +
geom_line(size = 1) +
viridis::scale_color_viridis(discrete = TRUE, end = .7, option = "C") +
theme_ipsum_rc(axis_title_size = 11, plot_title_size = 13) +
labs(
x = plot_text$x_time,
linetype = "EVT (z-score)",
color = "Maternal Ed.",
y = "Predicted proportion of looks",
caption = "Conditioned on maternal ed. and vocabulary")
predictions <- new_data_template %>%
tidyr::crossing(EVTc = -1:1) %>%
mutate(fitted = predict(m_medu_x_evt, newdata = .,
re.form = ~ 0, type = "resp"))
ggplot(predictions) +
aes(x = Time, y = fitted, color = MeduPlot, linetype = factor(EVTc)) +
hline_chance +
geom_line(size = 1) +
viridis::scale_color_viridis(discrete = TRUE, end = .7, option = "C") +
theme_ipsum_rc(axis_title_size = 11, plot_title_size = 13) +
labs(
x = plot_text$x_time,
linetype = "EVT (z-score)",
color = "Maternal Ed.",
y = "Predicted proportion of looks",
caption = "Conditioned on maternal ed. x vocabulary")
Here's what the interaction between maternal education and expressive vocabulary is capturing: The blue lines (the high maternal education group) are all complete on top of each other. There appears to be no differentiating effects of vocabulary in this group. But for the low maternal education group, expressive vocabulary does make a difference.
- No effect of experiment condition.
- Our dialect groups differed in maternal education and vocabulary size too. Each of these three measures predicted growth curve features but the best single predictor among them is maternal education.
- Vocabulary size might matter more the low maternal education group but the evidence is not overwhelming.
sessioninfo::session_info()
#> - Session info -----------------------------------------------------------------------------------
#> setting value
#> version R version 3.4.1 (2017-06-30)
#> os Windows 7 x64 SP 1
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate English_United States.1252
#> tz America/Chicago
#> date 2017-11-07
#>
#> - Packages ---------------------------------------------------------------------------------------
#> package * version date source
#> assertthat 0.2.0 2017-04-11 CRAN (R 3.3.2)
#> backports 1.1.1 2017-09-25 CRAN (R 3.4.1)
#> bindr 0.1 2016-11-13 CRAN (R 3.4.0)
#> bindrcpp * 0.2 2017-06-17 CRAN (R 3.4.0)
#> clisymbols 1.2.0 2017-08-04 Github (gaborcsardi/clisymbols@e49b4f5)
#> codetools 0.2-15 2016-10-05 CRAN (R 3.4.1)
#> colorspace 1.3-2 2016-12-14 CRAN (R 3.3.2)
#> digest 0.6.12 2017-01-27 CRAN (R 3.3.2)
#> dplyr * 0.7.4 2017-09-28 CRAN (R 3.4.2)
#> evaluate 0.10.1 2017-06-24 CRAN (R 3.4.1)
#> extrafont 0.17 2014-12-08 CRAN (R 3.1.3)
#> extrafontdb 1.0 2012-06-11 CRAN (R 3.1.3)
#> ggplot2 * 2.2.1 2016-12-30 CRAN (R 3.4.1)
#> glue 1.2.0 2017-10-29 CRAN (R 3.4.2)
#> gridExtra 2.3 2017-09-09 CRAN (R 3.4.1)
#> gtable 0.2.0 2016-02-26 CRAN (R 3.2.3)
#> highr 0.6 2016-05-09 CRAN (R 3.2.3)
#> hms 0.3 2016-11-22 CRAN (R 3.3.2)
#> hrbrthemes * 0.2.0 2017-03-20 Github (hrbrmstr/hrbrthemes@11a5c1d)
#> htmltools 0.3.6 2017-04-28 CRAN (R 3.4.0)
#> knitr * 1.17 2017-08-10 CRAN (R 3.4.2)
#> labeling 0.3 2014-08-23 CRAN (R 3.1.1)
#> lattice 0.20-35 2017-03-25 CRAN (R 3.3.3)
#> lazyeval 0.2.1 2017-10-29 CRAN (R 3.4.2)
#> lme4 * 1.1-14 2017-09-27 CRAN (R 3.4.2)
#> magrittr 1.5 2014-11-22 CRAN (R 3.1.2)
#> MASS 7.3-47 2017-02-26 CRAN (R 3.4.1)
#> Matrix * 1.2-11 2017-08-16 CRAN (R 3.4.2)
#> minqa 1.2.4 2014-10-09 CRAN (R 3.1.1)
#> munsell 0.4.3 2016-02-13 CRAN (R 3.2.3)
#> nlme 3.1-131 2017-02-06 CRAN (R 3.4.1)
#> nloptr 1.0.4 2014-08-04 CRAN (R 3.1.1)
#> pkgconfig 2.0.1 2017-03-21 CRAN (R 3.3.3)
#> plyr 1.8.4 2016-06-08 CRAN (R 3.3.0)
#> polypoly 0.0.2 2017-05-30 Github (tjmahr/polypoly@4370bb5)
#> purrr 0.2.4 2017-10-18 CRAN (R 3.4.2)
#> R6 2.2.2 2017-06-17 CRAN (R 3.4.0)
#> Rcpp 0.12.13 2017-09-28 CRAN (R 3.4.2)
#> readr 1.1.1 2017-05-16 CRAN (R 3.4.0)
#> rlang 0.1.4 2017-11-05 CRAN (R 3.4.2)
#> rmarkdown 1.6 2017-06-15 CRAN (R 3.4.2)
#> rprojroot 1.2 2017-01-16 CRAN (R 3.3.2)
#> Rttf2pt1 1.3.4 2016-05-19 CRAN (R 3.2.5)
#> scales 0.5.0 2017-08-24 CRAN (R 3.4.1)
#> sessioninfo 1.0.1 2017-09-13 Github (r-lib/sessioninfo@e813de4)
#> stringi 1.1.5 2017-04-07 CRAN (R 3.3.3)
#> stringr 1.2.0 2017-02-18 CRAN (R 3.3.2)
#> tibble 1.3.4 2017-08-22 CRAN (R 3.4.1)
#> tidyr 0.7.2 2017-10-16 CRAN (R 3.4.2)
#> tidyselect 0.2.3 2017-11-06 CRAN (R 3.4.2)
#> viridis 0.4.0 2017-03-27 CRAN (R 3.3.3)
#> viridisLite 0.2.0 2017-03-24 CRAN (R 3.3.2)
#> withr 2.1.0.9000 2017-11-02 Github (jimhester/withr@8ba5e46)
#> yaml 2.1.14 2016-11-12 CRAN (R 3.4.2)