Commit cea0fac

0003
1 parent ebd6ffd commit cea0fac

21 files changed (+192 −110 lines)

%

Whitespace-only changes.

0003_ggplot_modelfitting.Rmd (+41 −26)

@@ -43,8 +43,8 @@ Get source code for this RMarkdown script [here](https://github.com/hauselin/rtu
Use `library()` to load packages at the top of each R script.

```{r loading packages, results="hide", message=FALSE, warning=FALSE}
-library(tidyverse); library(data.table); library(broom); library(dtplyr);
-library(lme4); library(lmerTest); library(ggbeeswarm); library(cowplot)
+library(tidyverse); library(data.table)
+library(lme4); library(lmerTest); library(ggbeeswarm)
library(hausekeep)
```

@@ -118,9 +118,9 @@ plot1 # print plot

## Save a plot to your directory

-Save to Figures directory, assuming this directory/folder already exists. You can also change the width/height of your figure and dpi (resolution/quality) of your figure (journals usually expect around 300 dpi).
+Save to the Figures directory, assuming this directory/folder already exists. You can also change the width/height and dpi (resolution/quality) of your figure (journals often expect around 300 dpi).

-```{r, eval=F}
+```{r, eval=FALSE}
ggsave('./Figures/iq_grades.png', plot1, width = 10, height = 10, dpi = 100)
```

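An editorial aside, not part of this commit: if the Figures folder might not exist yet, it can be created before saving. A minimal sketch, assuming the working directory is the project root and `plot1` is the plot created above:

```r
# Assumption: working directory is the project root; plot1 comes from the tutorial
if (!dir.exists("./Figures")) dir.create("./Figures") # create the folder only if missing
ggsave("./Figures/iq_grades.png", plot1, width = 10, height = 10, dpi = 300) # filename first, then the plot
```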
@@ -305,7 +305,7 @@ ggplot(df1, aes(overallClass, iq)) + # y: iq
```{r}
ggplot(df1, aes(class, iq)) + # y: iq
geom_quasirandom(alpha = 0.3) +
-stat_summary(fun.y = mean, geom = 'point', size = 3) + # apply mean function to y axis (fun.y = mean)
+stat_summary(fun = mean, geom = 'point', size = 3) + # apply mean function (fun = mean) (median or other functions work too)
stat_summary(fun.data = mean_cl_normal, geom = 'errorbar', width = 0, size = 1) # apply mean_cl_normal function to data
```

@@ -428,9 +428,9 @@ ggplot(df1, aes(iq, grades)) +
Test the relationship in the plot above

```{r}
-modelLinear <- lm(formula = iq ~ grades, data = df1)
-summary(modelLinear) # get model results and p values
-summaryh(modelLinear) # generates APA-formatted results
+model_linear <- lm(formula = iq ~ grades, data = df1)
+summary(model_linear) # get model results and p values
+summaryh(model_linear) # generates APA-formatted results (requires hausekeep package)
```

Note the significant negative relationship between iq and grades.
@@ -450,14 +450,15 @@ ggplot(df1, aes(iq, grades, col = class)) +
Test the relationship above by "controlling" for class

```{r}
-modelLinear_class <- lm(iq ~ grades + class, data = df1)
-summary(modelLinear_class) # get model results and p values
-summaryh(modelLinear_class)
+model_linear_class <- lm(iq ~ grades + class, data = df1)
+summary(model_linear_class) # get model results and p values
+summaryh(model_linear_class)
```

Note the significantly positive relationship between iq and grades now.

### Reference groups and releveling (changing reference group)
+
R automatically recodes categorical/factor variables into 0s and 1s (i.e., dummy-coding). Levels that come first alphabetically or numerically ("a" comes before "b") are coded 0, and those that follow are coded 1.

In our case, class "a" has been coded 0 (reference group) and all other classes ("b", "c", "d") are contrasted against it, hence you have 3 other effects ("classb", "classc", "classd") that reflect the difference between class "a" and each of the other classes.
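As a hedged aside to the releveling section above (not part of this commit), a minimal sketch of changing the reference group with base R's `relevel()`, assuming `df1$class` is, or can be coerced to, a factor:

```r
# Assumption: df1 and its class column come from earlier in this tutorial
df1$class <- relevel(factor(df1$class), ref = "b") # make class "b" the reference group
summaryh(lm(iq ~ grades + class, data = df1)) # coefficients now contrast "a", "c", "d" against "b"
df1$class <- relevel(df1$class, ref = "a") # restore "a" as the reference group
```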
@@ -478,30 +479,35 @@ summaryh(lm(iq ~ grades + class, data = df1)) # quickly fit model and look at ou
* concise expression: y ~ x1 * x2 (includes all main effects and interaction)

```{r}
-modelLinear_interact <- lm(iq ~ grades + class + grades:class, data = df1)
-summary(modelLinear_interact)
-summaryh(modelLinear_interact)
+model_linear_interact <- lm(iq ~ grades + class + grades:class, data = df1)
+summary(model_linear_interact)
+summaryh(model_linear_interact)
```

#### Intercept-only model

R uses `1` to refer to the intercept

```{r}
-modelLinear_intercept <- lm(iq ~ 1, data = df1) # mean iq
-# summaryh(modelLinear_intercept)
+model_linear_intercept <- lm(iq ~ 1, data = df1) # mean iq
+coef(model_linear_intercept) # get coefficients from model
+# summaryh(model_linear_intercept)
df1[, mean(iq)] # matches the intercept term
mean(df1$iq) # same as above
```

-Remove intercept from model (if you ever need to do so...) by specifying `-1`
+Remove the intercept from a model (if you ever need to do so...) by specifying `-1`. Another way is to specify `0` in the formula.

```{r}
-modelLinear_noIntercept <- lm(iq ~ grades - 1, data = df1) # subtract intercept
-summary(modelLinear_noIntercept)
-# summaryh(modelLinear_noIntercept)
+model_linear_noIntercept <- lm(iq ~ grades - 1, data = df1) # subtract intercept
+summary(model_linear_noIntercept)
+# summaryh(model_linear_noIntercept)
+
+coef(lm(iq ~ 0 + grades, data = df1)) # no intercept
```

+Be careful when you remove the intercept (or set it to 0). See [my article](https://hausetutorials.netlify.app/posts/2019-07-24-what-happens-when-you-remove-or-set-the-intercept-to-0-in-regression-models/) to learn more.
+
### Fitting ANOVA with `anova` and `aov`

By default, R uses Type I sum of squares.
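A hedged aside on the Type I sums of squares note (not part of this commit): `aov()` and `anova()` test terms sequentially, whereas the `car` package's `Anova()` provides Type II/III tests. A sketch, assuming the `car` package is installed and `df1` has `iq`, `class`, and `gender`:

```r
# Assumption: car is installed; df1, class, and gender come from the tutorial's data
m_aov <- aov(iq ~ class * gender, data = df1)
summary(m_aov)              # sequential (Type I) sums of squares
car::Anova(m_aov, type = 2) # Type II sums of squares
car::Anova(m_aov, type = 3) # Type III (set sum-to-zero contrasts first if you interpret these)
```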
@@ -511,7 +517,7 @@ Let's test this model with ANOVA.
```{r}
ggplot(df1, aes(class, iq)) + # y: iq
geom_quasirandom(alpha = 0.3) +
-stat_summary(fun.y = mean, geom = 'point', size = 3) + # apply mean function to y axis (fun.y = mean)
+stat_summary(fun = mean, geom = 'point', size = 3) + # apply mean function
stat_summary(fun.data = mean_cl_normal, geom = 'errorbar', width = 0, size = 1) # apply mean_cl_normal function to data
```

@@ -529,7 +535,7 @@ Class * gender interaction (and main effects)
```{r}
ggplot(df1, aes(class, iq, col = gender)) + # y: iq
geom_quasirandom(alpha = 0.3, dodge = 0.5) +
-stat_summary(fun.y = mean, geom = 'point', size = 3, position = position_dodge(0.5)) +
+stat_summary(fun = mean, geom = 'point', size = 3, position = position_dodge(0.5)) +
stat_summary(fun.data = mean_cl_normal, geom = 'errorbar',
width = 0, size = 1, position = position_dodge(0.5))
```
@@ -550,6 +556,7 @@ anova_classGender

### Plotting and testing simple effects when you have interactions

+* `interactions` package: see [here](https://interactions.jacob-long.com/) for more info
* `sjPlot` package: see [here](http://www.strengejacke.de/sjPlot/)
* [more tutorial and packages](https://jtools.jacob-long.com/)

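A hedged sketch (not part of this commit) of how these packages could be used to probe an interaction; `cat_plot()` and `plot_model()` are from the linked `interactions` and `sjPlot` packages, and the model below is an assumed example, not the tutorial's own:

```r
# Assumption: interactions and sjPlot are installed; df1, class, gender come from the tutorial
fit <- lm(iq ~ class * gender, data = df1)
interactions::cat_plot(fit, pred = class, modx = gender) # simple effects of class at each gender
sjPlot::plot_model(fit, type = "int")                    # marginal effects of the interaction
```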
@@ -560,7 +567,7 @@ Fit models for this figure
```{r}
ggplot(df1, aes(class, iq, col = gender)) + # y: iq
geom_quasirandom(alpha = 0.3, dodge = 0.5) +
-stat_summary(fun.y = mean, geom = 'point', size = 3, position = position_dodge(0.5)) +
+stat_summary(fun = mean, geom = 'point', size = 3, position = position_dodge(0.5)) +
stat_summary(fun.data = mean_cl_normal, geom = 'errorbar',
width = 0, size = 1, position = position_dodge(0.5))
```
@@ -623,6 +630,15 @@ summaryh(m_interceptSlope)
coef(m_interceptSlope) # check coefficients for each class
```

+#### Random intercept and slope model (no correlations between varying slopes and intercepts)
+
+```{r}
+m_interceptSlope_noCor <- lmer(grades ~ iq + (1 + iq || class), data = df1)
+summary(m_interceptSlope_noCor)
+summaryh(m_interceptSlope_noCor)
+coef(m_interceptSlope_noCor) # check coefficients for each class
+```
+
#### Random slope model (fixed intercept)

```{r}
@@ -653,7 +669,6 @@ The dataset is in wide form. To visualize easily with `ggplot`, we need to conve
gather(irisDT, meaureLength, length, -Species) %>% # convert from wide to long form
ggplot(aes(Species, length, col = meaureLength)) + # no need to specify data because of piping
geom_quasirandom(alpha = 0.3, dodge = 0.5)
-
```

MANOVA to test whether species predicts sepal length and petal length?
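One editorial note, not part of this commit: `gather()` is superseded in newer tidyr releases by `pivot_longer()`. A roughly equivalent sketch, keeping the tutorial's `irisDT` object and column names:

```r
# Assumption: irisDT is the wide-form iris subset created earlier in the tutorial
pivot_longer(irisDT, cols = -Species, names_to = "meaureLength", values_to = "length") %>%
  ggplot(aes(Species, length, col = meaureLength)) +
  geom_quasirandom(alpha = 0.3, dodge = 0.5)
```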
@@ -701,7 +716,7 @@ Do different diets lead to different weights? Each chick is only assigned to one
```{r}
ggplot(cw, aes(Diet, weight)) +
geom_quasirandom(alpha = 0.3) + # this line plots raw data and can be omitted, depending on your plotting preferences
-stat_summary(fun.y = mean, geom = 'point', size = 5) + # compute mean and plot
+stat_summary(fun = mean, geom = 'point', size = 5) + # compute mean and plot
stat_summary(fun.data = mean_cl_normal, geom = 'errorbar', width = 0, size = 1) # compute between-sub confidence intervals
```

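A hedged aside to go with the plot above (not part of this commit): one way the diet question could be tested, assuming `cw` is the ChickWeight data prepared earlier in the tutorial:

```r
# Assumption: cw is the ChickWeight data used in this section
cw_chick <- aggregate(weight ~ Chick + Diet, data = cw, FUN = mean) # one mean weight per chick
summary(aov(weight ~ Diet, data = cw_chick)) # one-way between-chick ANOVA
summaryh(lm(weight ~ Diet, data = cw_chick)) # same comparison as a regression (requires hausekeep)
```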
@@ -737,7 +752,7 @@ Plot with between-subjects error bars (WRONG but illustrative purposes)
```{r}
ggplot(cw, aes(Time, weight)) +
geom_quasirandom(alpha = 0.1) + # this line plots raw data and can be omitted, depending on your plotting preferences
-stat_summary(fun.y = mean, geom = 'point') + # compute mean and plot
+stat_summary(fun = mean, geom = 'point') + # compute mean and plot
stat_summary(fun.data = mean_cl_normal, geom = 'errorbar', width = 0) # compute between-sub confidence intervals
```

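An editorial sketch (not part of this commit) of one way to get within-subject error bars instead; `Rmisc::summarySEwithin()` is an assumption here (that package is not loaded in the tutorial), with `Chick` as the repeated-measures unit:

```r
# Assumption: Rmisc is installed; cw has columns weight, Time, and Chick
cw_within <- Rmisc::summarySEwithin(cw, measurevar = "weight",
                                    withinvars = "Time", idvar = "Chick")
ggplot(cw_within, aes(Time, weight)) +
  geom_point() +
  geom_errorbar(aes(ymin = weight - ci, ymax = weight + ci), width = 0) # within-subject CIs
```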
0004_dataform_join.Rmd (+1 −1)

@@ -40,7 +40,7 @@ Get source code for this RMarkdown script [here](https://github.com/hauselin/rtu
Use `library()` to load packages at the top of each R script.

```{r loading packages, results="hide", message=FALSE, warning=FALSE}
-library(tidyverse); library(data.table); library(dtplyr)
+library(tidyverse); library(data.table)
```

## Wide vs. long data (messy vs. tidy data)

0005_analyze_multi_subject_trial_data.Rmd (+3 −3)

@@ -68,7 +68,7 @@ This is how my directories/folders/files look like. You should make sure yours l
## Load libraries

```{r libraries}
-library(tidyverse); library(data.table); library(dtplyr); library(lme4); library(lmerTest); library(ggbeeswarm); library(hausekeep)
+library(tidyverse); library(data.table); library(lme4); library(lmerTest); library(ggbeeswarm); library(hausekeep)
```

<aside>
@@ -79,10 +79,10 @@ library(tidyverse); library(data.table); library(dtplyr); library(lme4); library

```{r read data}
(files <- list.files(path = "./data/rt_acc_data_raw", pattern = "subject_", full.names = TRUE)) # find matching file names
-dt1 <- lapply(files, fread) %>% bind_rows() %>% as.data.table() # read data, combine items in list, convert to data.table
+dt1 <- bind_rows(lapply(files, fread)) %>% as.data.table() # read data, combine items in list, convert to data.table
```

-Note and explanation: `lapply` loops through each item in `files` and applies the `fread` function to each item. `bind_rows` combines all the dataframes stored as separate items in the list, and `tbl_dt()` converts the merged dataframe into `data.table` and `tibble` class.
+Note and explanation: `lapply` loops through each item in `files` and applies the `fread` function to each item. `bind_rows` combines all the dataframes stored as separate items in the list, and `as.data.table()` converts the merged dataframe into `data.table` class.

## Summarize experimental design

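A small editorial aside, not part of this commit: `data.table` also has `rbindlist()`, which combines a list of data.tables/data.frames and returns a data.table directly, so the same read step could be sketched as:

```r
# Assumption: files is the character vector of file paths created above
dt1 <- rbindlist(lapply(files, fread)) # read and row-bind all files; returns a data.table
```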