---
title: "Exercise Set 1 — Geoms and Aesthetics"
author: "Mark Dunning"
date: '`r format(Sys.time(), "Last modified: %d %b %Y")`'
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval = TRUE)
```
## Exercise 1
These first few exercises run through the basic principles of creating a ggplot2 object, assigning aesthetic mappings and adding geoms.
1. Read in the cleaned patients dataset that we saw earlier in the ggplot2 course ("patient-data-cleaned.txt").
```{r exerciseReadin, echo=TRUE}
# read.delim() reads tab-separated text with a header row
patients_clean <- read.delim("patient-data-cleaned.txt", sep = "\t")
```
### Scatterplots
2. Using the patients dataset, generate a scatter plot of BMI versus Weight.
```{r exercise1}
library(ggplot2)
plot <- ggplot(data = patients_clean,
               mapping = aes(x = BMI, y = Weight)) +
  geom_point()
plot
```
3. Extending the plot from exercise 2, add a colour scale to the scatterplot based on the Height variable.
```{r exercise2}
# Height is numeric, so colour is mapped to a continuous gradient
plot <- ggplot(data = patients_clean,
               mapping = aes(x = BMI, y = Weight, colour = Height)) +
  geom_point()
plot
```
4. Following on from exercise 3, split the BMI vs Weight plot into a grid of plots separated by Smoking status and Sex.
```{r exercise3}
plot <- ggplot(data = patients_clean,
               mapping = aes(x = BMI, y = Weight, colour = Height)) +
  geom_point()
# Rows are split by Sex, columns by Smokes
plot + facet_grid(Sex ~ Smokes)
```
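The formula given to facet_grid() puts the left-hand variable on the rows and the right-hand variable on the columns. A minimal sketch of that layout, assuming a made-up data frame (`toy` and its values are hypothetical, not the patients dataset):

```{r facetGridSketch}
library(ggplot2)

# Hypothetical toy data: two levels of Sex crossed with two levels of Smokes
toy <- data.frame(
  BMI    = c(22, 25, 27, 30, 24, 28, 26, 31),
  Weight = c(60, 70, 75, 90, 65, 80, 72, 95),
  Sex    = rep(c("Female", "Male"), each = 4),
  Smokes = rep(c("No", "Yes"), times = 4)
)

# Sex ~ Smokes: one row of panels per Sex, one column per Smokes value
toy_grid <- ggplot(toy, aes(x = BMI, y = Weight)) +
  geom_point() +
  facet_grid(Sex ~ Smokes)
toy_grid
```

Swapping the formula to `Smokes ~ Sex` would transpose the grid.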
5. Using an additional geom, add a fitted line as an extra layer to the solution from exercise 3.
```{r exercise4}
plot <- ggplot(data = patients_clean,
               mapping = aes(x = BMI, y = Weight, colour = Height)) +
  geom_point() +
  geom_smooth()
plot
```
6. Does the fit in exercise 5 look good? Look at the help page for geom_smooth (?geom_smooth) and adjust the method for a better fit.
```{r exercise5}
# method = "lm" fits a straight line; se = FALSE hides the confidence band
plot <- ggplot(data = patients_clean,
               mapping = aes(x = BMI, y = Weight, colour = Height)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
plot
```
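With fewer than 1,000 observations, geom_smooth() defaults to a loess curve, while method = "lm" draws the least-squares straight line. The sketch below, on simulated data (`toy_fit` is hypothetical and not the patients dataset), suggests how the drawn line relates to a model fitted directly with lm():

```{r smoothSketch}
library(ggplot2)

set.seed(42)
# Simulated data with a roughly linear trend; not the patients dataset
toy_fit <- data.frame(x = 1:50)
toy_fit$y <- 2 * toy_fit$x + rnorm(50, sd = 5)

# method = "lm" draws the least-squares line; se = FALSE hides the confidence band
toy_line <- ggplot(toy_fit, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
toy_line

# The same line, fitted directly: intercept and slope
toy_model <- lm(y ~ x, data = toy_fit)
coef(toy_model)
```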
### Boxplots and Violin plots
7. Generate a boxplot of BMIs comparing smokers and non-smokers.
```{r exercise6}
plot <- ggplot(data = patients_clean,
               mapping = aes(x = Smokes, y = BMI)) +
  geom_boxplot()
plot
```
8. Following on from the boxplot comparing smokers and non-smokers in exercise 7, colour the boxplot edges by Sex.
```{r exercise7}
# colour sets the box outline; fill would set the box interior
plot <- ggplot(data = patients_clean,
               mapping = aes(x = Smokes, y = BMI, colour = Sex)) +
  geom_boxplot()
plot
```
9. Reproduce the boxplots from exercise 8 (grouped by smoker, coloured by sex), but now include a separate facet for each age (using the Age column).
```{r exercise8}
# facet_wrap(~Age) draws one panel per distinct value of Age
plot <- ggplot(data = patients_clean,
               mapping = aes(x = Smokes, y = BMI, colour = Sex)) +
  geom_boxplot() +
  facet_wrap(~Age)
plot
```
10. Produce a similar boxplot of BMIs but this time group data by Sex, colour by Age and facet by Smoking status.
HINT: Categorical data are represented by discrete values, such as factors.
```{r exercise9}
# factor(Age) makes Age discrete, giving one coloured box per age
plot <- ggplot(data = patients_clean,
               mapping = aes(x = Sex, y = BMI, colour = factor(Age))) +
  geom_boxplot() +
  facet_wrap(~Smokes)
plot
```
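The hint matters because ggplot2 treats a numeric column as continuous, so mapping it to colour gives a gradient and does not split the boxes into groups, whereas a factor gives one group per level. A small base R sketch of the conversion, using made-up ages (the `ages` vector is hypothetical; the real Age column may hold different values):

```{r factorSketch}
# Hypothetical ages, standing in for the Age column
ages <- c(42, 43, 44, 42, 44, 43)

is.numeric(ages)        # numeric input is treated as continuous
age_groups <- factor(ages)
is.numeric(age_groups)  # a factor is treated as discrete
levels(age_groups)      # the distinct groups, stored as character labels
nlevels(age_groups)     # number of groups, and so the number of colours
```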
11. Regenerate the solution to exercise 10 but this time using a violin plot.
```{r exercise10}
plot <- ggplot(data = patients_clean,
               mapping = aes(x = Sex, y = BMI, colour = factor(Age))) +
  geom_violin() +
  facet_wrap(~Smokes)
plot
```
### Histogram and Density plots
12. Generate a histogram of BMIs with each bar coloured blue.
```{r exercise11}
# fill is set outside aes(), so every bar gets the same fixed colour
plot <- ggplot(data = patients_clean,
               mapping = aes(x = BMI)) +
  geom_histogram(fill = "blue")
plot
```
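geom_histogram() defaults to 30 bins and prints a message suggesting that a better value be chosen. A sketch of setting binwidth explicitly, on simulated values (`toy_bmi` and the bin width of 2 are illustrative assumptions, not taken from the patients dataset):

```{r binwidthSketch}
library(ggplot2)

set.seed(1)
# Simulated BMI-like values; not the patients dataset
toy_bmi <- data.frame(BMI = rnorm(100, mean = 26, sd = 4))

# fill sets the bar interior; colour sets the bar outline
toy_hist <- ggplot(toy_bmi, aes(x = BMI)) +
  geom_histogram(binwidth = 2, fill = "blue", colour = "white")
toy_hist
```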
13. Generate density plots of BMIs coloured by Sex.
HINT: alpha can be used to control transparency.
```{r exercise12}
# alpha = 0.5 makes the fills semi-transparent so overlapping curves stay visible
plot <- ggplot(data = patients_clean,
               mapping = aes(x = BMI)) +
  geom_density(aes(fill = Sex), alpha = 0.5)
plot
```
14. Generate a separate density plot of BMI coloured by Sex for each Grade.
```{r exercise13}
plot <- ggplot(data = patients_clean,
               mapping = aes(x = BMI)) +
  geom_density(aes(fill = Sex), alpha = 0.5)
# One panel per value of Grade
plot + facet_wrap(~Grade)
```