-
Notifications
You must be signed in to change notification settings - Fork 9
/
Copy pathsimplifying-and-reusing.Rmd
89 lines (65 loc) · 2.76 KB
/
simplifying-and-reusing.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
---
title: 'Simplifying and re-using'
---
```{r, include=FALSE}
library(tidyverse)
```
## Simplifying and re-using {- #code-reuse}
Complex programming is beyond the scope of this guide, but if you sometimes find
yourself repeating the same piece of code over and over then it might be worth
writing a simple function.
The examples shown in this book are mostly very simple, and are typically not
repetitive. However in running your own analyses you may find that you start to
repeat chunks of code in a number of places (even across many files).
Not only is this sort of copying-and-pasting tedious, it can be hard to read and
maintain. For example, if you are repeating the same analysis for multiple
variables and discover an error in your calculations you would have to fix this
is in multiple places in your code. You can also introduce errors when copying
and pasting code in this way, and these can be hard to spot.
The sorts of tasks which can end up being repetitive include:
- Running models
- Extracting results from models you run
- Producing tables
- Specifying output settings for graphs
In these cases it might be worth writing your own function to facilitate this
repetition, or use some other forms of code re-use (e.g. see the
[example for `ggplot` graphics below](#ggplot-reuse)).
### Writing helper functions {- #helper-functions}
For example, in the section on logistic regression
[we wrote a helper function called `logistic()`](#helper-function-logistic)
which was simply a shortcut for `glm()` but with the correct 'family' argument
pre-specified.
For more information on writing your own R functions I recommend
[Hadley Wickham's 'R for Data Scientists', chapter 19](http://r4ds.had.co.nz/functions.html)
and, if necessary,
['Advanced R': http://adv-r.had.co.nz](http://adv-r.had.co.nz).
Or you prefer a book, try @grolemund2014hands.
### Re-using code with `ggplot` {- #ggplot-reuse}
Another common case where we might want to re-use code is when we produce a
number of similar plots, and where we might want to re-use many of the same
settings.
Sometimes repeating plots is best achieved by creating a single long-form
dataframe and [facetting the plot](#facetting-plots). However this may not
always be possible or desireable. Lets say we have two plots like these:
```{r}
plot.mpg <- mtcars %>%
ggplot(aes(factor(cyl), mpg)) +
geom_boxplot()
plot.wt <- mtcars %>%
ggplot(aes(factor(cyl), wt)) +
geom_boxplot()
```
And we want to add consistent labels and other settings to them. We can simply
type:
```{r}
myplotsettings <- xlab("Number of cylinders")
```
And then we simply add (`+`) these settings to each plot:
```{r}
plot.mpg + myplotsettings
```
And:
```{r}
plot.wt + myplotsettings
```
This reduces typing and makes easier to produce a consistent set of plots.