---
title: "R Markdown Training"
author: "Sally Thompson"
date: "28 July 2022"
output:
  html_document: default
---
```{r}
#| label: setup
#| include: false

# include: false keeps this chunk's code and results out of the knitted
# document, so when you knit the document this chunk will not appear.
# opts_chunk$set() sets global options, which apply to all of the code chunks in the document
knitr::opts_chunk$set(message = FALSE,
                      warning = FALSE,
                      echo = FALSE)
# change echo to TRUE to show the R code in the document
n <- 0
```
Introduction
This guide has been written for you to follow along: you will create your own
version of this file as you go. It works best if you have two screens, with
this document open on one screen and RStudio (showing the code and output) on the other (main) screen.
Before we go any further, copy the code below into your console and run it. This
will identify and install any missing packages that are required for this script
to run correctly.
``` {r}
#| eval: false
#| echo: true
## If a package is installed, it will be loaded. If any
## are not, the missing package(s) will be installed from CRAN and then loaded
## First specify the packages of interest
packages <- c("dplyr", "tidyr", "lubridate", "zoo",
              "ggplot2", "DT", "reactable", "plotly")
## Now install if required
package.check <- lapply(
  packages,
  FUN = function(x) {
    if (!require(x, character.only = TRUE)) {
      install.packages(x, dependencies = TRUE)
      # load the package that has just been installed
      library(x, character.only = TRUE)
    }
  }
)
```
Start by opening the Rmd file "developing script.Rmd" in RStudio. This is an
unformatted version of the markdown file that created this guide. You run a markdown
file by 'knitting' it (look for the icon: a ball of wool with a knitting needle in
it). Click the 'Knit' button and look at the document it creates. How does it compare
to this file? (The first time you knit an Rmd script you may receive warning messages
about packages that are required but aren't yet installed. We have tried to pre-empt
this by running the code above, but if it still happens, install those packages and knit again.)
As you work through this guide you will add formatting and content to your script
until your knitted document looks like this one!
This guide does not go into the magic that happens between hitting the 'knit' button
and the output file being produced. There are plenty of resources on the internet
if you want to explore this further. For more details about all things markdown visit rmarkdown.rstudio.com.
This guide will start by looking at formatting your document, then will add in some
data and charts, and improve how they look. The final section adds some interactivity
to some of the charts.
What is R Markdown?
R Markdown is a way to create and format documents in many output types, including
(but not limited to) html, pdf, Word, dashboards and slides. Two major benefits of
creating these using R Markdown are reproducibility and automation. Code can be
embedded, so the creation of properly formatted text plus charts and visualisations
all happens with one click. This training guide will focus exclusively on producing
html output, but hopefully by the end of it you will have the confidence and skills
to investigate other output formats.
Formatting
You will no doubt be familiar with headers.
They can get smaller
and smaller
and even smaller still
R Markdown uses the # symbol to create headers. A level one header uses one #,
a level two header uses ##, and so on. Leave a space between the # symbols and the header text. Note how, in the script, the background colour of the row changes.
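
As a small illustration (the header text here is made up, not taken from the exercise), three levels of header would be typed like this:

```markdown
# A level one header
## A level two header
### A level three header
```
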
**Exercise `r n<-n+1; n`:** Turn 'Introduction', 'What is R Markdown?' and 'Formatting'
into level 1 headers. Apply subsequent header levels to the text that follows 'Formatting'.
## Italics, bold and super/sub scripts
The asterisk is used to make text bold or italic (or both). For italics, apply
one * either side of the word/phrase, for bold use ** either side of the word/phrase.
If you want both italics **and** bold, apply ***.
In the same way, wrap text in ^ for superscript, in ~ for subscript and in ~~ for strike-through (note the 2 ~~ symbols).
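
A short sketch of each of these, using made-up text rather than the exercise lines. The subscript line relies on Pandoc's markdown, which wraps subscript text in a single ~ on either side:

```markdown
*italic text*
**bold text**
***bold and italic text***
E = mc^2^
H~2~O
~~struck-through text~~
```
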
**Exercise `r n<-n+1; n`:**
Make this text italic
make this text bold
Really emphasise this by making it bold and italics
Apply a superscript to part of this text
Strike through this text
## Lists
Just as in Word, you can create bulleted and numbered lists. For bullet points,
use *, - or + at the start of each point to be bulleted. Indent and use + for sub-bullets.
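
For illustration only (the items are invented), a bulleted list with sub-bullets could be typed as below. Indenting the sub-bullets by four spaces is an assumption that works with Pandoc's markdown; leave a blank line before the list starts.

```markdown
* fruit
    + apples
    + pears
* vegetables
    + carrots
```
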
**Exercise `r n<-n+1; n`a:** turn the first list into a bulleted list with sub-bullet,
and the second list into a numbered list.
Asterisk, dash or + sign for bullet points
and other lists
indent for sub-lists
and more points if that's what you want
**Exercise `r n`b:** Create numbered lists by starting each line with (@). The
numbers will dynamically adjust.
markdown does the numbering for you
so there's no chance of mis-counting after an edit
it will automatically update
isn't that clever!
You **can** type the numbers yourself to make a numbered list, but you need to add a full stop
after each number.
1. you can choose to number lists yourself
2. it must start at number 1,
5. but after that it will number consecutively even if you wanted this item to
be numbered e.g. 5
## Paragraphs & Spacing
To create a line break, end the line with two spaces and then press return. To start
a new paragraph, leave a blank line between the two blocks of text.
Even if you press enter in the markdown script, the text won't start on a new line
in the document unless you end the previous line with a double space.
This line looks like it should be a new paragraph in the markdown code, but it
isn't once it renders to html.
Make text stand out by using block quotes
Precede each line with >
This can be useful for headline statements
But remember to put two spaces after each sentence
**Exercise `r n<-n+1; n`:** turn the previous four lines into block quote
Use 3 asterisks on a line of their own to create a horizontal rule (a dividing line) below this sentence.
The asterisk is turning out to be quite a versatile symbol!
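
A minimal sketch of the ideas in this section, with invented text. The first quoted line ends with two spaces (invisible here) to force the line break, and the three asterisks sit on a line of their own with a blank line either side:

```markdown
> A headline statement  
> continued on a second line

***
```
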
## Links
A useful feature for any document that will be viewed electronically is linking
to websites or email addresses. To display the complete URL, wrap it in angled
brackets <>. A good starting point for learning more about R Markdown is <http://rmarkdown.rstudio.com>.
Instead, you might want a clickable link that displays some alternative text. This
uses a combination of square and round brackets - wrap the readable text in square
brackets, and follow it with the URL link in round brackets. [R Markdown: the definitive guide](https://bookdown.org/yihui/rmarkdown/) is another really useful resource.
This is the same method used for creating a clickable email link: `[text to display](mailto:email@email.address)`
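
The three kinds of link described above can be sketched as follows. The email address is a placeholder, not a real contact:

```markdown
<https://bookdown.org/yihui/rmarkdown/>
[the definitive guide](https://bookdown.org/yihui/rmarkdown/)
[email the author](mailto:someone@example.com)
```
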
**Exercise `r n<-n+1; n`:**
(@) Contact: (add some alt text and your email address as a hyperlink)
(@) Go back to the Introduction section and make the link to rmarkdown active.
(@) We are going to be using data from the NHS Scotland Open Data platform. Create
some alt text and link it to opendata.nhs.scot
(@) Knit and check your links work.
## Appearance
Your document should be looking more like this one, but it's still missing the
table of contents. We add that in the YAML header. While we are there, we will
change the theme. Choose from any of the [Bootswatch](https://bootswatch.com/)
free themes, or download other templates.
**Exercise `r n<-n+1; n`:** amend your YAML header as below. Note the indentation
after 'output', and further indentation after 'html_document' - in YAML, this is
essential so it knows to link it all back to the html document output. See if you
like any of the other free themes from [Bootswatch](https://bootswatch.com/)
(cerulean, cosmo, cyborg, darkly, flatly, journal, lumen, paper, readable, sandstone,
simplex, slate, spacelab, superhero, united or yeti).
```
output:
  html_document:
    theme: cosmo
    toc: true
    toc_float: yes
```
While you are in the YAML header, change the author and date too.
# Dealing with data
Now things start to get really interesting! We are going to import some data, do
some wrangling and create some outputs to display. We will be using R to do all
this, but R Markdown supports other languages too.
There are two ways of inserting code into a document: blocks of code are wrapped
up in a code chunk, but you can also insert objects inline in text. You create a
code chunk using the `Insert` option at the top of the pane, the keyboard shortcut
`Ctrl + Alt + I` or manually, by wrapping it in three backticks at the start and
end of the chunk. You also need to use curly brackets to declare which language the code is in.
````markdown
`r ''````{r}
#| label: add-chunk-name
# insert R code here (without the # symbol)
`r ''````
````
It's good practice to name your code chunks (each chunk needs a unique name).
Keep the name short but descriptive of what the chunk is doing. Don't use spaces, dots or
underscores in chunk names - use "-" if you need to use a separator.
Chunk options can e.g. hide the code, only show the results, or control whether
the code is evaluated (plus many more options). Declare each option on its own line
at the top of the chunk, starting with the hash and vertical line (`#|`). If you want the same rules to apply to
all code chunks you would include them in the global options.
Look at the very first code chunk in the script - `include: false` keeps that chunk's
code and results out of the output, and `knitr::opts_chunk$set()` sets the defaults for every other chunk. Be careful to include a space after the colon -
otherwise it won't knit, and the error message isn't helpful!
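
As a sketch of how chunk options look in practice, the chunk below is a made-up example (the label and the use of the built-in `cars` dataset are assumptions for illustration). It shows the results of the code but hides the code itself:

````markdown
`r ''````{r}
#| label: cars-summary
#| echo: false
#| message: false
summary(cars)
`r ''````
````
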
Use single backticks to run code inline with text. You still need to include `r`
but this time it doesn't need to be in curly brackets. For example, to print today's
date use `` `r
format(Sys.time(), '%d %B %Y')` `` inline with the rest of your text.
**Exercise `r n<-n+1; n`:**
(@) If you don't already have the package installed, run
`remotes::install_github("Public-Health-Scotland/phsopendata", upgrade = "never")`
in the console. This will install the PHS Open Data package, which we will use to
import some data.
(@) Create a code chunk, name it and paste the following code into the chunk.
When you knit, it doesn't look like anything has happened. But behind the scenes,
the data has been imported and filtered, and the object `monthly_ae` is waiting to be used.
In the next code chunk we will create some simple plots, and also introduce tabsets.
These are a useful way to include lots of content without the document becoming excessively long.
## Monthly Attendances at A&E
To create tabsets add `{.tabset}` to the header row. Then create two or more headers
at the next level down (one for each tab you want), with a code chunk in each section.
## {.unlisted .unnumbered}
You aren't limited to just displaying charts in tabsets - you might want to also
include the data behind a chart, in a table.
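
The overall shape of a tabset can be sketched as below. The tab titles are placeholders, and the final empty header mirrors the one used in this script, where it appears to close the tabset so that later content is not pulled into a tab:

```markdown
## Monthly Attendances at A&E {.tabset}

### First tab title

Text and a code chunk for the first tab go here.

### Second tab title

Text and a code chunk for the second tab go here.

## {.unlisted .unnumbered}
```
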
**Exercise `r n<-n+1; n`:**
1. add `{.tabset}` to the *Monthly Attendances at A&E* level 2 header.
2. create a level 3 header, *Attendances at EDs*
3. paste the following code into a code chunk. Add some suitable commentary below
the code chunk
4. repeat with another level 3 header, titled *Attendances at MIUs*
5. If you want to make it look a bit snazzy, add a bit of flair to the tabsets by
adding `.tabset-fade` and/or `.tabset-pills` to `{.tabset}` (keep it all inside the
curly brackets).
# Housekeeping
By now the script is getting quite long, and can be tricky to navigate. If you've
been good at defining sections then you can toggle to show the document outline
on the right-hand side of the script pane. Other points of good practice:
- load all the libraries you need in the first code chunk. (I haven't done that
in this document, as I wanted you to see which package is used in each example)
- similarly, load and wrangle the data early on in the script too, before you start
creating content. Then the code chunks within the main text can be kept sparse,
such as just calling a plot.
- run each chunk in RStudio as you create it to check it works - it's easier to
troubleshoot this way, than following an error message created when trying to knit.
- if there's a lot of processing to be done then create this in another R script
instead, and 'call' it into the markdown by using `source`.
- be very careful using filepaths - the 'start point' of filepaths can be different
when running a chunk locally than when knitting. The easiest way to get round that
is to use the package `here`, which always starts at the location of the project directory.
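
The last two points can be sketched as follows. This is only an illustration: the helper script is written to a temporary file so the example is self-contained, and the file names `prepare_data.R` and `attendances.csv` are hypothetical.

```r
# Write a small helper script to a temporary file (a stand-in for a real
# processing script that you would keep alongside the Rmd file)
script_path <- file.path(tempdir(), "prepare_data.R")
writeLines("prepared <- data.frame(x = 1:3, y = c(2, 4, 6))", script_path)

# source() runs the script, so the objects it creates can be used afterwards
source(script_path)
nrow(prepared)

# here::here() builds a file path starting from the project directory,
# so the same path works when running a chunk and when knitting
if (requireNamespace("here", quietly = TRUE)) {
  data_path <- here::here("data", "attendances.csv") # hypothetical file
}
```
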
# Adding some interactivity
It's all well and good being able to print a chart or table, but it's still a static
object. Wouldn't it be great if readers could interact with those objects? The
beauty of rendering the markdown script to html is that we can! What follows are
just a few examples of packages that can add some interactivity to your reports.
There aren't any exercises, but follow along with the script to see how to apply
them. In your script the following code chunks are currently set to not run when
you knit - delete the line `#| eval: false` from each code chunk (or change it to
`true`) so that the code does run from now on.
## DT (Datatable)
The default output for displaying tables using DT is shown below, but you can
customise it (add options for exporting the data, change how many rows are displayed,
plus lots more). However, you can see it doesn't like the `yearmon` date format,
and displays it as a number. (There is probably a workaround...)
```{r}
#| label: datatable
#| eval: false
library(DT)
library(dplyr) # rename(), mutate() and %>%
library(zoo)   # as.yearmon()
datatable(monthly_ae %>%
            rename("Number Of Attendances" = NumberOfAttendancesAggregate,
                   "Number Meeting Target" = NumberMeetingTargetAggregate) %>%
            mutate(month_end = as.yearmon(month_end)))
```
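
One possible workaround, offered as an untested assumption rather than the guide's own method, is to convert the dates to text before passing the data to `datatable()`. The sketch below uses made-up dates in place of `monthly_ae$month_end` and base R's `format()`; a year-month pattern also keeps the text in date order when the column is sorted.

```r
# Hypothetical month-end dates standing in for monthly_ae$month_end
month_end <- as.Date(c("2022-05-31", "2022-06-30", "2022-07-31"))

# format() returns character strings, which a table displays exactly as written
month_label <- format(month_end, "%Y-%m")
month_label
```
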
## Reactable
With reactable, you can group data without having to hard code it first, using
the built-in aggregate functions or define your own. Clicking on a grouped item
will unroll it.
```{r}
#| label: reactable
#| eval: false
library(reactable)
reactable(monthly_ae %>%
            rename(Location = TreatmentLocation,
                   "Number Of Attendances" = NumberOfAttendancesAggregate,
                   "Number Meeting Target" = NumberMeetingTargetAggregate),
          groupBy = c("DepartmentType", "Location"),
          columns = list(
            month_end = colDef(cell = function(value) strftime(value, "%Y-%b")),
            "Number Of Attendances" = colDef(aggregate = "sum"),
            "Number Meeting Target" = colDef(aggregate = "sum")
          )
)
```
## Plotly
We can apply all sorts of tricks to make our ggplots look fancy, but at the end
of the day they are still just static plots. Plotly can bring them to life. Hovering
over the lines will display data points, and you can zoom into a specific area,
download the plot, plus other options available through the buttons top-right of the plot.
```{r}
#| label: plotly
#| eval: false
library(plotly)
pl_colours <- c("darkred", "steelblue")
plot3 <- monthly_ae_gp %>%
  filter(type == "Number Of Attendances") %>%
  ggplot(., aes(x = month_end, y = count, group = DepartmentType, colour = DepartmentType)) +
  geom_line() +
  expand_limits(y = 0) +
  labs(title = "Number of Attendances at A&E",
       x = "month ending",
       y = "number of Attendances") +
  scale_colour_manual(values = pl_colours) +
  theme_minimal()
ggplotly(plot3)
```
Plotly may not be able to render more advanced ggplots using the `ggplotly` function,
but you can build plots from scratch in plotly. The language and terminology are
different from ggplot2, but ultimately it gives you a lot more functionality.
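
As a rough sketch of building a plot from scratch, the example below uses a small made-up data frame rather than the guide's A&E data. It only runs the plotly code if the package is installed.

```r
# Hypothetical data standing in for the guide's A&E attendance figures
attendances <- data.frame(
  month_end = as.Date(c("2022-05-31", "2022-06-30", "2022-07-31")),
  count = c(120, 135, 128)
)

if (requireNamespace("plotly", quietly = TRUE)) {
  # plot_ly() maps columns with the ~ formula syntax rather than aes()
  p <- plotly::plot_ly(attendances, x = ~month_end, y = ~count,
                       type = "scatter", mode = "lines")
  # layout() sets the title and axis labels, much like labs() in ggplot2
  p <- plotly::layout(p,
                      title = "Number of Attendances at A&E",
                      xaxis = list(title = "month ending"),
                      yaxis = list(title = "number of attendances"))
  # type p on its own line in a code chunk to display the plot
}
```
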
# What Next?
## RStudio Visual Editor
Newer versions of RStudio (v1.4 onwards) have a visual editor, which means you can
focus more on the content of your document and less on how to code it. It won't work
with this script so let's see what happens with a fresh R Markdown script.
Go to File > New File > R Markdown... In the pop-up box give the file a name, add
an author and keep the output as HTML. Have a look at the script that is generated.
You should recognise the different aspects of a markdown script: YAML, text, code chunks.
Now look at the top left corner, at the toolbar below the script tabs. This is
where you can switch between the Source and Visual editors.
In Visual Editor mode you can use the editor toolbar to format text, add images,
tables, links, code chunks, citations, footnotes etc. More details can be found in RStudio's
[github page](https://rstudio.github.io/visual-markdown-editing/). The [bookdown
package](https://bookdown.org) adds extra functionality such as cross-referencing.
## Taking it further
This is only a brief introduction to R Markdown; there is so much more to explore!
The suggestions below are outwith the scope of this guide, and are only a tiny
fraction of what is possible.
* Automate the same report for multiple groups (e.g. ICSs, LAs, hospitals, ...).
This is done by defining parameters. Chapter 15 in [R Markdown: the definitive guide](https://bookdown.org/yihui/rmarkdown/parameterized-reports.html) describes
this in more detail. This method can also be applied to other types of outputs,
such as pdf, Word or PowerPoint documents.
* [Flexdashboard](https://rmarkdown.rstudio.com/flexdashboard/) is an extension
package to markdown that allows you to build dashboards in html. These are useful
to display data visualisations, and include other components such as data tables,
interactive maps, gauges, value boxes etc. It can be made up of multiple pages,
or you can control how analysis and its conclusions are unveiled through the use
of a storyboard.
* Further customise the appearance of the html document by adding CSS styles and
html formatting. For instance (look at the source code to see how this effect is created):
<style>
div.phsblu50 { background-color:#80bcea; border-radius: 5px; padding: 20px;}
</style>
<div class = "phsblu50">
you can change the background colour of a block to further highlight key statements
- and change the <span style="color: #c73918;">**font colour**</span> for specific words/sentences
- making your conclusions really stand out
</div>
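
To give a flavour of the first suggestion in the list above (one report per group), here is a minimal sketch. It assumes a file called `report.Rmd` whose YAML header declares a `hospital` entry under `params:`; the file name, the parameter name and the hospital names are all hypothetical. The reports are only rendered if that file and the rmarkdown package are available.

```r
# Hypothetical groups to report on
hospitals <- c("Hospital A", "Hospital B")

# Build one set of render() arguments per hospital
render_jobs <- lapply(hospitals, function(h) {
  list(
    input = "report.Rmd", # hypothetical parameterised report
    output_file = paste0("report-", gsub(" ", "-", tolower(h)), ".html"),
    params = list(hospital = h) # read inside the report as params$hospital
  )
})

# Render each report, but only if the template actually exists
if (file.exists("report.Rmd") && requireNamespace("rmarkdown", quietly = TRUE)) {
  for (job in render_jobs) {
    do.call(rmarkdown::render, job)
  }
}
```
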