# Common Application Caveats {#common-app-caveats}
## Reactivity anti-patterns
### Reactivity is awesome... until it is not
Let's face it, reactivity is awesome... until it is not.
Reactivity is a common source of confusion for beginners, and a common source of bugs and bottlenecks, even for seasoned `{shiny}` developers.
Most of the time, issues come from the fact that **there is too much reactivity**, *i.e.* we build apps where too many things happen: some things are updated far more often than they should be, computations are performed when they should not be, and in the end we have a hard time understanding what is really happening inside our application.
Of course, it is a nice feature to make everything react instantly to changes, but when building larger apps it is easy to create monsters, i.e. complicated, messy, reactive graphs where everything is updated too much and too often.
Or worse, we generate endless reactive loops, aka "the reactive inferno" where A invalidates B which invalidates C which invalidates A which invalidates B which invalidates C, and so on.
Let's take a small example of a reactive inferno:
```{r 15-common-app-caveats-1, eval = FALSE}
library(shiny)
library(lubridate)

ui <- function(){
  tagList(
    # Adding a first input that allows
    # the user to select a specific date
    dateInput(
      "date",
      "choose a date"
    ),
    # Adding a second input that allows
    # the user to specify a year
    selectInput(
      "year",
      "Choose a year",
      choices = 2010:2030
    )
  )
}

server <- function(
  input,
  output,
  session
){
  # We want the year to be updated whenever
  # the dateInput is updated
  observeEvent( input$date , {
    updateSelectInput(
      session,
      "year",
      selected = year(input$date)
    )
  })
  # We want the date to be updated whenever
  # the selectInput is updated
  observeEvent( input$year , {
    updateDateInput(
      session,
      "date",
      value = lubridate::as_date(
        sprintf("%s-01-01", input$year)
      )
    )
  })
}

shinyApp(ui, server)
```
Here, we want to handle something pretty common:
- The user can pick a `date` and the `year` input is updated.
- And the other way round: when the `year` input changes, the `date` is updated too.
But if you try to run this in your console, it will end as a reactive inferno: date updates year that updates date that updates year, and so on.
And the more you work on your app, the more complex it gets, and the more you will be likely to end up in a reactive inferno.
In this section, we will deal with reactivity, how to have more control over it, and how to share data across modules without relying on passing along reactive objects.
This application is in this state of infinite loop because it starts in a mutually inconsistent state: the `dateInput()` year value is the current year, while the `selectInput()` value is `2010`.
One way to solve this is to add some extra logic to the app by selecting the current year for `selectInput()`, and adding an `if` statement in the `observeEvent(input$year, {})`, as shown below.[^common_app_caveats-276]
[^common_app_caveats-276]: We want to thank Hadley for his help simplifying this solution <https://github.com/ThinkR-open/engineering-shiny-book/issues/276>.
```{r 15-common-app-caveats-2, eval = FALSE}
library(shiny)

ui <- fluidPage(
  dateInput(
    "date",
    "choose a date"
  ),
  selectInput(
    "year",
    "Choose a year",
    choices = 2010:2030,
    # Setting an initial state for the year,
    # consistent with the dateInput() default
    selected = format(
      Sys.Date(),
      "%Y"
    )
  )
)

server <- function(input, output, session) {
  observeEvent(input$date, {
    year <- format(input$date, "%Y")
    message("Changing year to ", year)
    updateSelectInput(inputId = "year", selected = year)
  })

  observeEvent(input$year, {
    # Preventing this update from being sent at application launch
    if (input$year != format(input$date, "%Y")) {
      date <- as.Date(ISOdate(input$year, 1, 1))
      message("Changing date to ", date)
      updateDateInput(inputId = "date", value = date)
    }
  })
}

shinyApp(ui, server)
```
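The consistency check used in the second observer can also be pulled out into a plain function, so that the "are the two inputs already in sync?" question can be tested outside of any reactive context. The following is only a minimal sketch: `year_of()` and `needs_date_update()` are hypothetical helper names, not part of `{shiny}`.

```r
# Hypothetical helpers (not part of {shiny}): they isolate the
# consistency check used to break the date <-> year loop.

# Extract the year of a Date as a character string, so that it can
# be compared to the value returned by a selectInput()
year_of <- function(date) {
  format(as.Date(date), "%Y")
}

# TRUE only when the selected year differs from the year of the date,
# i.e. when updating the dateInput() is actually needed
needs_date_update <- function(date, year) {
  !identical(year_of(date), as.character(year))
}

# Both inputs agree: no update needed
needs_date_update(as.Date("2024-05-17"), "2024")
# The year was changed by the user: the date has to follow
needs_date_update(as.Date("2024-05-17"), "2012")
```

Inside the app, the condition would then read `if (needs_date_update(input$date, input$year))`.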
### `observe` vs `observeEvent`
One of the most common features of reactive inferno is the use of `observe()` in cases where you should use `observeEvent()`.
Spoiler: you should try to use `observeEvent()` as much as possible, and avoid `observe()` as much as possible.
At first, `observe()` seems easier to implement, and feels like a shortcut as you do not have to think about what to react to: everything gets updated without you thinking about it.
But the truth is, this stairway does not lead to heaven.
Let's stop and think about `observe()` for a minute.
This function updates **every time a reactive object it contains is invalidated**.
Yes, this works well if you have a small number of reactive objects in the observer, but that gets tricky when you start adding a long list of things inside your `observe()`, as you might be launching a computation 10 times if your reactive scope contains 10 reactive objects that are somehow invalidated in chain.
And believe us, we have seen pieces of code where the `observe()` contains hundreds of lines of code, with reactive objects all over the place, with one `observe()` context being invalidated dozens of times when one input changes in the application.
For example, let's start with that:
```{r 15-common-app-caveats-3, eval = FALSE}
## DO NOT DO GLOBAL VARIABLES, IT'S JUST TO SIMPLIFY THE EXAMPLE
# We initiate a counter that will help to track how many times
# some pieces of the code are called
i <- 0
library(shiny)
library(cli)
ui <- function(){
tagList(
# We are adding a simple text input
# that will be printed to the console
textInput("txt", "Text")
)
}
server <- function(input, output, session){
observe({
# Every time this reactive context is invalidated,
# we add 1 to the i value
i <<- i + 1
# We print the i value to the console,
# and the value of input$txt
cat_rule(as.character(i))
print(input$txt)
})
}
shinyApp(ui, server)
```
Oh, and then, let's add a small `selectInput()`:
```{r 15-common-app-caveats-4, eval = FALSE}
i <- 0
library(shiny)
library(cli)
ui <- function(){
tagList(
# We are adding a simple text input
# that will be printed to the console
textInput("txt", "Text"),
# We add a selectInput() to allow text transformation
selectInput(
"casefolding",
"Casefolding",
c("lower", "upper")
)
)
}
server <- function(input, output, session){
observe({
# Every time this reactive context
# is invalidated, we add 1 to the i value
i <<- i + 1
# We print the i value to the console
cat_rule(as.character(i))
# If the user selects lower, then the text is
# passed through tolower, otherwise it's passed
# through toupper
if (input$casefolding == "lower") {
print(tolower(input$txt))
} else {
print(toupper(input$txt))
}
})
}
shinyApp(ui, server)
```
And, as time goes by, we add another control flow to our `observe()`:
```{r 15-common-app-caveats-5, eval = FALSE}
i <- 0
library(shiny)
library(cli)
library(stringi)
ui <- function(){
tagList(
# We are adding a simple text input
# that will be printed to the console
textInput("txt", "Text"),
# We add a selectInput() to allow text transformation
selectInput(
"casefolding",
"Casefolding",
c("lower", "upper")
),
# A new checkbox to reverse (or not) the input text
checkboxInput("rev", "reverse")
)
}
server <- function(input, output, session){
observe({
# Every time this reactive context
# is invalidated, we add 1 to the i value
i <<- i + 1
# We print the i value to the console
cat_rule(as.character(i))
# Use input_txt as a container for our input
input_txt <- input$txt
if (input$rev){
# If input$rev is selected, we reverse the text
input_txt <- stri_reverse(input_txt)
}
# If the user selects lower, then the text is
# passed through tolower, otherwise it's passed
# through toupper
if (input$casefolding == "lower") {
print(tolower(input_txt))
} else {
print(toupper(input_txt))
}
})
}
shinyApp(ui, server)
```
And it would be nice to keep the selected values in a reactive list, so that we can reuse it elsewhere.
And maybe you would like to add a checkbox so that the logs are printed to the console only if checked.
```{r 15-common-app-caveats-6, eval = FALSE}
i <- 0
library(shiny)
library(cli)
library(stringi)
ui <- function(){
tagList(
# We are adding a simple text input
# that will be printed to the console
textInput("txt", "Text"),
# We add a selectInput() to allow text transformation
selectInput(
"casefolding",
"Casefolding",
c("lower", "upper")
),
# A new checkbox to reverse (or not) the input text
checkboxInput("rev", "reverse")
)
}
server <- function(input, output, session){
# We are using a reactiveValues to keep this input value
r <- reactiveValues()
observe({
# Every time this reactive context
# is invalidated, we add 1 to the i value
i <<- i + 1
# We print the i value to the console
cat_rule(as.character(i))
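if (input$rev){
# If input$rev is selected, we store the reversed input text.
# We reverse input$txt (and not r$input_txt): reading and writing
# r$input_txt in the same observe() would re-invalidate it endlessly
r$input_txt <- stri_reverse(input$txt)
} else {
# Otherwise, we leave it as it is
r$input_txt <- input$txt
}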
# If the user selects lower, then the text is
# passed through tolower, otherwise it's passed
# through toupper
if (input$casefolding == "lower") {
print(tolower(r$input_txt))
} else {
print(toupper(r$input_txt))
}
})
}
shinyApp(ui, server)
```
Ok, now can you tell how many potential invalidation points we have here?
Three: whenever `input$txt`, `input$rev` or `input$casefolding` change.
Of course, three is not that much, but you get the idea.
Let's pause a minute and think about why we use `observe()` here.
To update the values inside `r$input_txt`, yes.
But do we need to use `observe()` for, say, updating `r$input_txt` under dozens of conditions, each time the user types a letter?
Possibly not.
We generally want our observer to update its content under a small, controlled number of inputs, i.e. with a controlled number of invalidation points.
And, what we often forget is that users do not type/select correctly on the first try.
No, they usually try and miss, restart, change things, amplifying the reactivity "over-happening".
Moreover, long `observe()` statements are hard to debug, and they make collaboration harder when the trigger to the observe logic can potentially live anywhere between line one and line 257 of your `observe()`.
That's why (well, in 99% of cases), it is safer to go with `observeEvent()`, as it allows you to see at a glance the condition under which the content is invalidated and re-evaluated.
Then, if a reactive context is invalidated, **you know why**.
For example, here is where the reactive invalidation can happen (lines with a `*`)[^common-app-caveats-1]:
[^common-app-caveats-1]: Of course it's an over-simplification: the reactive context will not be invalidated in all of these contexts. The idea is to illustrate how `observe()` can lead to invalidation points that are spread all across the code block.
``` {.r}
observe({
  i <<- i + 1
  cat_rule(as.character(i))
* if (input$rev){
*   r$input_txt <- stri_reverse(input$txt)
  } else {
*   r$input_txt <- input$txt
  }
* if (input$casefolding == "lower") {
*   print(tolower(r$input_txt))
  } else {
*   print(toupper(r$input_txt))
  }
})
```
Whereas in this refactored code using `observeEvent()`, it is easier to identify where the invalidation can happen:
``` {.r}
observeEvent( c(
*  input$rev,
*  input$txt
),{
  i <<- i + 1
  cat_rule(as.character(i))
  if (input$rev){
    r$input_txt <- stri_reverse(input$txt)
  } else {
    r$input_txt <- input$txt
  }
  if (input$casefolding == "lower") {
    print(tolower(r$input_txt))
  } else {
    print(toupper(r$input_txt))
  }
})
```
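To get a feel for the difference, the two approaches can be compared without launching an app, using `shiny::testServer()`. The following is a minimal sketch, not taken from the applications above: the server is stripped down to two counters (`n_observe` and `n_observe_event`, names chosen for this illustration), and the inputs are simulated with `session$setInputs()`.

```r
library(shiny)

# A stripped-down server: both observers read the same three inputs,
# but only the observe() takes a dependency on all of them.
server <- function(input, output, session) {
  n_observe <- 0
  n_observe_event <- 0

  observe({
    # Reading the inputs is enough to create three invalidation points
    input$txt
    input$casefolding
    input$rev
    n_observe <<- n_observe + 1
  })

  observeEvent(input$txt, {
    # The handler is isolated: only input$txt can trigger it,
    # even though the other inputs are read here
    input$casefolding
    input$rev
    n_observe_event <<- n_observe_event + 1
  })
}

# testServer() runs the server function against a mock session
testServer(server, {
  session$setInputs(txt = "a", casefolding = "lower", rev = FALSE)
  session$setInputs(casefolding = "upper")
  session$setInputs(rev = TRUE)
  session$setInputs(txt = "ab")
  message("observe() ran ", n_observe, " times")
  message("observeEvent() ran ", n_observe_event, " times")
})
```

With this sequence of simulated inputs, the `observe()` counter should end up higher than the `observeEvent()` one, as only the changes to `input$txt` trigger the latter.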
### Building triggers and watchers
To prevent this, one way to go is to create "flag" objects, which can be thought of as internal buttons to control what you want to invalidate: you create the button, set some places where you want these buttons to invalidate the context, and finally press these buttons.
These objects are launched with an `init` function, then these flags are triggered with `trigger()`, and wherever we want these flags to invalidate a reactive context, we `watch()` these flags.
The idea here is to get full control over the reactive flow: we only invalidate contexts when we want, making the general flow of the app more predictable.
These flags are available using the `{gargoyle}` [@R-gargoyle] package, which can be installed from CRAN, or from GitHub for the development version:
```{r 15-common-app-caveats-7, eval = FALSE}
# CRAN version
install.packages("gargoyle")
# Dev version
remotes::install_github("ColinFay/gargoyle")
```
- `gargoyle::init("this")` initiates a `"this"` flag: most of the time you will be generating them at the `app_server()` level.
- `gargoyle::watch("this")` sets the flag inside a reactive context, so that this context will be invalidated every time you call `trigger("this")`.
- `gargoyle::trigger("this")` triggers the flag.
And, bonus, as these functions use the `session` object, they are available across all modules.
That also means that you can easily trigger an event inside a module from another one.
This pattern is, for example, implemented in `{hexmake}` [@R-hexmake] (though not with `{gargoyle}`), where the rendering of the image on the right is fully controlled by the [`"render"` flag](https://github.com/ColinFay/hexmake/blob/master/R/mod_right.R#L40).
The idea here is to allow complete control over when the image is recomputed: only when `trigger("render")` is called does the app regenerate the image, helping us lower the reactivity of the application.
That might seem like a lot of extra work, but that is definitely worth considering in the long run, as it will help in optimizing the rendering (fewer computations), and lowering the number of errors that can result from too much reactivity inside an application.
Here is a small example of this implementation, using an environment to store the value.
When using this pattern, we do not rely on any reactive value invalidating the reactive context: the second result is only displayed when the `"render_result2"` flag is triggered, giving us full control over how the reactivity is propagated.
```{r 15-common-app-caveats-8, eval = FALSE}
library(shiny)
library(gargoyle)

ui <- function(){
  fluidPage(
    tagList(
      # Creating an action button to launch the computation
      actionButton("compute", "Compute"),
      # Output for all runif()
      verbatimTextOutput("result"),
      # This output will change only if runif() > 0.5
      verbatimTextOutput("result2"),
      # This button will reset x$results to 0, we use it
      # to show that it won't launch a series of reactivity
      # invalidation
      actionButton("reset", "Reset x")
    )
  )
}

server <- function(
  input,
  output,
  session
){
  # Mimic an R6 object, i.e. a non-reactive object,
  # with a dedicated environment
  x <- new.env()

  # Creating two watchers
  init("render_result", "render_result2")

  observeEvent( input$compute , {
    # When the user presses compute, we launch runif()
    x$results <- runif(1)
    # Every time a new value is stored, we render result
    trigger("render_result")
    # Only render the second result if x$results is over 0.5
    if (x$results > 0.5){
      trigger("render_result2")
    }
  })

  output$result <- renderPrint({
    # Will be rendered every time
    watch("render_result")
    # require x$results before rendering the output
    req(x$results)
    x$results
  })

  output$result2 <- renderPrint({
    # This will only be rendered if trigger("render_result2")
    # is called
    watch("render_result2")
    req(x$results)
    x$results
  })

  observeEvent( input$reset , {
    # This resets x$results. This code block is here
    # to show that reactivity is not triggered in this app
    # unless a trigger() is called
    x$results <- 0
    print(x$results)
  })
}

shinyApp(ui, server)
```
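To make the "internal button" idea more concrete, here is a hand-rolled sketch of what a flag could look like when built on top of a `reactiveVal()`. This is an assumption made for illustration purposes, not the actual `{gargoyle}` implementation, and `new_flag()` is a hypothetical name.

```r
library(shiny)

# A hand-rolled flag (assumption: this is a simplified illustration,
# not the actual {gargoyle} implementation)
new_flag <- function() {
  # The counter only exists to be invalidated
  counter <- reactiveVal(0)
  list(
    # "Pressing the button": bump the counter, which invalidates
    # every reactive context that called watch()
    trigger = function() {
      counter(isolate(counter()) + 1)
    },
    # Reading the counter inside a reactive context makes this
    # context depend on the flag
    watch = function() {
      counter()
    }
  )
}

render_flag <- new_flag()
render_flag$trigger()
render_flag$trigger()
# Outside a reactive context, the counter has to be read with isolate()
isolate(render_flag$watch())
```

The value of the counter does not matter in itself: what matters is that nothing but a call to `trigger()` can change it, so a context that only calls `watch()` is invalidated exactly when we decide it should be.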
### Using R6 as data storage
One pattern we have also been playing with is storing the app business logic inside one or more R6 objects.
Why would we want to do that?
#### A. Sharing data across modules {.unnumbered}
Sharing an R6 object makes it simpler to create data that are shared across modules, but without the complexity generated by reactive objects, and the instability of using global variables.
Basically, the idea is to hold the whole logic of your **data reading/cleaning/processing/outputting inside an R6 class**.
An object of this class is then initiated at the top level of your application, and you can pass this object to the sub-modules.
Of course, this makes even more sense if you are combining it with the trigger/watch pattern from before!
```{r 15-common-app-caveats-9, eval = FALSE}
library(shiny)
library(gargoyle)

data_cleaning_ui <- function(id){
  ns <- NS(id)
  tagList(
    # Defining the UI for your first module
    # [...]
  )
}

mod_data_cleaning_server <- function(id, r6){
  moduleServer( id, function(input, output, session){
    ns <- session$ns
    observeEvent( input$launch_cleaning , {
      # Once the launch_cleaning input is triggered, we
      # use the internal method from our r6 object
      r6$clean(arg1 = input$a, arg2 = input$b)
      # Triggering the plot
      trigger("plot")
    })
  })
}

plotting_ui <- function(id){
  ns <- NS(id)
  tagList(
    # Defining the UI for your second module
    # [...]
  )
}

mod_plotting_server <- function(id, r6){
  moduleServer( id, function(input, output, session){
    ns <- session$ns
    # Rendering, inside this second module, the plot based on the
    # cleaning done in the other module
    output$plot <- renderPlot({
      # We use the trigger/watch pattern from before
      watch("plot")
      # Calling the plot() method from our R6 object
      r6$plot()
    })
  })
}

ui <- function(){
  tagList(
    # Putting our two module UIs here, with the same ids
    # as the ones used in the server function
    data_cleaning_ui("data_cleaning_ui_1"),
    plotting_ui("plotting_ui_1")
  )
}

server <- function(
  input,
  output,
  session
){
  # Creating the "plot" flag used by the two modules
  init("plot")
  # We start by creating a new instance of the
  # MyDataProcessing R6 class
  r6 <- MyDataProcessing$new()
  # Passing this object to the two server functions
  mod_data_cleaning_server("data_cleaning_ui_1", r6)
  mod_plotting_server("plotting_ui_1", r6)
}

shinyApp(ui, server)
```
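The `MyDataProcessing` generator is not defined in the skeleton above. As an illustration, here is a minimal sketch of what such a class could look like. Everything inside it is an assumption: the `raw` and `cleaned` fields, the use of `mtcars` as default data, and the `column` and `min_value` arguments, which stand in for the `arg1` and `arg2` of the skeleton.

```r
library(R6)

# Hypothetical generator: the fields and the arguments of clean()
# are illustrative assumptions
MyDataProcessing <- R6Class(
  "MyDataProcessing",
  public = list(
    raw = NULL,
    cleaned = NULL,
    initialize = function(data = mtcars){
      self$raw <- data
    },
    # Keep the rows where `column` is at least `min_value`
    clean = function(column, min_value){
      keep <- self$raw[[column]] >= min_value
      self$cleaned <- self$raw[keep, , drop = FALSE]
      invisible(self)
    },
    # Plot the cleaned data, and fail early if clean() was not called
    plot = function(){
      if (is.null(self$cleaned)) {
        stop("Call clean() before plot()")
      }
      graphics::plot(self$cleaned)
    }
  )
)

r6 <- MyDataProcessing$new()
r6$clean(column = "mpg", min_value = 20)
nrow(r6$cleaned)
```

As the object is an environment, it is passed by reference: the `cleaned` field modified in one module is the one read by the other, without any reactive object involved.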
#### B. Be sure it is tested {.unnumbered}
During the process of building a robust `{shiny}` app, we strongly suggest that you test as many things as you can.
This is where using an R6 object for the business logic of your app makes sense: it allows you to build the whole testing of your application data logic outside of any reactive context, as you simply write unit tests just as you would for any other function.
For example, let's say we have the following R6 generator:
```{r 15-common-app-caveats-10}
MyData <- R6::R6Class(
  "MyData",
  # Defining our public methods, that will be
  # the dataset container, and a summary function
  public = list(
    data = NULL,
    initialize = function(data){
      self$data <- data
    },
    summarize = function(){
      summary(self$data)
    }
  )
)
```
We can then build a test for this class using `{testthat}`:
```{r 15-common-app-caveats-11}
library(testthat, warn.conflicts = FALSE)

test_that("R6 Class works", {
  # We define a new instance of this class, that will contain
  # the mtcars data.frame
  my_data <- MyData$new(mtcars)
  # We will expect my_data to have two classes:
  # "MyData" and "R6"
  # (expect_s3_class() replaces the deprecated expect_is())
  expect_s3_class(my_data, "MyData")
  expect_s3_class(my_data, "R6")
  # And the summarize method to return a table
  expect_s3_class(my_data$summarize(), "table")
  # We would expect the data contained in the object
  # to match the one taken as input to new()
  expect_equal(my_data$data, mtcars)
  # And the summarize method to be equal to the summary()
  # on the input object
  expect_equal(my_data$summarize(), summary(mtcars))
})
```
Using R6 allows us to rely on these battle-tested tools when it comes to testing functions, something which is made more complex when using other patterns like `reactiveValues()`.
### Logging reactivity with `{whereami}`
Getting a good sense of how reactivity is actually working in your app is not an easy task: the reactivity logic is a graph, and it happens very quickly when you run the app, so it's very hard to follow everything.
The goal of `whereami::whereami()` [@R-whereami] is simple: informing you about where it is called from, i.e. from which file and at which line, and how many times.
For example, if you add the following piece of code to your `app_server()`, the location of the function call will be printed to the logs.
```{r 15-common-app-caveats-12, eval = FALSE}
whereami::cat_where( whereami::whereami() )
```
── Running server(...) at app_server.R#9 (2) ───────────────
Combining `cat_where()` and `whereami()` inside your reactive contexts will implement reactive logging in your console while developing: that way, you can instantaneously know which reactive contexts are invalidated while using the application.
Of course, you still have to implement it by hand, but that is definitely worth the effort: seeing in real time, in your console, which line is run allows you to detect unexpected behavior.
For example, you will be able to see that the `observeEvent()` from `mod_main.R#79` has been called 17 times when launching the app, which might be an unexpected behavior.
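The kind of count described above can be approximated in base R if you want to experiment with the idea before adding a dependency. The sketch below is a hypothetical stand-in, not the `{whereami}` implementation: `log_call()` and `call_counts()` are names made up for this example, and the labels are passed by hand instead of being detected automatically.

```r
# Hypothetical, dependency-free stand-in for a call counter:
# log_call() and call_counts() are illustrative names,
# not {whereami} functions
.counters <- new.env()

log_call <- function(label) {
  # Increment the counter stored under `label` (starting at 0)
  n <- get0(label, envir = .counters, inherits = FALSE, ifnotfound = 0) + 1
  assign(label, n, envir = .counters)
  message(sprintf("-- Running %s (%d)", label, n))
  invisible(n)
}

call_counts <- function() {
  # Return all the counters as a named vector
  unlist(as.list(.counters))
}

# Simulating an observer that is invalidated three times
for (k in 1:3) log_call("mod_main.R#79")
log_call("app_server.R#9")
call_counts()
```

In an app, `log_call()` would be placed at the top of each reactive context you want to follow, the same way `cat_where()` is.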
The screenshot in Figure \@ref(fig:15-common-app-caveats-13) shows what a `{whereami}` log might look like, here, for the `{hexmake}` application.
(ref:whereami) `{whereami}` output for `{hexmake}`.
```{r 15-common-app-caveats-13, echo=FALSE, fig.cap="(ref:whereami)", out.width="100%"}
knitr::include_graphics("img/whereami.png")
```
And bonus, once the app is closed, you can get a list of all the "counters" with `whereami::counter_get()`, and how many times they each have been called, and `plot(whereami::counter_get())` will draw a raw plot of the various counters, as shown in Figure \@ref(fig:15-common-app-caveats-14).
(ref:whereamiplot) plot of `{whereami}` counters.
```{r 15-common-app-caveats-14, echo=FALSE, fig.cap="(ref:whereamiplot)", out.width="100%"}
knitr::include_graphics("img/plot_whereami.png")
```
\newpage
## R does too much
### Rendering the UI from the server side
There are many reasons we would want to change things on the UI based on what happens in the server: changing the choices of a `selectInput()` based on the columns of a table which is uploaded by the user, showing and hiding pieces of the app according to an environment variable, allowing the user to create an indeterminate number of inputs, etc.
Chances are that to do that, you have been using the `uiOutput()` and `renderUI()` functions from `{shiny}` [@R-shiny].
Even if convenient, and the functions of choice in some specific contexts, this pair of functions makes R do a little bit too much: you are making R regenerate the whole UI component instead of changing only what you need, which can be suboptimal, be it from the user's point of view or from a developer's perspective.
**One instance in which this pattern might not be optimal is when your visitors do not have high-speed internet access, or when they are visiting from a smartphone: contexts where every byte counts**.
Rendering large elements from the server side in your `{shiny}` app means that these elements will have to transit through the socket, i.e. they need to be sent by the server, and downloaded by the browser.
In this case, the smaller the message size the better!
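To get a rough idea of what transits, you can measure the size of the HTML generated for a UI element. This is only a sketch giving an order of magnitude: the real message sent through the socket also contains some JSON wrapping, and the input used here (500 made-up choices named `variable_*`) is an assumption for the sake of the example.

```r
library(shiny)

# A selectInput() with many choices, as renderUI() would send it
full_input <- selectInput(
  "column",
  "Column",
  choices = paste0("variable_", 1:500)
)
# Size, in bytes, of the HTML that would have to transit
html_size <- nchar(as.character(full_input), type = "bytes")

# By comparison, a short piece of text sent to a textOutput()
text_size <- nchar("variable_1", type = "bytes")

c(html = html_size, text = text_size)
```

Every time the `renderUI()` is invalidated, the whole HTML is sent again, even if only a small part of it has changed.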
From the developer perspective, you will create code that is harder to reason about, as we are used to having the UI parts in the UI functions (but that is not related to performance).
Here are three strategies to code without `uiOutput()` and `renderUI()`.
#### A. Implement UI events in JavaScript {.unnumbered}
> Mixing languages is better than writing everything in one, if and only if using only that one is likely to overcomplicate the program.\
>
> _The Art of UNIX Programming_ [@ericraymond2003]
We will see in the last chapter of this book how you can integrate JS inside your `{shiny}` app, and how even basic functions can be useful for making your app server smaller.
For example, compare:
```{r 15-common-app-caveats-15, eval = FALSE}
library(shiny)
ui <- function(){
tagList(
# Adding a button with an onclick event,
# that will show or hide the plot
actionButton(
"change",
"show/hide graph",
# The toggle() function hides or shows the queried element
onclick = "$('#plot').toggle()"
),
plotOutput("plot")
)
}
server <- function(
input,
output,
session
){
output$plot <- renderPlot({
# This renderPlot will only be called once
cli::cat_rule("Rendering plot")
plot(iris)
})
}
shinyApp(ui, server)
```
to
```{r 15-common-app-caveats-16, eval = FALSE}
library(shiny)
ui <- function(){
tagList(
# We use a pattern without JavaScript
actionButton("change", "show/hide graph"),
plotOutput("plot")
)
}
server <- function(
input,
output,
session
){
output$plot <- renderPlot({
# Here, every time the button is clicked, this reactive
# context will be invalidated, and the code re-evaluated
cli::cat_rule("Rendering plot")
# Simulate a show and hide pattern
req(input$change %% 2 == 0)
plot(iris)
})
}
shinyApp(ui, server)
```
The result is the same, but the first version is shorter and easier to understand: we have one button, and the behavior of the button is self-contained.
The second solution re-evaluates the `renderPlot()` every time the button is clicked, as each click invalidates `input$change`, making R compute way more than it should, whereas with the JavaScript-only solution, the plot is not recomputed every time you need to show it: the plot is drawn by R only once.
At a local level, the improvements described in this section will not make your application way faster: for example, rendering UI elements (let's say rendering a simple title) will not be computationally heavy.
But at a global level, less UI computation from the server side helps the general rendering of the app: let's say you have an output that takes 3 seconds to run, then if the whole UI + output is to be rendered on the server side, the whole UI stays blank until everything is computed.
Compare:
```{r 15-common-app-caveats-17, eval = FALSE}
library(shiny)
ui <- function(){
tagList(
# We make the whole UI be generated by R
uiOutput("caption")
)
}
server <- function(
input,
output,
session
){
output$caption <- renderUI({
# Simulate something that takes 3 seconds to run
Sys.sleep(3)
# Returning the UI
tagList(
h3("test"),
shinipsum::random_text(10)
)
})
}
shinyApp(ui, server)
```
to
```{r 15-common-app-caveats-18, eval = FALSE}
library(shiny)
ui <- function(){
tagList(
# Only the text input will be rendered by R
h3("test"),
textOutput("caption")
)
}
server <- function(
input,
output,
session
){
output$caption <- renderText({
# Here, we only render the text, not the whole UI
Sys.sleep(3)
shinipsum::random_text(10)
})
}
shinyApp(ui, server)
```
In the first example, the UI will wait for the server to have rendered, while in the second we will first see the title, then the rendered text after a few seconds.
That approach makes the user experience better: they know that something is happening, while a completely blank page is confusing.
Also, because R is single threaded, manipulating DOM elements from the server side causes R to be busy doing these DOM manipulations while it could be computing something else.
And let's imagine it takes a quarter of a second to render the DOM element.
That is a full second for rendering four of them, while R should be busy doing something else!
#### B. `update*` inputs {.unnumbered}
Almost every `{shiny}` input, even the custom ones from packages, comes with an `update*()` function that allows us to change the input values from the server side, instead of re-creating the UI entirely.
For example, here is a way to update the content of a `selectInput` from the server side:
```{r 15-common-app-caveats-19, eval = FALSE}
library(shiny)
ui <- function(){
tagList(
# We start the selectInput empty
selectInput("species", "Species", choices = NULL),
# The selectInput will be populated
# when the update button is pressed
actionButton("update", "Update")
)
}
server <- function(
input,
output,
session
){
observeEvent( input$update , {
# Update the selectInput with the species from iris
spc <- unique(iris$Species)
updateSelectInput(
session,
"species",
choices = spc,
selected = spc[1]
)
})
}
shinyApp(ui, server)
```
This switch to `updateSelectInput` makes the code easier to reason about as the `selectInput` is where it should be: inside the UI, instead of another pattern where we would use `renderUI()` and `uiOutput()`.
Plus, with the `update` method, we are only changing what is needed, not re-generating the whole input.
#### C. `insertUI` and `removeUI` {.unnumbered}
Another way to dynamically change what is in the UI is with `insertUI()` and `removeUI()`.
This approach is broader than the one we saw before, where a `reactiveValue` is set to `NULL` or to a value, as it allows us to target a larger UI element: we can insert or remove the whole input, instead of having the DOM element inserted but empty.
This method allows us to have a smaller DOM: `<div>` elements that are not needed are not generated empty, they are simply not there.
Two things to note concerning this method, though:
- Removing an element from the app will not delete the input from the input list. In other words, if you have `selectInput("x", "x")`, and you remove this input using `removeUI()`, you will still have `input$x` in the server.
In the following example, the `input$value` value will not be removed once you have called `removeUI(selector = "#value")`.
```{r 15-common-app-caveats-20, eval = FALSE}
library(shiny)
ui <- function(){
  tagList(
    # Creating a text input that will be removed
    # from the UI whenever the remove button is pressed
    textInput("value", "Value", "place"),
    actionButton("remove", "Remove UI")
  )
}

server <- function(
  input,
  output,
  session
){
  observeEvent( input$remove , {
    # When the button is pressed,
    # the textInput will be removed from the UI
    removeUI(selector = "#value")
  })

  observe({
    # We observe input$value every second.
    # You'll realize that even after the UI
    # is removed, input$value is still available.
    invalidateLater(1000)
    print(input$value)
  })
}

shinyApp(ui, server)
```
- Both these functions take a `jQuery` selector to select the element in the UI. We will introduce these selectors in Chapter \@ref(using-javascript).
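To complement the removal example above, here is a minimal sketch of the insertion side. The ids `placeholder` and `wrapper`, and the `inserted` flag, are hypothetical names chosen for this illustration. Wrapping the input in its own `div()` is one possible way to insert and remove the label and the input together with a single selector.

```r
library(shiny)

ui <- function() {
  tagList(
    actionButton("add", "Add input"),
    actionButton("remove", "Remove input"),
    # Empty container used as the insertion target (hypothetical id)
    div(id = "placeholder")
  )
}

server <- function(input, output, session) {
  # Tracks whether the input is currently in the DOM,
  # so that we never insert two elements with the same id
  inserted <- reactiveVal(FALSE)

  observeEvent(input$add, {
    req(!inserted())
    insertUI(
      # jQuery selector of the target element
      selector = "#placeholder",
      # Insert as the last child of the target
      where = "beforeEnd",
      ui = div(
        id = "wrapper",
        textInput("value", "Value", "place")
      )
    )
    inserted(TRUE)
  })

  observeEvent(input$remove, {
    req(inserted())
    # Removes the wrapper, and with it the label and the input
    removeUI(selector = "#wrapper")
    inserted(FALSE)
  })
}

# Only launch the app in an interactive session
if (interactive()) shinyApp(ui, server)
```

As noted above, `input$value` stays available on the server side after the removal, so any logic that depends on it should not assume the input is still displayed.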
### Too much data in memory
If you are building a `{shiny}` application, there is a great chance you are building it to analyze data.
If you are dealing with large datasets, **you should consider delegating the data handling and computation to an external database system: for example, to an SQL database**.
Why?
Because these systems have been created to handle and manipulate data on disk: in other words, they allow you to perform operations on your data without having to clutter R memory with a large dataset.
For example, if you have a `selectInput()` that is used to perform a filter on a dataset, you can do that filter straight inside SQL, instead of bringing all the data to R and then doing the filter.
That is even more necessary if you are building the app for a large number of users: for example if one `{shiny}` session takes up to 300MB, multiply that by the number of users that will need one session, and you will have a rough estimate of how much RAM you will need.
Conversely, if you move the data manipulation to the back-end, you will have, let's say, one database with 300MB of data: the database size will remain (more or less) constant, and the only RAM used by `{shiny}` will be for the data manipulation, not the data storage.
That's even more true now that almost any operation you can do today in `{dplyr}` [@R-dplyr] can be done with an SQL back-end, and that is the purpose of the `{dbplyr}` [@R-dbplyr] package: translating `{dplyr}` code into SQL.
If using a database as a back-end seems a little bit far-fetched right now, keep in mind that this is how it is done in most programming languages: if you are building a web app with NodeJS or Python for example, and need to interact with data, the data will usually not be held in the application's RAM: you will be relying on an external database to store your data.
Then your application will be used to make queries to this database back-end.
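The following is a minimal sketch of the "filter in SQL, not in R" idea described above. It assumes that `{DBI}`, `{RSQLite}`, `{dplyr}` and `{dbplyr}` are installed, and it uses an in-memory SQLite database with the `mtcars` dataset purely as a stand-in for a real database server.

```r
library(DBI)
library(dplyr)
# {dbplyr} must be installed: {dplyr} uses it behind the scenes
# to translate the pipeline into SQL

# Assumption: an in-memory SQLite database stands in for
# a real, external database server
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars", mtcars)

# tbl() creates a lazy reference to the table:
# no rows are pulled into R at this point
cars_db <- tbl(con, "mtcars")

# The filter is translated into SQL and run by the database
four_cyl <- cars_db %>%
  filter(cyl == 4) %>%
  select(mpg, cyl, hp)

# Inspect the SQL generated by {dbplyr}
show_query(four_cyl)

# Only collect() brings the (already filtered) rows into R memory
res <- collect(four_cyl)

dbDisconnect(con)
```

With this pattern, the R session only ever holds the rows that match the filter, not the full table.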
## Reading data
`{shiny}` applications are a tool of choice when it comes to analyzing data.
But that also means that these data have to be imported/read at some point in time, and reading data can be time consuming.
How can we optimize that?
In this section, we will take a look at three strategies: including datasets inside your application, using R packages for fast data reading, and moving to an external database system (and when and why you should do so).
### Including data in your application
If you are building your application using the `{golem}` [@R-golem] framework, you are building your application as a package.
R packages provide a way to include internal datasets, which can then be used as objects inside your app.
This is the solution you should go for if your data are never or rarely updated: the datasets are created during package development, then included inside the build of your package.
The plus side of this approach is that it makes the data fast to read, as they are serialized as R native objects.
To include data inside your application, you can use the `usethis::use_data_raw( name = "my_dataset", open = FALSE )` command, which is inside the `02_dev.R` script inside the `dev/` folder of your source application (if you are building the app with `{golem}`).
This will create a folder called `data-raw` at the root of your application folder, with a script to prepare your dataset.
Here, you can read the data, modify it if necessary, and then save it with `usethis::use_data(my_dataset)`.
Once this is done, you will have access to the `my_dataset` object inside your application.
This is, for example, what is done in the `{tidytuesday201942}` [@R-tidytuesday201942] application, in [data-raw/big\_epa\_cars.R](https://github.com/ColinFay/tidytuesday201942/blob/master/data-raw/big_epa_cars.R): the CSV data are read there, and then used as an internal dataset inside the application.
### Reading external datasets
Other applications use data that are not available at build time: they are created to analyze data that are uploaded by users, or maybe they are fetched from an external service while using the app (for example, by calling an API).
When you are building an application for the "user data" use case, the first thing you will need is to provide users a way to upload their dataset: `shiny::fileInput()`.
One crucial thing to keep in mind when it comes to using user-uploaded files is that you have to be (very) strict with the way you handle files:
- Always specify what type of file you want: `shiny::fileInput()` has an `accept` parameter that allows you to set one or more [MIME types](https://en.wikipedia.org/wiki/Media_type) or extensions. When using this argument (for example, with `text/csv`, `.csv`, or `.xlsx`), the user will only be able to select a subset of files from their computer: the ones that match the type.
- Always perform checks once the file is uploaded, even more so if it is tabular data: column type, naming, empty rows, etc. The more you check the file for potential errors, the less likely your application is to fail when analyzing this uploaded dataset.
- If the data reading takes a while, do not forget to add a visual progression cue: a `shiny::withProgress()` or tools from the [`{waiter}`](https://github.com/JohnCoene/waiter) package.
Whenever you offer a user the possibility to upload anything, you can be sure that at some point, they will upload a file that will make the app crash.
By setting a specific MIME type and by doing a series of checks once the file is uploaded, you will make your application more stable.
Finally, having a visual cue that "something is happening" is very important for the user experience, because "something is happening" is better than not knowing what is happening. It may also prevent the user from clicking again and again on the upload button or, worse, from giving up on the app.
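The three recommendations above can be sketched as follows. This is only an illustration: `check_upload()` is a hypothetical helper, and the required columns (`id` and `value`) are assumptions made for the example, not a convention from `{shiny}`.

```r
library(shiny)

# Hypothetical helper: returns a character vector describing
# the problems found in the uploaded data (empty if none)
check_upload <- function(df, required = c("id", "value")) {
  problems <- character(0)
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    problems <- c(
      problems,
      paste("Missing column(s):", paste(missing_cols, collapse = ", "))
    )
  }
  if (nrow(df) == 0) {
    problems <- c(problems, "The file contains no rows.")
  }
  if ("value" %in% names(df) && !is.numeric(df$value)) {
    problems <- c(problems, "Column `value` must be numeric.")
  }
  problems
}

ui <- function() {
  tagList(
    # Restrict the file picker to CSV files
    fileInput("file", "Upload a CSV", accept = c("text/csv", ".csv")),
    tableOutput("preview")
  )
}

server <- function(input, output, session) {
  dataset <- reactive({
    req(input$file)
    # Re-check the extension on the server side
    ext <- tools::file_ext(input$file$name)
    validate(need(tolower(ext) == "csv", "Please upload a .csv file."))
    # Visual cue while the file is being read
    withProgress(message = "Reading file...", {
      df <- read.csv(input$file$datapath)
    })
    # Content checks: show the problems instead of crashing
    problems <- check_upload(df)
    validate(need(length(problems) == 0, paste(problems, collapse = "\n")))
    df
  })
  output$preview <- renderTable(head(dataset()))
}

# Only launch the app in an interactive session
if (interactive()) shinyApp(ui, server)
```

Keeping the checks in a plain function, outside of the reactive context, makes them easy to unit test without launching the app.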
Now that we have our `fileInput()` set, how do we read these data as fast as possible?
There are several options depending on the type of data you are reading.
Here are some packages that can make the file reading faster:
- For a tabular, flat dataset (typically csv, tsv, or text), `{vroom}` [@R-vroom] can read data at 1.40 GB/sec. The `fread()` function from `{data.table}` [@R-data.table] is also fast at reading delimited files.
- For JSON files, `{jsonlite}` [@jsonlite2014] or, more recently, `{RcppSimdJson}` [@R-RcppSimdJson], which is a binding to the `simdjson` C++ library.
- If you need to read Excel files inside your app, `{readxl}` [@R-readxl] offers a binding to the [`RapidXML`](http://rapidxml.sourceforge.net/) C++ library, which reads Excel files fast.
- Most files exported from statistical software (SAS, SPSS, etc.) can be read using either the `{foreign}` [@R-foreign] or `{haven}` [@R-haven] packages.
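As a minimal sketch of the first item in the list above, here is how the same delimited file can be read with `{vroom}` and with `data.table::fread()`. It assumes both packages are installed, and it writes `mtcars` to a temporary file only to have something to read: on such a small file the speed difference will not be visible.

```r
# Create a small CSV file to read (stand-in for an uploaded file)
path <- tempfile(fileext = ".csv")
write.csv(mtcars, path, row.names = FALSE)

# {vroom} indexes the file and reads the values lazily
with_vroom <- vroom::vroom(path, delim = ",", show_col_types = FALSE)

# fread() detects the delimiter and column types automatically
with_fread <- data.table::fread(path)

dim(with_vroom)
dim(with_fread)
```

Note that `vroom()` returns a tibble and `fread()` returns a `data.table`, so the rest of the application may need an `as.data.frame()` call depending on what it expects.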
### Using external databases
Another type of data analyzed in a `{shiny}` application is data that is contained inside an external database.
Databases are heavily used in the data science world and in software engineering as a whole.
Databases come with APIs and drivers that help retrieve and transfer data, be it from an SQL, a NoSQL, or even a graph database.
Using a database is one of the solutions for making your app smaller and more efficient in the long run, especially if you need to scale your app to thousands of visitors.
Indeed, **if you plan on having your app scale to numerous people, that will mean that a lot of R processes will be triggered. And if your data is contained in your app, this will mean that each R process will take a significant amount of RAM if the dataset is large**.
For example, if your dataset alone takes \~300 MB of RAM, that means that if you want to launch the app 10 times, you will need \~3GB of RAM.
On the other hand, if you decide to switch these data to an external database, it will lower the global RAM need: the DB will take these 300MB of data, and each shiny application will make a request to the database.
For instance, if the database needs 300MB, and one shiny app 50MB, then 10 apps will be 300MB (for the DB) + 50MB \* 10 (for the 10 apps).
In practice, other things are to be considered: making database requests can be computationally expensive, and might need some network adjustments, but you get the idea.
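The back-of-the-envelope computation above can be written down as follows. The figures are the illustrative ones used in the text (300MB of data, 50MB per app, 10 sessions), not measurements.

```r
dataset_mb <- 300  # size of the dataset in memory
app_mb     <- 50   # memory used by one app without the dataset
n_sessions <- 10   # number of R processes running the app

# Data bundled in the app: every process holds its own copy
# (dataset only, as in the text)
ram_data_in_app <- n_sessions * dataset_mb

# Data in an external database: one copy for the DB,
# plus the footprint of each app
ram_with_db <- dataset_mb + n_sessions * app_mb

ram_data_in_app
ram_with_db
```

With these numbers, the first scenario needs 3000MB for the data alone, while the second needs 800MB in total.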
How does one choose between database back-ends?
Well, first of all you need to see what is available in the environment in which the application will be deployed: maybe the company you are building the application for already has database servers deployed.
If you are free to choose any database as a back-end, your choice should be driven by what kind of operations you want to make on these databases.
**For example, SQL databases are designed to store tabular data, and they tend to be very fast when it comes to reading data: so if you have one or more large data.frames you want to use inside your application, and with no specific update of these data, an SQL back-end can be the perfect choice**.
On the other hand, a NoSQL database like MongoDB will be faster when it comes to doing write operations, and can store any kind of object: for example, `{hexmake}` can use a MongoDB back-end to store RDS files.
But that comes with a price: read calls are a little bit slower, and you might have to work a little bit more on handling the JSON results that come out of MongoDB.
Another example of an app that relies on an external database is `{databasedemo}`, available at [engineering-shiny.org/databasedemo/](https://engineering-shiny.org/databasedemo/).
Feel free to follow this link for more information about this application!
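To make the "each app makes a request to the database" pattern concrete, here is a minimal sketch. It is not taken from `{databasedemo}`. It assumes `{DBI}` and `{RSQLite}` are installed, `get_cars()` is a hypothetical helper, and the in-memory SQLite database is only a stand-in: in a real deployment, `dbConnect()` would point to a shared database server.

```r
library(shiny)
library(DBI)

# Hypothetical helper: the filtering is done in SQL,
# using a parameterized query
get_cars <- function(con, cyl) {
  dbGetQuery(
    con,
    "SELECT mpg, cyl, hp FROM mtcars WHERE cyl = ?",
    params = list(cyl)
  )
}

ui <- function() {
  tagList(
    selectInput("cyl", "Cylinders", choices = c(4, 6, 8)),
    tableOutput("cars")
  )
}

server <- function(input, output, session) {
  # Assumption: stand-in for a connection to an external database
  con <- dbConnect(RSQLite::SQLite(), ":memory:")
  dbWriteTable(con, "mtcars", mtcars)
  # Close the connection when the session ends
  session$onSessionEnded(function() dbDisconnect(con))

  output$cars <- renderTable({
    # Only the rows matching the input are brought into R
    get_cars(con, as.numeric(input$cyl))
  })
}

# Only launch the app in an interactive session
if (interactive()) shinyApp(ui, server)
```

Using `params` instead of pasting the input into the query string also protects the query against SQL injection.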
Covering all the available types of databases and the packages associated with each is a very, very large topic: there are dozens of database systems, and as many (if not more) packages to interact with them.
For more extensive coverage of using databases in R, please follow these resources:
- [Databases using R](https://db.rstudio.com/), the official RStudio documentation around databases and R.
- [colinfay/r-db](https://colinfay.me/r-db/), a Docker image that bundles the toolchain for a lot of database systems for R.
- [CRAN Task View: Databases with R](https://cran.r-project.org/web/views/Databases.html): the official task view from CRAN with a series of packages for database manipulation.
### Data-source checklist
How to choose between these three methodologies:
```{r 15-common-app-caveats-21, echo= FALSE}
knitr::kable(
  data.frame(
    Choice = c("Package data", "Reading files", "External database"),
    Update = c("Never to very rare", "Uploaded by users", "Never to streaming"),
    Size = c("Low to medium", "Preferably low", "Low to big")
  )
)
```