15-common-app-caveats.Rmd

# Common Application Caveats {#common-app-caveats}

## Reactivity anti-patterns

### Reactivity is awesome... until it is not

Let's face it, reactivity is awesome... until it is not.
Reactivity is a common source of confusion for beginners, and a common source of bugs and bottlenecks, even for seasoned `{shiny}` developers.
Most of the time, issues come from the fact that **there is too much reactivity**, *i.e.* we build apps where too many things happen, and some things are updated way more often than they should be, and computations are performed when they should not be, and in the end we have a hard time understanding what is really happening inside our application.

Of course, it is a nice feature to make everything react instantly to changes, but when building larger apps it is easy to create monsters, i.e. complicated, messy, reactive graphs where everything is updated too much and too often.
Or worse, we generate endless reactive loops, aka "the reactive inferno" where A invalidates B which invalidates C which invalidates A which invalidates B which invalidates C, and so on.

Let's take a small example of a reactive inferno:

```{r 15-common-app-caveats-1, eval = FALSE}
library(shiny)
library(lubridate)
ui <- function(){
  tagList(
    # Adding a first input which allow 
    # to select a specific date
    dateInput(
      "date", 
      "choose a date"
    ), 
    # Adding a second input allowing 
    # to specify a year
    selectInput(
      "year", 
      "Choose a year", 
      choices = 2010:2030
    )
  )
}

server <- function(
  input, 
  output, 
  session
){
  # We want the year to be update whenever 
  # the dateInput is updated
  observeEvent( input$date , {
    updateSelectInput(
      session, 
      "year", 
      selected = year(input$date)
    )
  })
  
  # We want the date to be update whenever 
  # the selectInput is updated
  observeEvent( input$year , {
    updateDateInput(
      session, 
      "date", 
      value = lubridate::as_date(
        sprintf("%s-01-01", input$year)
      )
    )
  })
  
}

shinyApp(ui, server)
```

Here, we want to handle something pretty common:

-   The user can pick a `date` and the `year` input is updated.
-   And the other way round: when the `year` input changes, the `date` is updated too.

But if you try to run this in your console, it will end as a reactive inferno: date updates year that updates date that updates year, and so on.

And the more you work on your app, the more complex it gets, and the more you will be likely to end up in a reactive inferno.
In this section, we will deal with reactivity, how to have more control over it, and how to share data across modules without relying on passing along reactive objects.

This application is in this state of infinite loop because it starts in a mutually inconsistent state: the `dateInput()` year value is the current year, while the `selectInput()` value is `2010`. 
One way to solve this is to add some extra logic to the app by selecting the current year for `selectInput()`, and adding an `if` statement in the `observeEvent(input$year, {})`, as shown below.[^common_app_caveats-276]

[^common_app_caveats-276]: We want to thank Hadley for his help simplifying this solution <https://github.com/ThinkR-open/engineering-shiny-book/issues/276>.

```{r 15-common-app-caveats-2, eval = FALSE}
library(shiny)
ui <- fluidPage(
  dateInput(
    "date", 
    "choose a date"
    ), 
  selectInput(
    "year", 
    "Choose a year", 
    choices = 2010:2030, 
    # Setting a state for the year
    selected = format(
      Sys.Date(), 
      "%Y"
    )
  )
)

server <- function(input, output, session) {
  observeEvent(input$date, {
    year <- format(input$date, "%Y")
    message("Changing year to ", year)
    updateSelectInput(inputId = "year", selected = year)
  })
  
  observeEvent(input$year, {
    # Preventing this update to be sent at application launch
    if (input$year != format(input$date, "%Y")) {
      date <- as.Date(ISOdate(input$year, 1, 1))
      message("Changing date to ", date)
      updateDateInput(inputId = "date", value = date)
    }
  })
}

shinyApp(ui, server)
```

### `observe` vs `observeEvent`

One of the most common features of reactive inferno is the use of `observe()` in cases where you should use `observeEvent`.
Spoiler: you should try to use `observeEvent()` as much as possible, and avoid `observe()`as much as possible.

At first, `observe()` seems easier to implement, and feels like a shortcut as you do not have to think about what to react to: everything gets updated without you thinking about it.
But the truth is, this stairway does not lead to heaven.

Let's stop and think about `observe()` for a minute.
This function updates **every time a reactive object it contains is invalidated**.
Yes, this works well if you have a small number of reactive objects in the observer, but that gets tricky when you start adding a long list of things inside your `observe()`, as you might be launching a computation 10 times if your reactive scope contains 10 reactive objects that are somehow invalidated in chain.
And believe us, we have seen pieces of code where the `observe()` contains hundreds of lines of code, with reactive objects all over the place, with one `observe()` context being invalidated dozens of times when one input changes in the application.

For example, let's start with that:

```{r 15-common-app-caveats-3, eval = FALSE}
## DO NOT DO GLOBAL VARIABLES, IT'S JUST TO SIMPLIFY THE EXAMPLE
# We initiate a counter that will help to track how many times 
# some pieces of the code are called
i <- 0
library(shiny)
library(cli)
ui <- function(){
  tagList(
    # We are adding a simple text input 
    # that will be printed to the console
    textInput("txt", "Text")
  )
}

server <- function(input, output, session){
  observe({
    # Every time this reactive context is invalidated, 
    # we add 1 to the i value
    i <<- i + 1
    # We print the i value to the console, 
    # and the value of input$txt
    cat_rule(as.character(i))
    print(input$txt)
  })
}

shinyApp(ui, server)
```

Oh, and then, let's add a small `selectInput()`:

```{r 15-common-app-caveats-4, eval = FALSE}
i <- 0
library(shiny)
library(cli)
ui <- function(){
  tagList(
    # We are adding a simple text input 
    # that will be printed to the console
    textInput("txt", "Text"),
    # We add a selectInput() to allow text transformation
    selectInput(
      "casefolding", 
      "Casefolding", 
      c("lower", "upper")
    )
  )
}

server <- function(input, output, session){
  observe({
    # Every time this reactive context 
    # is invalidated, we add 1 to the i value
    i <<- i + 1
    # We print the i value to the console
    cat_rule(as.character(i))
    # If the user select lower, then the text is 
    # passed through tolower, otherwise it's passed 
    # through toupper
    if (input$casefolding == "lower") {
      print(tolower(input$txt))
    } else  {
      print(toupper(input$txt))
    }
  })
}

shinyApp(ui, server)
```

And, as time goes by, we add another control flow to our `observe()`:

```{r 15-common-app-caveats-5, eval = FALSE}
i <- 0
library(shiny)
library(cli)
library(stringi)
ui <- function(){
  tagList(
    # We are adding a simple text input 
    # that will be printed to the console
    textInput("txt", "Text"),
    # We add a selectInput() to allow text transformation
    selectInput(
      "casefolding", 
      "Casefolding", 
      c("lower", "upper")
    ),
    # A new checkbox to reverse (or not) the input text
    checkboxInput("rev", "reverse")
  )
}

server <- function(input, output, session){
  observe({
    # Every time this reactive context 
    # is invalidated, we add 1 to the i value
    i <<- i + 1
    # We print the i value to the console
    cat_rule(as.character(i))
    # Use input_txt as a container for our input
    input_txt <- input$txt
    if (input$rev){
      # If the input$rev is select, we reverse the text
      input_txt <- stri_reverse(input_txt)
    }
    # If the user select lower, then the text is 
    # passed through tolower, otherwise it's passed 
    # through toupper
    if (input$casefolding == "lower") {
      print(tolower(input_txt))
    } else  {
      print(toupper(input_txt))
    }
  })
}

shinyApp(ui, server)
```

And it would be nice to keep the selected values in a reactive list, so that we can reuse it elsewhere.
And maybe you would like to add a checkbox so that the logs are printed to the console only if checked.

```{r 15-common-app-caveats-6, eval = FALSE}
i <- 0
library(shiny)
library(cli)
library(stringi)
ui <- function(){
  tagList(
    # We are adding a simple text input 
    # that will be printed to the console
    textInput("txt", "Text"),
    # We add a selectInput() to allow text transformation
    selectInput(
      "casefolding", 
      "Casefolding", 
      c("lower", "upper")
    ),
    # A new checkbox to reverse (or not) the input text
    checkboxInput("rev", "reverse")
  )
}

server <- function(input, output, session){
  # We are using a reactiveValues to keep this input value
  r <- reactiveValues()
  observe({
    # Every time this reactive context 
    # is invalidated, we add 1 to the i value
    i <<- i + 1
    # We print the i value to the console
    cat_rule(as.character(i))
    if (input$rev){
      # If the input$rev is select, we reverse the text
      r$input_txt <- stri_reverse(r$input_txt)
    } else {
      # Otherwise, we leave it as it is
      r$input_txt <- input$txt
    }
    # If the user select lower, then the text is 
    # passed through tolower, otherwise it's passed 
    # through toupper
    if (input$casefolding == "lower") {
      print(tolower(r$input_txt))
    } else  {
      print(toupper(r$input_txt))
    }
  })
}

shinyApp(ui, server)
```

Ok, now can you tell how many potential invalidation points we have here?
Three: whenever `input$txt`, `input$rev` or `input$casefolding` change.
Of course, three is not that much, but you get the idea.

Let's pause a minute and think about why we use `observe()` here.
To update the values inside `r$input_txt`, yes.
But do we need to use `observe()` for, say, updating `r$input_txt` under dozens of conditions, each time the user types a letter?
Possibly not.

We generally want our observer to update its content under a small, controlled number of inputs, i.e. with a controlled number of invalidation points.
And, what we often forget is that users do not type/select correctly on the first try.
No, they usually try and miss, restart, change things, amplifying the reactivity "over-happening".

Moreover, long `observe()` statements are hard to debug, and they make collaboration harder when the trigger to the observe logic can potentially live anywhere between line one and line 257 of your `observe()`.
That's why (well, in 99% of cases), it is safer to go with `observeEvent`, as it allows you to see at a glance the condition under which the content is invalidated and re-evaluated.
Then, if a reactive context is invalidated, **you know why**.
For example, here is where the reactive invalidation can happen (lines with a `*`)[^common-app-caveats-1]:

[^common-app-caveats-1]: Of course it's an over-simplification: the reactive context will not be invalidated in all of these contexts. The idea is to illustrate how `observe()` can lead to invalidation points that are spread all across the code bloc.

``` {.r}
observe({
    i <<- i + 1
    cat_rule(as.character(i))
*   if (input$rev){
*     r$input_txt <- stri_reverse(r$input_txt)
    } else {
*     r$input_txt <- input$txt
    }
*   if (input$casefolding == "lower") {
*     print(tolower(r$input_txt))
    } else  {
*     print(toupper(r$input_txt))
    }
})
```

Whereas in this refactored code using `observeEvent()`, it is easier to identify where the invalidation can happen:

``` {.r}
observeEvent( c(
*  input$rev, 
*  input$txt
),{
  i <<- i + 1
  cat_rule(as.character(i))
  if (input$rev){
    r$input_txt <- stri_reverse(r$input_txt)
  } else {
    r$input_txt <- input$txt
  }
  if (input$casefolding == "lower") {
    print(tolower(r$input_txt))
  } else  {
    print(toupper(r$input_txt))
  }
})
```

### Building triggers and watchers

To prevent this, one way to go is to create "flag" objects, which can be thought of as internal buttons to control what you want to invalidate: you create the button, set some places where you want these buttons to invalidate the context, and finally press these buttons.

These objects are launched with an `init` function, then these flags are triggered with `trigger()`, and wherever we want these flags to invalidate a reactive context, we `watch()` these flags.

The idea here is to get full control over the reactive flow: we only invalidate contexts when we want, making the general flow of the app more predictable.
These flags are available using the `{gargoyle}` [@R-gargoyle] package, that can be installed from GitHub with:

```{r 15-common-app-caveats-7, eval = FALSE}
# CRAN version 
install.packages("gargoyle")
# Dev version
remotes::install_github("ColinFay/gargoyle")
```

-   `gargoyle::init("this")` initiates a `"this"` flag: most of the time you will be generating them at the `app_server()` level.

-   `gargoyle::watch("this")` sets the flag inside a reactive context, so that it will be invalidated every time you `trigger("this")` this flag.

-   `gargoyle::trigger("this")` triggers the flags.

And, bonus, as these functions use the `session` object, they are available across all modules.
That also means that you can easily trigger an event inside a module from another one.

This pattern is, for example, implemented in `{hexmake}` [@R-hexmake] (though not with `{gargoyle}`), where the rendering of the image on the right is fully controlled by the [`"render"` flag](https://github.com/ColinFay/hexmake/blob/master/R/mod_right.R#L40).
The idea here is to allow complete control over when the image is recomputed: only when `trigger("render")` is called does the app regenerate the image, helping us lower the reactivity of the application.
That might seem like a lot of extra work, but that is definitely worth considering in the long run, as it will help in optimizing the rendering (fewer computations), and lowering the number of errors that can result from too much reactivity inside an application.

Here is a small example of this implementation, using an environment to store the value.
When using this pattern, we do not rely on any reactive value invalidating the reactive context: the second result is only displayed when the `"render2"` flag is triggered, giving us a full control on how the reactivity is propagated.

```{r 15-common-app-caveats-8, eval = FALSE}
library(shiny)
library(gargoyle)
ui <- function(){
  fluidPage(
    tagList(
      # Creating an action button to launch the computation
      actionButton("compute", "Compute"), 
      # Output for all runif()
      verbatimTextOutput("result"), 
      # This output will change only if runif() > 0.5
      verbatimTextOutput("result2"),
      # This button will reset x$results to 0, we use it 
      # to show that it won't launch a series of reactivity 
      # invalidation
      actionButton("reset", "Reset x")
    )
  )
}

server <- function(
  input, 
  output, 
  session
){
  
  # Mimic an R6 class, i.e. a non-reactive object
  x <- environment()
  
  # Creating two watchers
  init("render_result", "render_result2")
  
  observeEvent( input$compute , {
    # When the user presses compute, we launch runif()
    x$results <- runif(1)
    # Every time a new value is stored, we render result
    trigger("render_result")
    # Only render the second result if x$results is over 0.5
    if (x$results > 0.5){
      trigger("render_result2")
    }
  })
  
  output$result <- renderPrint({
    # Will be rendered every time
    watch("render_result")
    # require x$results before rendering the output
    req(x$results)
    x$results
  })
  
  
  output$result2 <- renderPrint({
    # This will only be rendered if trigger("render_result2")
    # is called
    watch("render_result2")
    req(x$results)
    x$results
  })
  
  observeEvent( input$reset , {
    # This resets x$results. This code block is here
    # to show that reactivity is not triggered in this app 
    # unless a trigger() is called
    x$results <-  0
    print(x$results)
  })
  
}

shinyApp(ui, server)

```

### Using R6 as data storage

One pattern we have also been playing with is storing the app business logic inside one or more R6 objects.
Why would we want to do that?

#### A. Sharing data across modules {.unnumbered}

Sharing an R6 object makes it simpler to create data that are shared across modules, but without the complexity generated by reactive objects, and the instability of using global variables.

Basically, the idea is to hold the whole logic of your **data** **reading/cleaning/processing/outputting inside an R6 class**.
An object of this class is then initiated at the top level of your application, and you can pass this object to the sub-modules.
Of course, this makes even more sense if you are combining it with the trigger/watch pattern from before!

```{r 15-common-app-caveats-9, eval = FALSE}
library(shiny)
data_cleaning_ui <- function(id){
  ns <- NS(id)
  tagList(
    # Defining the UI for your first module
    # [...]
  )
}

mod_data_cleaning_server <- function(id, r6){
  moduleServer( id, function(input, output, session){
    ns <- session$ns
    observeEvent( input$launch_cleaning , {
      # Once the launch_cleaning input is triggered, we 
      # use the internal method from our r6 object 
      r6$clean(arg1 = input$a, arg2 = input$b)
      # Triggering the plot
      trigger("plot")
    })
  })
}

plotting_ui <- function(id){
  ns <- NS(id)
  tagList(
    # Defining the UI for your second module
    # [...]
  )
}

mod_plotting_server <- function(id, r6){
  moduleServer( id, function(input, output, session){
    ns <- session$ns
    # Rendering, inside this second module, the plot based on the 
    # cleaning done in the other module
    output$plot <- renderPlot({
      # We use the trigger/watch pattern from before
      watch("plot")
      # Calling the plot() method from our R6 object
      r6$plot()
    })
  })
}


ui <- function(){
  tagList(
    # Putting our two module UIs here
    data_cleaning_ui("data_cleaning_ui"),
    plotting_ui("plotting_ui")
  )
}

server <- function(
  input, 
  output, 
  session
){
  # We start by creating a new instance of th
  r6 <- MyDataProcessing$new()
  # Passing this object to the two server functions
  mod_data_cleaning_server("data_cleaning_ui_1", r6)
  mod_plotting_server("plotting_ui_1", r6)

}

shinyApp(ui, server)

```

#### B. Be sure it is tested {.unnumbered}

During the process of building a robust `{shiny}` app, we strongly suggest that you test as many things as you can.
This is where using an R6 for the business logic of your app makes sense: this allows you to build the whole testing of your application data logic outside of any reactive context: you simply build unit tests just as any other function.

For example, let's say we have the following R6 generator:

```{r 15-common-app-caveats-10}
MyData <- R6::R6Class(
  "MyData", 
  # Defining our public methods, that will be 
  # the dataset container, and a summary function
  public = list(
    data = NULL,
    initialize = function(data){
      self$data <- data
    }, 
    summarize = function(){
      summary(self$data)
    }
  )
)
```

We can then build a test for this class using `{testthat}`:

```{r 15-common-app-caveats-11}
library(testthat, warn.conflicts = FALSE)
test_that("R6 Class works", {
  # We define a new instance of this class, that will contain 
  # the mtcars data.frame
  my_data <- MyData$new(mtcars)
  # We will expect my_data to have two classes: 
  # "MyData" and "R6"
  expect_is(my_data, "MyData")
  expect_is(my_data, "R6")
  # And the summarize method to return a table
  expect_is(my_data$summarize(), "table")
  # We would expect the data contained in the object 
  # to match the one taken as input to new()
  expect_equal(my_data$data, mtcars)
  # And the summarize method to be equal to the summary()
  #  on the input object
  expect_equal(my_data$summarize(), summary(mtcars))
})
```

Using R6 allows to rely on these battle-tested tools when it comes to testing functions, something which is made more complex when using other patterns like `reactiveValues()`.

### Logging reactivity with `{whereami}`

Getting a good sense of how reactivity is actually working in your app is not an easy task: the reactivity logic is a graph, and it happens very quickly when you run the app, so it's very hard to follow everything.

`whereami::whereami()`'s [@R-whereami] goal is simple: informing you about where it is called, i.e. from what file and at which line, and how many times.
For example, if you add the following piece of code to your `app_server()`, the location of the function call will be printed to the logs.

```{r 15-common-app-caveats-12, eval = FALSE}
whereami::cat_where( whereami::whereami() )
```

    ── Running server(...) at app_server.R#9 (2) ───────────────

Combining `cat_where()` will implement a reactive logging to your console while developing: that way, you can instantaneously know what reactive contexts are invalidated while using the application.
Of course, you still have to implement it by hand, but that is definitely worth the effort: seeing in real time, in your console, which line is run allows you to detect unexpected behavior.
For example, you will be able to see that the `observeEvent()` from `mod_main.R#79` has been called 17 times when launching the app, which might be an unexpected behavior.

The screenshot in Figure \@ref(fig:15-common-app-caveats-13) shows what a `{whereami}` log might look like, here, for the `{hexmake}` application.

(ref:whereami) `{whereami}` output for `{hexmake}`.

```{r 15-common-app-caveats-13, echo=FALSE, fig.cap="(ref:whereami)", out.width="100%"}
knitr::include_graphics("img/whereami.png")
```

And bonus, once the app is closed, you can get a list of all the "counters" with `whereami::counter_get()`, and how many times they each have been called, and `plot(whereami::counter_get())` will draw a raw plot of the various counters, as shown in Figure \@ref(fig:15-common-app-caveats-14).

(ref:whereamiplot) plot of `{whereami}` counters.

```{r 15-common-app-caveats-14, echo=FALSE, fig.cap="(ref:whereamiplot)", out.width="100%"}
knitr::include_graphics("img/plot_whereami.png")
```

\newpage

## R does too much

### Rendering the UI from the server side

There are many reasons we would want to change things on the UI based on what happens in the server: changing the choices of a `selectInput()` based on the columns of a table which is uploaded by the user, showing and hiding pieces of the app according to an environment variable, allowing the user to create an indeterminate number of inputs, etc.

Chances are that to do that, you have been using the `uiOutput()` and `renderUI()` functions from `{shiny}` [@R-shiny].
Even if convenient, and the functions of choice in some specific context, this pair of functions makes R do a little bit too much: you are making R regenerate the whole UI component instead of changing only what you need, which can be a suboptimal, be it from the user point of view, or from a developer perspective.

**One of the instance in which this pattern might not be optimal is in the case where your visitors do not have an high-speed internet or when visiting and using a smartphone, contexts where every byte counts**.
Rendering large elements from the server side in your `{shiny}` app means that these elements will have to transit through the socket, i.e. they need to be sent by the server, and downloaded by the browser.
In this case, the smaller the message size the better!

From the developer perspective, you will create code that is harder to reason about, as we are used to having the UI parts in the UI functions (but that is not related to performance).

Here are three strategies to code without `uiOutput()` and `renderUI()`.

#### A. Implement UI events in JavaScript {.unnumbered}

> Mixing languages is better than writing everything in one, if and only if using only that one is likely to overcomplicate the program.\
> 
> _The Art of UNIX Programming_ [@ericraymond2003]

We will see in the last chapter of this book how you can integrate JS inside your `{shiny}` app, and how even basic functions can be useful for making your app server smaller.
For example, compare:

```{r 15-common-app-caveats-15, eval = FALSE}
library(shiny)
ui <- function(){
  tagList(
    # Adding a button with an onclick event, 
    # that will show or hide the plot
    actionButton(
      "change", 
      "show/hide graph", 
      # The toggle() function hide or show the queried element
      onclick = "$('#plot').toggle()"
    ), 
    plotOutput("plot")
  )
}

server <- function(
  input, 
  output, 
  session
){
  output$plot <- renderPlot({
    # This renderPlot will only be called once
    cli::cat_rule("Rendering plot")
    plot(iris)
  })
}

shinyApp(ui, server) 
```

to

```{r 15-common-app-caveats-16, eval = FALSE}
library(shiny)
ui <- function(){
  tagList(
    # We use a pattern without JavaScript
    actionButton("change", "show/hide graph"), 
    plotOutput("plot")
  )
}

server <- function(
  input, 
  output, 
  session
){
  
  output$plot <- renderPlot({
    # Here, every time the button is clicked, this reactive 
    # context will be invalidated, and the code re-evaluated
    cli::cat_rule("Rendering plot")
    # Simulate a show and hide pattern
    req(input$change %% 2 == 0)
    plot(iris)
  })
  
}

shinyApp(ui, server)
```

The result is the same, but the first version is shorter and easier to understand: we have one button, and the behavior of the button is self-contained.
The second solution redraws the plot every time the `reactiveValues` is updated, making R compute way more than it should, whereas with the JavaScript-only solution, the plot is not recomputed every time you need to show it: the plot is drawn by R only once.

At a local level, the improvements described in this section will not make your application way faster: for example, rendering UI elements (let's say rendering a simple title) will not be computationally heavy.
But at a global level, less UI computation from the server side helps the general rendering of the app: let's say you have an output that takes 3 seconds to run, then if the whole UI + output is to be rendered on the server side, the whole UI stays blank until everything is computed.

Compare:

```{r 15-common-app-caveats-17, eval = FALSE}
library(shiny)
ui <- function(){
  tagList(
    # We make the whole UI be generated by R
    uiOutput("caption")
  )
}

server <- function(
  input, 
  output, 
  session
){
  output$caption <- renderUI({
    # Simulate something that takes 3 seconds to run
    Sys.sleep(3)
    # Returning the UI
    tagList(
      h3("test"), 
      shinipsum::random_text(10)
    )
    
  })
}

shinyApp(ui, server)
```

to

```{r 15-common-app-caveats-18, eval = FALSE}
library(shiny)
ui <- function(){
  tagList(
    # Only the text input will be rendered by R
    h3("test"), 
    textOutput("caption")
  )
}

server <- function(
  input, 
  output, 
  session
){
  output$caption <- renderText({
    # Here, we only render the text, not the whole UI
    Sys.sleep(3)
    shinipsum::random_text(10)
  })
}

shinyApp(ui, server)
```

In the first example, the UI will wait for the server to have rendered, while in the second we will first see the title, then the rendered text after a few seconds.
That approach makes the user experience better: they know that something is happening, while a completely blank page is confusing.

Also, because R is single threaded, manipulating DOM elements from the server side causes R to be busy doing these DOM manipulations while it could be computing something else.
And let's imagine it takes a quarter of a second to render the DOM element. 
That is a full second for rendering four of them, while R should be busy doing something else!

#### B. `update*` inputs {.unnumbered}

Almost every `{shiny}` input, even the custom ones from packages, come with an `update_` function that allows us to change the input values from the server side, instead of re-creating the UI entirely.
For example, here is a way to update the content of a `selectInput` from the server side:

```{r 15-common-app-caveats-19, eval = FALSE}
library(shiny)
ui <- function(){
  tagList(
    # We start the selectInput empty
    selectInput("species", "Species", choices = NULL), 
    # The selectInput will be populate 
    # when the update button is pressed
    actionButton("update", "Update")
  )
}

server <- function(
  input, 
  output, 
  session
){
  observeEvent( input$update , {
    # Update the selectInput with the species from iris
    spc <- unique(iris$Species)
    updateSelectInput(
      session, 
      "species", 
      choices = spc, 
      selected = spc[1]
    )
  })
  
}

shinyApp(ui, server)
```

This switch to `updateSelectInput` makes the code easier to reason about as the `selectInput` is where it should be: inside the UI, instead of another pattern where we would use `renderUI()` and `uiOutput()`.
Plus, with the `update` method, we are only changing what is needed, not re-generating the whole input.

#### C. `insertUI` and `removeUI` {.unnumbered}

Another way to dynamically change what is in the UI is with `insertUI()` and `removeUI()`.
It is more global than the solution we have seen before with setting the `reactiveValue` to `NULL` or to a value, as it allows us to target a larger UI element: we can insert or remove the whole input, instead of having the DOM element inserted but empty.
This method allows us to have a smaller DOM: `<div>` that are not rendered are not generated empty, they are simply not there.

Two things to note concerning this method, though:

-   Removing an element from the app will not delete the input from the input list. In other words, if you have `selectInput("x", "x")`, and you remove this input using `removeUI()`, you will still have `input$x` in the server.

For example, in the following example, the `input$val` value will not be removed once you have called `removeUI(selector = "#val")`.

```{r 15-common-app-caveats-20, eval = FALSE}
library(shiny)
ui <- function(){
  tagList(
    # Creating a text input that will be removed
    #  from the UI whenever the remove button is pressed
    textInput("value", "Value", "place"),
    actionButton("remove", "Remove UI")
  )
}

server <- function(
  input, 
  output, 
  session
){
  
  observeEvent( input$remove , {
    # When the button is pressed,
    #  the textInput will be removed from the UI
    removeUI(selector = "#value")
  })
  
  observe({
    # We observe input$value every second. 
    # You'll realize that even after the UI
    # is removed, input$value is still available.
    invalidateLater(1000)
    print(input$value)
  })
  
}

shinyApp(ui, server)
```

-   Both these functions take a `jQuery` selector to select the element in the UI. We will introduce these selectors in Chapter \@ref(using-javascript).

### Too much data in memory

If you are building a `{shiny}` application, there is a great chance you are building it to analyze data.
If you are dealing with large datasets, **you should consider deporting the data handling and computation to an external database system: for example, to an SQL database**.
Why?
Because these systems have been created to handle and manipulate data on disk: in other words, it will allow you to perform operations on your data without having to clutter R memory with a large dataset.

For example, if you have a `selectInput()` that is used to perform a filter on a dataset, you can do that filter straight inside SQL, instead of bringing all the data to R and then doing the filter.
That is even more necessary if you are building the app for a large number of users: for example if one `{shiny}` session takes up to 300MB, multiply that by the number of users that will need one session, and you will have a rough estimate of how much RAM you will need.
On the contrary, if you reduce the data manipulation so that it is done by the back-end, you will have, let's say, one database with 300MB of data, so the database size will remain (more or less constant), and the only RAM used by `{shiny}` will be the data manipulation, not the data storage.
That's even more true now that almost any operation you can do today in `{dplyr}` [@R-dplyr] would be doable with an SQL back-end, and that is the purpose of the `{dbplyr}` [@R-dbplyr] package: translates `{dplyr}` code into SQL.

If using a database as a back-end seems a little bit far-fetched right now, that is how it is done in most programming languages: if you are building a web app with NodeJS or Python for example, and need to interact with data, nothing will be stored in RAM: you will be relying on an external database to store your data.
Then your application will be used to make queries to this database back-end.

## Reading data

`{shiny}` applications are a tool of choice when it comes to analyzing data.
But that also means that these data have to be imported/read at some point in time, and reading data can be time consuming.
How can we optimize that?
In this section, we will take a look at three strategies: including datasets inside your application, using R packages for fast data reading, and when and why you should move to an external database system.

### Including data in your application

If you are building your application using the `{golem}` [@R-golem] framework, you are building your application as a package.
R packages provide a way to include internal datasets, which can then be used as objects inside your app.
This is the solution you should go for if your data are never to rarely updated: the datasets are created during package development, then included inside the build of your package.
The plus side of this approach is that it makes the data fast to read, as they are serialized as R native objects.

To include data inside your application, you can use the `usethis::use_data_raw( name = "my_dataset", open = FALSE )` command, which is inside the `02_dev.R` script inside the `dev/` folder of your source application (if you are building the app with `{golem}`).
This will create a folder called `data-raw` at the root of your application folder, with a script to prepare your dataset.
Here, you can read the data, modify it if necessary, and then save it with `usethis::use_data(my_dataset)`.
Once this is done, you will have access to the `my_dataset` object inside your application.

This is, for example, what is done in the `{tidytuesday201942}` [@R-tidytuesday201942] application, in [data-raw/big\_epa\_cars.R](https://github.com/ColinFay/tidytuesday201942/blob/master/data-raw/big_epa_cars.R): the CSV data are read there, and then used as an internal dataset inside the application.

### Reading external datasets

Other applications use data that are not available at build time: they are created to analyze data that are uploaded by users, or maybe they are fetched from an external service while using the app (for example, by calling an API).
When you are building an application for the "user data" use case, the first thing you will need is to provide users a way to upload their dataset: `shiny::fileInput()`.

One crucial thing to keep in mind when it comes to using user-uploaded files is that you have to be (very) strict with the way you handle files:

-   Always specify what type of file you want: `shiny::fileInput()` has an `accept` parameter that allows you to set one or more [MIME types](https://en.wikipedia.org/wiki/Media_type) or extensions. When using this argument (for example, with `text/csv`, `.csv`, or `.xlsx`), the user will only be able to select a subset of files from their computer: the ones that match the type.
-   Always perform checks once the file is uploaded, even more if it is tabular data: column type, naming, empty rows, etc. The more you check the file for potential errors, the less your application is likely to fail to analyze this uploaded dataset.
-   If the data reading takes a while, do not forget to add a visual progression cue: a `shiny::withProgress()` or tools from the [`{waiter}`](https://github.com/JohnCoene/waiter) package.

Whenever you offer a user the possibility to upload anything, you can be sure that at some point, they will upload a file that will make the app crash.
By setting a specific MIME type and by doing a series of checks once the file is uploaded, you will make your application more stable.
Finally, having a visual cue that "something is happening" is very important for the user experience, because "something is happening" is better than not knowing what is happening, and it may also prevent the user from clicking again and again on the upload button, or worse, they will stop using the app.

Now that we have our `fileInput()` set, how do we read these data as fast as possible?
There are several options depending on the type of data you are reading.
Here are some packages that can make the file reading faster:

-   For a tabular, flat dataset (typically csv, tsv, or text), `{vroom}` [@R-vroom] can read data at a 1.40 GB/sec speed. The `fread()` function from `{data.table}` [@R-data.table] is also fast at reading delimited files.
-   For JSON files, `{jsonlite}` [@jsonlite2014]. Or more recently, `{RcppSimdJSON}` [@R-RcppSimdJson], which is a binding to the `simdjson` C++ library.
-   If you need to read Excel files inside your app, `{readxl}` [@R-readxl] offers a binding to the [`RapidXML`](http://rapidxml.sourceforge.net/) C++ library, which reads Excel files fast.
-   Most files exported from statistical software (SAS, SPSS, etc.) can be read using either the `{foreign}` [@R-foreign] or `{haven}` [@R-haven] packages.

### Using external databases

Another type of data analyzed in a shiny application is data that is contained inside an external database.
Databases are heavily used in the data science world and in software engineering as a whole.
Databases come with APIs and drivers that help retrieve and transfer data: be it SQL, NoSQL, or even a graph.

Using a database is one of the solutions for making your app smaller and more efficient in the long run, especially if you need to scale your app to thousands of visitors.
Indeed, **if you plan on having your app scale to numerous people, that will mean that a lot of R processes will be triggered. And if your data is contained in your app, this will mean that each R process will take a significant amount of RAM if the dataset is large**.
For example, if your dataset alone takes \~300 MB of RAM, that means that if you want to launch the app 10 times, you will need \~3GB of RAM.
On the other hand, if you decide to switch these data to an external database, it will lower the global RAM need: the DB will take these 300MB of data, and each shiny application will make a  request to the database.
For instance, if the database needs 300MB, and one shiny app 50MB, then 10 apps will be 300MB (for the DB) + 50MB \* 10 (for the 10 apps).
In practice, other things are to be considered: making database requests can be computationally expensive, and might need some network adjustments, but you get the idea.

How does one choose between database back-end?
Well, first of all you need to see what is available in the environment the application will be deployed: maybe the company you are building the application for already has database servers deployed.
If ever you are free to choose any database as a back-end, your choice should be driven by what kind of operations you want to make on these databases.
**For example, SQL databases are designed to store tabular data, and they tend to be very fast when it comes to reading data: so if you have one or more large data.frames you want to use inside your application, and with no specific update of these data, an SQL back-end can be the perfect choice**.
On the other hand, a NoSQL database like MongoDB will be faster when it comes to doing write operations, and can store any kind of object: for example, `{hexmake}` can use a MongoDB back-end to store RDS files.
But that comes with a price: read calls are a little bit slower, and you might have to work a little bit more on handling the JSON results that come out of MongoDB.
Another example of an app that uses on an external database is `{databasedemo}`, available at [engineering-shiny.org/databasedemo/](https://engineering-shiny.org/databasedemo/).
Feel free to follow this link for more information about this application!

Covering all the available types of databases and the packages associated with each is a very, very large topic: there are dozens of database systems, and as many (if not more) packages to interact with them.
For more extensive coverage of using databases in R, please follow these resources:

-   [Databases using R](https://db.rstudio.com/), the official RStudio documentation around databases and R.

-   [colinfay/r-db](https://colinfay.me/r-db/), a Docker image that bundles the toolchain for a lot of database systems for R.

-   [CRAN Task View: Databases with R](https://cran.r-project.org/web/views/Databases.html): the official task view from CRAN with a series of packages for database manipulation

### Data-source checklist

How to choose between these three methodologies:

```{r 15-common-app-caveats-21, echo= FALSE}
knitr::kable(
  data.frame(
    Choice = c("Package data", "Reading files", "External DataBase"), 
    Update = c("Never to very rare", "Uploaded by Users", "Never to Streaming"), 
    Size = c("Low to medium", "Preferably low", "Low to Big")
  )
)
```