Computing-on-the-language.rmd

---
title: Non-standard evaluation
layout: default
---

```{r, echo = FALSE}
library(pryr)
```

```{r, echo = FALSE, eval = FALSE}
library(stringr)
special <- c("substitute", "deparse")
funs <- lapply(special, function(x) {
  match <- paste0("^", x, "$")
  c(
    find_funs("package:base", fun_calls, match),
    find_funs("package:utils", fun_calls, match),
    find_funs("package:stats", fun_calls, match)
  )
})
names(funs) <- special
names(Filter(function(x) length(x) == 2, ggplot2:::invert(funs)))
names(Filter(function(x) length(x) == 1, ggplot2:::invert(funs)))
```


# Non-standard evaluation {#nse}

> "Flexibility in syntax, if it does not lead to ambiguity, would seem a
> reasonable thing to ask of an interactive programming language."
>
> --- Kent Pitman, <http://www.nhplace.com/kent/Papers/Special-Forms.html>

R has powerful tools for computing not only on values, but on the actions that lead to those values. These tools are powerful and magical, and one of the most surprising features if you're coming from another programming language. Take the following simple snippet of code that draws a sine curve:

```{r plot-labels}
x <- seq(0, 2 * pi, length = 100)
sinx <- sin(x)
plot(x, sinx, type = "l")
```

Look at the labels on the axes. How did R know that the variable on the x axis was called `x` and the variable on the y axis was called `sinx`? In most programming languages, you can only access values of the function arguments. In R, you can also access the code used to compute them. This makes __non-standard evaluation__ (NSE) possible, and is particularly useful for functions designed to facilitate interactive data analysis because it can dramatically reduce the amount of typing.

The goal of this chapter is to help you understand NSE in existing R code, and to show you how to write your own functions that use. In [Capturing expressions](#capturing-expressions) you'll learn how to capture unevaluated expressions using `substitute()`. In [non-standard evaluation](#subset) you'll learn how `subset()` combines `substitute()` with `eval()` to allow succinctly to select rows from a data frame. [Scoping issues](#scoping-issues) will teach you about the scoping issues that arise in NSE, and show you how to resolve them.

NSE is great for interactive use, but can be hard to program with. [Calling from another function](#calling-from-another-function) shows why every function that uses NSE should have an escape hatch, a version that uses regular evaluation. [Substitute](#substitute) shows you how you can use `substitute()` to modify expressions, which makes it suitable as a general escape hatch.

While powerful, NSE makes code substantially more difficult to reason about. The chapter concludes with a look at the downsides of NSE in [The downsides](#nse-downsides).

### Prereqs

Before reading this chapter, make sure you're familiar with environments ([Environments](#environments)) and lexical scoping ([Lexical scoping](#lexical-scoping)). You'll also need to install the pryr package with `devtools::install_github("hadley/pryr")`. Some exercises require the plyr package, which you can install from CRAN with `install.packages("plyr")`.

## Capturing expressions

`substitute()` is the tool that makes non-standard evaluation possible. It looks at a function argument, and instead of seeing the value, it sees the code used to compute the value:

```{r}
f <- function(x) {
  substitute(x)
}
f(1:10)

x <- 10
f(x)

y <- 13
f(x + y ^ 2)
```

We won't worry about exactly what `substitute()` returns (that's the topic of [the following chapter](#metaprogramming)), but we'll call it an expression.

`substitute()` works because function arguments in R are a special object called a __promise__. A promise captures the expression needed compute the value and the environment in which to compute. You're not normally aware of promises because the first time you access a promise its code is evaluated in its environment, returning a value.

One another function is usally paired with `substitute()`: `deparse()`. It takes the result of `substitute()` (an expression) and turns it to a character vector.

```{r}
g <- function(x) deparse(substitute(x))
g(1:10)
g(x)
g(x + y ^ 2)
```

There are a lot of functions in base R that use these ideas. Some use them to avoid quotes:

```{r, eval = FALSE}
library(ggplot2)
# the same as
library("ggplot2")
```

Other functions, like `plot.default()`, use them to provide default labels:

```{r, eval = FALSE}
plot.default <- function(x, y = NULL, xlabel = NULL, ylabel = NULL, ...) {
    ...
    xlab <- if (is.null(xlabel) && !missing(x)) deparse(substitute(x))
    ylab <- if (is.null(xlabel) && !missing(y)) deparse(substitute(y))
    ...
}
```

(The real code is a little more complicated because `plot()` uses `xy.coords()` to standardise the multiple ways that `x` and `y` can be supplied)

`data.frame()` labels variables with the expression used to compute them:

```{r}
x <- 1:4
y <- letters[1:4]
names(data.frame(x, y))
```

This wouldn't be possible in most programming langauges because functions usually only see values (e.g. `1:4` and `c("a", "b", "c", "d")`), not the expressions that created them (`x` and `y`).

### Exercises

1.  There's one important feature of `deparse()` to be aware of when
    programming with it: can return multiple strings if the input is long.
    For example, calling `g()` as follows will a vector of length two.

    ```{r}
    g(a + b + c + d + e + f + g + h + i + j + k + l + m + n + o + p + q +
      r + s + t + u + v + w + x + y + z)
    ```

    Why does this happen? Carefully read the documentation. Can you write a
    wrapper around `deparse()` that always returns a single string?

2.  Why does `as.Date.default()` use `substitute()` and `deparse()`?
    Why does `pairwise.t.test()` use them? Read the source code.

3.  `pairwise.t.test()` is written under the assumption that `deparse()`
    always returns a length one character vector. Can you construct an
    input that violates this expectation? What happens?

4.  `f()`, defined above, just calls `substitute()`. Why can't we use it
    to define `g()`? In other words, what will the following code return?
    First make a prediction, then run the code and think about the results.

    ```{r, eval = FALSE}
    f <- function(x) substitute(x)
    g <- function(x) deparse(f(x))
    g(1:10)
    g(x)
    g(x + y ^ 2 / z + exp(a * sin(b)))
    ```

5.  The pattern `deparse(substitute(x))` is very common in base R code.
    Why can't you write a function that does both things in one step?

## Non-standard evaluation in subset {#subset}

Just printing out the expression used to generate an argument value is useful, but we can do more with the unevaluated code. For example, take `subset()`. It's a useful interactive shortcut for subsetting data frames: instead of repeating the name of data frame you're working with again and again, you can save some typing:

```{r}
sample_df <- data.frame(a = 1:5, b = 5:1, c = c(5, 3, 1, 4, 1))

subset(sample_df, a >= 4)
# equivalent to:
# sample_df[sample_df$a >= 4, ]

subset(sample_df, b == c)
# equivalent to:
# sample_df[sample_df$b == sample_df$c, ]
```

Subset is special because the expressions `a >= 4` or `b == c` aren't evaluated in the global environment: instead they're evaluated in the data frame. In other words, `subset()` implements different scoping rules so instead of looking for those variables in the current environment, `subset()` looks in the specified data frame. This is the essence of non-standard evaluation.

How does `subset()` work?  We've already seen how to capture the expression that computes an argument, rather than its result, so we just need to figure out how to evaluate that expression in the right context, so that `x` is interpreted as `sample_df$x`, not `globalenv()$x`. To do this we need `eval()`, which takes an expression and evaluates it in the specified environment.

Before we can explore `eval()` we need one more useful function: `quote()`. It captures an unevaluated expression like `substitute()`, but you don't need to use it inside a function. This makes it useful for interactive experimentation.

```{r}
quote(1:10)
quote(x)
quote(x + y ^ 2)
```

We need `quote()` to experiment with `eval()` because the first argument to `eval()` is an expression. If you only provide one argument, it evaluates the expression in the current environment. This makes `eval(quote(x))` exactly equivalent to typing `x`, regardless of what `x` is:

```{r, error = TRUE}
eval(quote(x <- 1))
eval(quote(x))

eval(quote(y))
```

Note that `quote()` and `eval()` are effectively opposites. In the example below, each `eval()` peels off one layer of quoting.

```{r}
quote(2 + 2)
eval(quote(2 + 2))

quote(quote(2 + 2))
eval(quote(quote(2 + 2)))
eval(eval(quote(quote(2 + 2))))
```

The second argument to `eval()` controls the environment in which the code is executed:

```{r}
x <- 10
eval(quote(x))

e <- new.env()
e$x <- 20
eval(quote(x), e)
```

Instead of an environment, the second argument can also be a list or a data frame.  This works because lists and data frames bind names to values in a similar way to environments.

```{r}
eval(quote(x), list(x = 30))
eval(quote(x), data.frame(x = 40))
```

This gives us one part of `subset()`:

```{r}
eval(quote(a >= 4), sample_df)
eval(quote(b == c), sample_df)
```

A common mistake when first starting to use `eval()` is to forget to quote the first argument. Compare the results in the following example:

```{r, error = TRUE}
a <- 10
eval(quote(a), sample_df)
eval(a, sample_df)

eval(quote(b), sample_df)
eval(b, sample_df)
```

We can use `eval()` and `substitute()` to write `subset()`. First we capture the call representing the condition, then evaluate it in the context of the data frame and use the result for subsetting:

```{r}
subset2 <- function(x, condition) {
  condition_call <- substitute(condition)
  r <- eval(condition_call, x)
  x[r, ]
}
subset2(sample_df, a >= 4)
```

### Exercises

1.  Implement your own version of `quote()` using `substitute()`.

2.  What will this code return?

    ```{r, eval = FALSE}
    eval(quote(eval(quote(eval(quote(2 + 2))))))
    eval(eval(quote(eval(quote(eval(quote(2 + 2)))))))
    quote(eval(quote(eval(quote(eval(quote(2 + 2)))))))
    ```

3.  `subset2()` has a bug if you use it with a single column data frame.
    What should the following code return? How can you modify `subset2()`
    so it returns the correct type of object?

    ```{r}
    sample_df2 <- data.frame(x = 1:10)
    subset2(sample_df2, x > 8)
    ```

4.  What happens if you use `quote()` instead of `substitute()` inside of
    `subset2()`?

4.  The real subset function (`subset.data.frame()`) removes missing
    values in the condition. Modify `subset2()` to also drop these rows.

5.  The real subset function also performs variable selection. It allows you
    to work with variable names like they are positions, so you can do things
    like `subset(mtcars, , -cyl)` to drop the cylinder variable, or
    `subset(mtcars, , disp:drat)` to select all the variables between `disp`
    and `drat`.  How does it work? I've made it easier to
    understand by extracting it out into its own function.

    ```{r, eval = FALSE}
    select <- function(df, vars) {
      vars <- substitute(vars)
      var_pos <- setNames(as.list(seq_along(df)), names(df))
      pos <- eval(vars, var_pos)
      df[, pos, drop = FALSE]
    }
    select(mtcars, -cyl)
    ```

6.  What does `evalq()` do? Use it to reduce the amount of typing for the
    examples above that use both `eval()` and `quote()`

## Scoping issues

It certainly looks like our `subset2()` function works. But since we're working with expressions instead of values, we need to test a little more carefully. For example, you might expect that the following uses of `subset2()` should all return the same value because each variable refers to the same value:

```{r, error = TRUE}
y <- 4
x <- 4
condition <- 4
condition_call <- 4

subset2(sample_df, a == 4)
subset2(sample_df, a == y)
subset2(sample_df, a == x)
subset2(sample_df, a == condition)
subset2(sample_df, a == condition_call)
```

What's going wrong? You can get a hint from the variable names I've chosen: they are all variables defined inside `subset2()`. If `eval()` can't find the variable inside the data frame (its second argument), it looks in the environment of `subset2()`. That's obviously not what we want, so we need some way to tell `eval()` to look somewhere else if it can't find the variables in the data frame.

The key is the third argument to `eval()`: `enclos`. This allows us to specify a parent (or enclosing) environment for objects that don't have one (like lists and data frames). If the binding is not found in `env`, `eval()` will next look in `enclos`, and the parents of `enclos`. `enclos` is ignored if `env` is a real environment. We want to look for `x` in the environment from which `subset2()` was called. In R terminology this is called the __parent frame__ and is accessed with `parent.frame()`. This is an example of [dynamic scope](http://en.wikipedia.org/wiki/Scope_%28programming%29#Dynamic_scoping) because the values come from the location where the function was called, not where it was defined.

With this modification our function works:

```{r}
subset2 <- function(x, condition) {
  condition_call <- substitute(condition)
  r <- eval(condition_call, x, parent.frame())
  x[r, ]
}

x <- 4
subset2(sample_df, a == x)
```

Using `enclos` is just a shortcut for converting a list or data frame to an environment. We can get the same behaviour by using `list2env()` to turn a list into an environment with an explicit parent:

```{r}
subset2a <- function(x, condition) {
  condition_call <- substitute(condition)
  env <- list2env(x, parent = parent.frame())
  r <- eval(condition_call, env)
  x[r, ]
}

x <- 5
subset2a(sample_df, a == x)
```

When using NSE it's also a good idea to test that your code works outside of the global environment:

```{r}
f <- function() {
  x <- 5
  subset2a(sample_df, a == x)
}
f()
```

### Exercises

1.  What does `transform()` do? Read the documentation. How does it work?
    Read the source code for `transform.data.frame()`. What does
    `substitute(list(...))` do? Create a function that does only that
    and experiment with it.

2.  `plyr::arrange()` works similarly to `subset()`, but instead of selecting
    rows, it reorders them. How does it work?  What does
    `substitute(order(...))` do?

3.  `plyr::mutate()` is similar to `transform()` but it applies the
    transformations sequentially so that transformation can refer to columns
    that were just created:

    ```{r, eval = FALSE}
    df <- data.frame(x = 1:5)
    transform(df, x2 = x * x, x3 = x2 * x)
    plyr::mutate(df, x2 = x * x, x3 = x2 * x)
    ```

    How does mutate work? What's the key difference between `mutate()` and
    `transform()`?

4.  What does `with()` do? How does it work? Read the source code for
    `with.default()`.

5.  What does `within()` do? How does it work? Read the source code for
    `within.data.frame()`. Why is the code so much more complex than
    `with()`?

## Calling from another function

Typically, computing on the language is most useful for functions called directly by the user, not by other functions. For example `subset()` saves typing but it's difficult to use non-interactively, from another function. For example, imagine we want a function that randomly reorders a subset of the data. A nice way to write that function would be to compose a function for random reordering and a function for subsetting. Let's try that:

```{r}
subset2 <- function(x, condition) {
  condition_call <- substitute(condition)
  r <- eval(condition_call, x, parent.frame())
  x[r, ]
}

scramble <- function(x) x[sample(nrow(x)), ]

subscramble <- function(x, condition) {
  scramble(subset2(x, condition))
}
```

But it doesn't work:

```{r, error = TRUE}
subscramble(sample_df, a >= 4)
# Error in eval(expr, envir, enclos) : object 'a' not found
traceback()
#> 5: eval(expr, envir, enclos)
#> 4: eval(condition_call, x, parent.frame()) at #3
#> 3: subset2(x, condition) at #1
#> 2: scramble(subset2(x, condition)) at #2
#> 1: subscramble(sample_df, a >= 4)
```

What's gone wrong? To figure it out, lets `debug()` subset and work through the code line-by-line:

```{r, eval = FALSE}
debugonce(subset2)
subscramble(sample_df, a >= 4)
#> debugging in: subset2(x, condition)
#> debug at #1: {
#>     condition_call <- substitute(condition)
#>     r <- eval(condition_call, x, parent.frame())
#>     x[r, ]
#> }
n
#> debug at #2: condition_call <- substitute(condition)
n
#> debug at #3: r <- eval(condition_call, x, parent.frame())
r <- eval(condition_call, x, parent.frame())
#> Error in eval(expr, envir, enclos) : object 'a' not found
condition_call
#> condition
eval(condition_call, x)
#> Error in eval(expr, envir, enclos) : object 'a' not found
Q
```

Can you see what the problem is? `condition_call` contains the expression `condition` so when we try to evaluate that it evaluates `condition` which has the value `a >= 4`. This can't be computed in the parent environment because it doesn't contain an object called `a`. If `a` is set in the global environment, far more confusing things can happen:

```{r}
a <- 4
subscramble(sample_df, a == 4)

a <- c(1, 1, 4, 4, 4, 4)
subscramble(sample_df, a >= 4)
```

This is an example of the general tension between functions that are designed for interactive use and functions that are safe to program with. A function that uses `substitute()` might save typing, but it's difficult to call from another function. As a developer you should always provide an escape hatch: an alternative version that uses standard evaluation. In this case, we could write a version of `subset2()` that takes a quoted expression:

```{r}
subset2_q <- function(x, condition) {
  r <- eval(condition, x, parent.frame())
  x[r, ]
}
```

I usually suffix these functions with `q` to indicate that they take a quoted call.  Most users won't need them so the name can be a little longer. We can then rewrite both `subset2()` and `subscramble()` to use `subset2_q()`:

```{r}
subset2 <- function(x, condition) {
  subset2_q(x, substitute(condition))
}

subscramble <- function(x, condition) {
  condition <- substitute(condition)
  scramble(subset2_q(x, condition))
}

subscramble(sample_df, a >= 3)
subscramble(sample_df, a >= 3)
```

Base R functions tend to use a different sort of escape hatch. They often have an argument that turns off NSE. For example, `require()` has `character.only = TRUE`. I don't think using an argument to change the behaviour of another argument is a good idea because it means you must completely and carefully read all of the function arguments to understand what one function argument means. Since you can't understand the effect of each argument in isolation, it's harder to predict what the function will do.

### Exercises

1.  The following function attempts to figure out if the input is already
    a quoted expression using `is.call()`. Why wont't it work?

    ```{r}
    is.call(123)
    is.call(quote(a == b))

    subset3 <- function(x, condition) {
      if (!is.call(condition)) {
        condition <- substitute(condition)
      }
      r <- eval(condition, x)
      x[r, ]
    }
    ```

2.  The following R functions all use non-standard evaluation. For each,
    describe how it uses non-standard evaluation. Read the documentation
    to determine the escape hatch: how do you force the function to use
    standard evaluation rules?
    * `rm()`
    * `library()` and `require()`
    * `substitute()`
    * `data()`
    * `data.frame()`
    * `ls()`

3.  Add an escape hatch to `plyr::mutate()` by splitting it into two functions.
    One function should capture the unevaluated inputs, and the other should
    take a data frame and list of expressions and perform the computation.

4.  What's the escape hatch for `ggplot::aes()`? What about `plyr::.()`?
    What do they have in common? What are the advantages and disadvantages
    of their differences?

5.  The version of `subset2_q()` I presented is actually somewhat simplified.
    Why is the following version better?

    ```{r}
    subset2_q <- function(x, condition, env = parent.frame()) {
      r <- eval(condition, x, env)
      x[r, ]
    }
    ```

    Rewrite `subset2()` and `subscramble()` to use this improved version.

## Substitute

Most functions that use non-standard evaluation provide an escape hatch. But what happens if you want to call a function without one? For example, imagine you want to create a lattice graphic given the names of two variables:

```{r, error = TRUE}
library(lattice)
xyplot(mpg ~ disp, data = mtcars)

x <- quote(mpg)
y <- quote(disp)
xyplot(x ~ y, data = mtcars)
```

We can turn to `substitute()` and use it for another purpose: to modify an expression. Unfortunately `substitute()` has a feature that makes modifying calls interactively a bit of a pain: it never does substitutions when run from the global environment, and just behaves like `quote()`:

```{r, eval = FALSE}
a <- 1
b <- 2
substitute(a + b + z)
#> a + b + z
```

However, if you run it inside a function, `substitute()` substitutes what it can and leaves everything else as is:

```{r}
f <- function() {
  a <- 1
  b <- 2
  substitute(a + b + z)
}
f()
```

To make it easier to experiment with `substitute()`, `pryr` provides the `subs()` function.  It works exactly the same way as `substitute()` except it has a shorter name and it works in the global environment. Together, this makes it much easier to experiment with substitution:

```{r}
a <- 1
b <- 2
subs(a + b + z)
```

The second argument (to both `subs()` and `substitute()`) can override the use of the current environment, and provide an alternative list of name-value pairs to use. The following example uses that technique to show some variations on substituting a string, variable name or function call:

```{r}
subs(a + b, list(a = "y"))
subs(a + b, list(a = quote(y)))
subs(a + b, list(a = quote(y())))
```

Remember that every action in R is a function call, so we can also replace `+` with another function:

```{r}
subs(a + b, list("+" = quote(f)))
subs(a + b, list("+" = quote(`*`)))
```

It's quite possible to make nonsense commands with `substitute()`:

```{r}
subs(y <- y + 1, list(y = 1))
```

You can also use `substitute()` to insert arbitrary objects into an expression, but this is a bad idea. In the example below, the expression doesn't print correctly, but it returns the correct result when we evaluate it:

```{r}
df <- data.frame(x = 1)
x <- subs(class(df))
x
eval(x)
```

Formally, substitution takes place by examining each object name in the expression. If the name is:

* an ordinary variable, it's replaced by the value of the variable.

* a promise (a function argument), it's replaced by the expression associated
  with the promise.

* `...`, it's replaced by the contents of `...`

Otherwise it's left as is.

We can use this to create the right call to `xyplot()`:

```{r}
x <- quote(mpg)
y <- quote(disp)
subs(xyplot(x ~ y, data = mtcars))
```

It's even simpler inside a function, because we don't need to explicitly quote the x and y variables. Following the rules above, `substitute()` replaces named arguments with their expressions, not their values:

```{r}
xyplot2 <- function(x, y, data = data) {
  substitute(xyplot(x ~ y, data = data))
}
xyplot2(mpg, disp, data = mtcars)
```

If we include `...` in the call to substitute, we can add additional arguments to the call:

```{r}
xyplot3 <- function(x, y, ...) {
  substitute(xyplot(x ~ y, ...))
}
xyplot3(mpg, disp, data = mtcars, col = "red", aspect = "xy")
```

### Non-standard evaluation in substitute

`substitute()` is itself a function that uses non-standard evaluation, but doesn't have an escape hatch. For example, we can't use `substitute()` if we already have an expression saved in a variable:

```{r}
x <- quote(a + b)
substitute(x, list(a = 1, b = 2))
```

Although `substitute()` doesn't have a built-in escape hatch, so we can use `substitute()` itself to create one:

```{r}
substitute2 <- function(x, env) {
  call <- substitute(substitute(y, env), list(y = x))
  eval(call)
}

x <- quote(a + b)
substitute2(x, list(a = 1, b = 2))
```

The implementation of `substitute2` is short, but deep. Let's work through the example above: `substitute2(x, list(a = 1, b = 2))`.  It's a little tricky because of `substitute()`'s non-standard evaluation rules, we can't use the usual technique of working through the parentheses inside-out.

1.  First `substitute(substitute(y, env), list(y = x))` is evaluated.
    The expression `substitute(y, env)` is captured and `y` is replaced by the
    value of `x`. Because we've put `x` inside a list, it will be evaluated and
    the rules of substitute will replace `y` with it's value. This yields the
    expression `substitute(a + b, env)`

2.  Next we evaluate that expression inside the current function.
    `substitute()` specially evaluates its first argument, and looks for name
    value pairs in `env`, which evaluates to `list(a = 1, b = 2)`. Those are
    both values (not promises) so the result will be `1 + 2`

### Capturing unevaluated ... {#capturing-dots}

Another useful technique is to capture all of the unevaluated expressions in `...`.  Base R functions do this in many ways, but there's one technique that works well in a wide variety of situations:

```{r}
dots <- function(...) {
  eval(substitute(alist(...)))
}
```

This uses the `alist()` function which simply captures all its arguments. This function is the same as `pryr::dots()`. Pryr also provides `pryr::named_dots()`, which ensures all arguments are named, using deparsed expressions as default names, just like `data.frame()`.

### Exercises

1.  Use `subs()` convert the LHS to the RHS for each of the following pairs:
    * `a + b + c` -> `a * b * c`
    * `f(g(a, b), c)` -> `(a + b) * c`
    * `f(a < b, c, d)` -> `if (a < b) c else d`

2.  For each of the following pairs of expressions, describe why you can't
    use `subs()` to convert between them.
    * `a + b + c` -> `a + b * c`
    * `f(a, b)` -> `f(a, b, c)`
    * `f(a, b, c)` -> `f(a, b)`

3.  How does `pryr::named_dots()` work? Read the source.

## The downsides of non-standard evaluation {#nse-downsides}

A big downside of non-standard evaluation is that it is not [referentially transparent](http://en.wikipedia.org/wiki/Referential_transparency_(computer_science)). A function is __referentially transparent__ if you can replace its arguments with their values and behaviour doesn't change. For example, if a function `f()` referentially transparent, and both `x` and `y` are 10, then both `f(x)` and `f(y)` evaluate to the same result, which will be same as `f(10)`. Referentially transparent code is easier to reason about because names of objects don't matter, and you can always work from the most inner parenthesese outwards.

There are many important functions that by their very nature are not referentially transparent. Take the assignment operator. You can't take `a <- 1` and replace `a` by its value and get the same behaviour. This is one reason that people usually write assignments at the top-level of functions. It's hard to reason about code like this:

```{r}
a <- 1
b <- 2
if ((a <- a + 1) > (b <- b - 1)) {
  b <- b + 2
}
```

Using NSE automatically prevents a function from being referentially transparent. This makes the mental model needed to correctly predict the output much more complicated, so it's only worthwhile to use NSE if there is significant gain. For example, `library()` and `require()` allow you to call them either with or without quotes, because internally they use `deparse(substitute(x))` plus a couple of other tricks. That means that these two lines do exactly the same thing:

```{r, eval = FALSE}
library(ggplot2)
library("ggplot2")
```

However, things start to get complicated if the variable is associated with a value. What package will this load?

```{r, eval = FALSE}
ggplot2 <- "plyr"
library(ggplot2)
```

There are a number of other R functions that work in this way, like `ls()`, `rm()`, `data()`, `demo()`, `example()` and `vignette()`. To me, eliminating two keystrokes is not worth the loss of referential transparency, and I don't recommend you use NSE for this purpose.

One situtation where non-standard evaluation is more useful is `data.frame()`. It uses the input to automatically name the output variables if not explicitly supplied:

```{r}
x <- 10
y <- "a"
df <- data.frame(x, y)
names(df)
```

I think it is worthwhile in `data.frame()` because it eliminates a lot of redundancy in the common scenario when you're creating a data frame from existing variables, and importantly, it's easy to override this behaviour by supplying names for each variable.

Non-standard evaluation allows you to write functions that are extremely powerful, but the lack of referential transparency makes it harder to model the behaviour of a function, and makes it harder to program with. As well as always providing an escape hatch that gets back to standard evaluation, carefully consider both the benefits and costs of NSE before using it in a new domain.

### Exercises

1.  What does the following function do? What's the escape hatch?
    Do you think that this an appropriate use of NSE?

    ```{r}
    nl <- function(...) {
      dots <- named_dots(...)
      lapply(dots, eval, parent.frame())
    }
    ```

2.  Instead of relying on promises, you can use formulas created with `~`
    to explicitly capture an expression and its environment. What are the
    advantages and disadvantages of making quoting explicit? How does it
    impact referential transparency?

3.  Read the [standard non-standard evaluation rules]
    (http://developer.r-project.org/nonstandard-eval.pdf).