
# Specifications {#sec-tests-specs}

<!---
https://jakubsob.github.io/blog/

https://www.freecodecamp.org/news/clean-coding-for-beginners/
-->

```{r}
#| eval: true 
#| echo: false 
#| include: false
source("_common.R")
library(testthat)
library(withr)
library(logger)
```


```{r}
#| label: co_box_tldr
#| echo: false
#| results: asis
#| eval: true
co_box(
  color = "b",
  look = "default", hsize = "1.10", size = "1.05",
  header = "TLDR Test Specifications", fold = TRUE,
  contents = "

<br>
  
<h5>**Specifications, Requirements and Features**</h5>
  
- **User Specifications**: Describe what the end-users expect the application to accomplish\n
- **Features**: User-focused descriptions of the high-level capabilities or characteristics of an application, and often represent a bundle of smaller functionalities\n
- **Functional Requirements**: The precise, measurable, and testable 'functions' an application should perform\n
- **BDD Features and Scenarios:** Descriptions of the expected behavior of the application features using a set of special keywords (`Feature`, `Scenario`, `Given`, `When`, `Then`, `And`, `But`).\n
    - `describe()`: *'specifies a larger component or function and contains a set of specifications'*. Include feature descriptions and any relevant background information in the `describe()` blocks\n
    - `it()`: contains the test code or expectations. These are used to test each functional requirement (or `Then` statements from scenarios).\n

- **Traceability matrix:** A table that ‘traces’ the user specifications to features and functional requirements (and the tests they give rise to) to verify that the application has been developed correctly.\n
  
  "
)
```

---

This chapter focuses on what to test--or specifically, ***how to figure out what to test***. This process usually involves converting a list of user needs into testable requirements. The following chapters in this section have examples of tests, but don't go into depth on *how* to write tests. Excellent resources exist for writing unit tests in R packages [^tests-intro-unit-tests] and testing shiny apps. [^tests-intro-testing] [^tests-intro-testserver] The goal of this chapter is to illustrate the connections between the user's needs, the code below `R/`, and developing tests.


[^tests-intro-unit-tests]: Unit tests are covered extensively in [R Packages, 2ed](https://r-pkgs.org/testing-basics.html) and the [`testthat` documentation](https://testthat.r-lib.org/index.html)

[^tests-intro-testing]: Mastering Shiny dedicates an entire [chapter to testing](https://mastering-shiny.org/scaling-testing.html). `shinytest2` also has [excellent documentation](https://rstudio.github.io/shinytest2/) (and [videos](https://www.youtube.com/watch?v=Gucwz865aqQ)), and I highly recommend reading through those resources.

[^tests-intro-testserver]: I *will* cover a few tips and tricks I've learned for testing module server functions with `testServer()` because they're not in the  [documentation](https://shiny.posit.co/r/articles/improve/server-function-testing/).

:::: {.callout-tip collapse='true' appearance='default'}

## [Accessing the code examples]{style='font-weight: bold; font-size: 1.15em;'}

::: {style='font-size: 0.95em; color: #282b2d;'}

I've created the [`shinypak` R package](https://mjfrigaard.github.io/shinypak/) in an effort to make each section accessible and easy to follow.
  
Install `shinypak` using `pak` (or `remotes`):

```{r}
#| code-fold: false 
#| message: false
#| warning: false
#| eval: false
# install.packages('pak')
pak::pak('mjfrigaard/shinypak')
```

Review the chapters in each section:
  
```{r}
#| code-fold: false 
#| message: false
#| warning: false
#| collapse: true
library(shinypak)
list_apps(regex = '^11')
```

Launch an app: 

```{r}
#| code-fold: false 
#| eval: false
launch(app = "11_tests-specs")
```

::: 

::::


```{r}
#| eval: false
#| code-fold: false
install.packages("testthat")
library(testthat)
```

(*If you're using `devtools`, you won't have to worry about installing `testthat` and `covr`*)


## Application requirements {#sec-tests-specs-app-reqs}

Information about the various tasks and activities an application is expected to perform is typically stored in some kind of software requirements specification (SRS) document.[^tests-srs] The SRS can include things like distinct design characteristics and budgetary, technological, or timeline restrictions. This document breaks down an application's intended purpose (i.e., the problem it's designed to solve) into three general areas: **user specifications**, **features**, and **functional requirements**. 

[^tests-srs]: Read more about what goes in the [Software Requirements Specification](https://en.wikipedia.org/wiki/Software_requirements_specification)

### User Specifications {#sec-tests-specs-user-specs}

```{r}
#| label: git_box_11_tests-specs
#| echo: false
#| results: asis
#| eval: true
git_margin_box(
  contents = "launch",
  fig_pw = '65%', 
  branch = "11_tests-specs", 
  repo = 'sap')
```

The user specifications are the goals and objectives stakeholders want to achieve with the application. They use terms like 'deliver value' and 'provide insight' and provide the basis for deriving the application's features. [^tests-specs-user-specs]

[^tests-specs-user-specs]: User Specifications are sometimes referred to as "user stories," "use cases," or "general requirements"

```{r}
#| label: co_box_user_specs
#| echo: false
#| results: asis
#| eval: false
co_box(
  color = "b", fold = TRUE, 
  look = "default", hsize = "1.10", size = "1.05",
  header = "Example Specifications",
  contents = "
**S1**: The application should source movie review data from platforms like IMDB or Rotten Tomatoes and should include variables like audience and critic reviews, number of votes, ratings, genres, etc.\n
  
**S2**: Users should be able to select variables from the movie review data and explore relationships between reviews, ratings, etc., in a visualization.\n
  
**S3**: The application should have controls for users to customize the colors and visual attributes of the visualization.\n
  
**S4**: Finally, users should also have the option to enter text and provide a custom title.\n
  
  ")
```

### Features {#sec-tests-specs-features}

Features translate the user expectations into application capabilities. Generally speaking, features capture the tasks and activities users should "be able to" accomplish with the application (e.g., explore data with a graph).

```{r}
#| label: co_box_feat_reqs
#| echo: false
#| results: asis
#| eval: false
co_box(
  color = "b", fold = TRUE,
  look = "default", hsize = "1.10", size = "1.05",
  header = "Example Features",
  contents = 
"
**F1.1**: The movie review data should contain continuous variables (e.g., critic and audience ratings) and categorical variables (e.g., mpaa, genres) from IMDB and Rotten Tomatoes.\n
  
**F1.2**: The UI should display dropdown menus for selecting continuous variables on both the x-axis and y-axis, and a dropdown menu for selecting a categorical variable for coloration.\n
  
**F1.3**: Incorporate sliders or similar controls for adjusting graph aesthetics, such as size, shape, or opacity.\n
  
**F1.4**: Provide an input text field for users to enter an optional plot title.\n
  ")
```

### Functional Requirements {#sec-tests-specs-func-reqs}

Functional requirements are written for developers and provide the technical details on each feature. A single feature often gives rise to multiple functional requirements (these are where user needs come into direct contact with code).[^tests-specs-feat]

[^tests-specs-feat]: 'Features' and 'functional requirements' are sometimes used interchangeably, but they refer to different aspects of the application. **Features** are high-level capabilities an application *should* have, and often contain a collection of smaller functionalities (broken down into the specific **functional requirements**).

```{r}
#| label: co_box_fun_reqs
#| echo: false
#| results: asis
#| eval: false
co_box(
  color = "b", fold = TRUE, 
  look = "default", hsize = "1.10", size = "1.05",
  header = "Example functional requirements",
  contents = 
" 
**FR 1.1**: The application data should be sourced from IMDB and Rotten Tomatoes (accessed from appropriate APIs) and contain both continuous and categorical variables stored in a tabular format.\n
  
**FR 1.2**: The application should display dropdown menus listing available continuous variables for x-axis and y-axis selection.\n
  
**FR 1.3**: Upon user selection of x-axis and y-axis variables, the application should update the graph to reflect the user's choices.\n
  
**FR 1.4**: The application should present a dropdown menu listing available categorical variables for coloration.\n
  
**FR 1.5**: The application should dynamically change the color of points in the graph based on the user's categorical variable selection.\n
  
**FR 1.6**: The app should provide a slider control that allows users to adjust size, ranging from a pre-defined minimum to a maximum size.\n
  
**FR 1.7**: The app should offer an opacity control, letting users adjust transparency from fully opaque to fully transparent.\n
  
**FR 1.8**: The app should display an input field where users can type in a custom title for the graph.
  
**FR 1.9**: Upon user entry or modification in the title input text field, the app should update the graph's title in real-time.
  
  ")
```

#### In summary {.unnumbered}

The areas above help direct the development process, albeit from slightly different perspectives.

1. The user specifications capture the needs and expectations of the end-user. 

2. The features describe the high-level capabilities of the application.

3. Functional requirements are the testable, specific actions, inputs, and outputs.

## Application development

The Shiny application development process follows something like the figure below:

![General application development process](images/tests_standard_flow.png){width='80%'}

The figure above is an oversimplification, but it highlights a common separation (or 'hand-off') between users/stakeholders and developers. In the sections below, we'll look at two common development processes: test-driven and behavior-driven development.

### Test-driven development {.unnumbered}

If `sap` was built using test-driven development (TDD), the process might look something like this:

1.  Gather user needs and translate into application features:
    a. Document the application's capabilities for exploring movie review variables from IMDB and Rotten Tomatoes. 
    b. Include feature descriptions for displaying continuous variables (e.g., 'critics score' and 'audience score'), categorical variables (e.g., 'MPAA'), graph visual attributes (size, color, opacity), and an optional plot title. 

2. Write Tests: 
    a. Write tests to ensure the graph displays relationships between a set of continuous and categorical variables when the app launches. 

3. Run Tests: 
    a. Before writing any code, these tests will fail. 
    
4. Develop Features: 
    a. Write UI, server, module, and utility functions for user inputs and graph outputs. 
    
5. Rerun Tests: 
    a. If the graph has been correctly implemented in the application, the tests should pass.
    
6. Write more Tests:  
    a. Add more tests for additional functionalities (e.g., an option to remove missing values from graph). 
    
Starting with tests and writing just enough code to get them to pass often results in developing less (but better) code. The drawback to this approach is a strict focus on the function being tested and not the overall objective of the application.
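
As a rough illustration of steps 2 through 5 in the list above, the sketch below writes a test before the function it exercises. Both `scatter_title()` and `check_title()` are hypothetical names invented for this example (they are not part of `sap`), and the expected title format is an assumption:

```{r}
#| eval: false
#| code-fold: false
library(testthat)

# Step 2: the test is written first and describes the behavior we want
# (check_title() and scatter_title() are hypothetical names)
check_title <- function() {
  test_that("title combines the selected variables", {
    expect_equal(
      scatter_title(x = "imdb_rating", y = "audience_score"),
      "Imdb rating vs. Audience score"
    )
  })
}

# Step 3: calling check_title() at this point would fail, because
# scatter_title() has not been written yet

# Step 4: write just enough code to satisfy the test
scatter_title <- function(x, y) {
  clean <- function(var) {
    # replace underscores with spaces and capitalize the first letter
    label <- gsub("_", " ", var)
    paste0(toupper(substr(label, 1, 1)), substr(label, 2, nchar(label)))
  }
  paste(clean(x), "vs.", clean(y))
}

# Step 5: rerun the test, which should now pass
check_title()
```

Wrapping the test in `check_title()` only delays its execution so the same test can be run before and after the function exists. In an app-package, the test would live in `tests/testthat/` and be run with `devtools::test()`.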

### Behavior-driven development {.unnumbered}

In behavior-driven development (BDD) (or behavior-driven testing), users and developers work together to understand, define, and express application behaviors in non-technical language.[^tests-bdd-readmore]

> "*Using conversation and examples to specify how you expect a system to behave is a core part of BDD*" - [BDD in Action, 2ed](https://www.manning.com/books/bdd-in-action-second-edition)

Placing an emphasis on writing human-readable expectations for the application's behaviors makes it easier to develop tests that can focus on verifying each user need exists (and is functioning properly). In BDD, the application's expected capabilities are captured in `Feature`s and illustrated with concrete examples, or `Scenario`s.

[^tests-bdd-readmore]: Read more about [behavior-driven development](https://en.wikipedia.org/wiki/Behavior-driven_development)

#### [`Feature`]{style="font-size: 0.90em;"} {.unnumbered}

In BDD, a `Feature` describes an *implemented behavior or capability* in the application, from a user's perspective. Typically, these are written in the Gherkin format using specific keywords:[^tests-gherkin]

1. `As a ...`   
2. `I want ...`    
3. `So that ...`   

[^tests-gherkin]: [Gherkin](https://en.wikipedia.org/wiki/Cucumber_(software)#Gherkin_language) is the domain-specific language format used for expressing software behaviors. Tools like [Cucumber](https://cucumber.io/docs/gherkin/reference/) or [SpecFlow](https://specflow.org/learn/gherkin/) map and execute the Gherkin descriptions against the code to generate a pass/fail status report for each requirement. 

Below is an example Gherkin `Feature` for the graph in `launch_app()`:

```{verbatim}
#| eval: false 
#| code-fold: false
Feature: Visualization
    As a user
    I want to see the changes in the plot
    So that I can visualize the impact of my customizations
```

As you can see, the feature uses plain language and the wording is user-centric, so it remains accessible to *both* developers and users (or other non-technical stakeholders).

#### [`Scenario`]{style="font-size: 0.90em;"} {.unnumbered}

A Gherkin `Scenario` provides a concrete example of how the `Feature` works and has the following general format: 

1. `Given ...`   
2. `When ...`    
3. `Then ...`  

An example `Scenario` for `launch_app()` might be:

```{verbatim}
#| eval: false 
#| code-fold: false
  Scenario: Viewing the Data Visualization
    Given I have launched the application
    And it contains movie review data from IMDB and Rotten Tomatoes
    And the data contains variables like 'Critics Score' and 'MPAA'
    When I interact with the controls in the sidebar panel
    Then the graph should update with the selected options
```

#### [`Background`]{style="font-size: 0.90em;"} {.unnumbered}

Instead of repeating any pre-conditions in each `Scenario` (i.e., the steps contained in the "*`Given`*" and first "*`And`*" statements), we can establish the context with a `Background`:

```{verbatim}
#| eval: false 
#| code-fold: false
  Background: Launching the application
    Given I have launched the application
    And it loads with movie review data from IMDB and Rotten Tomatoes
    
  Scenario: Viewing the Data Visualization
    Given the data contains variables like 'Critics Score' and 'MPAA'
    When I interact with the controls in the sidebar panel
    Then the graph should update with the selected options
```

Adopting the Gherkin format (or something similar) provides a common language to express an application's behavior: 

1. As developers, we can work with users and stakeholders to write specifications that describe the expected behavior of each `Feature` 

2. When developing tests, we can group the tests by their `Feature` and `Scenario`s

3. Each test can execute a step (i.e., the `Then` statements).

In the next section we'll cover how to map test code for each `Scenario` step with `testthat`.

## BDD and `testthat` {#sec-tests-specs-bdd-testthat}

`testthat`'s BDD functions (`describe()` and `it()`) allow us to add Gherkin-style features and scenarios to our test files, ensuring the application remains user-centric while meeting the technical specifications.[^tests-bdd-describe]

### Describe the feature 

We can use the language from our `Feature`, `Background`, and `Scenario` in the `description` argument of `describe()`:

```{r}
#| eval: false 
#| code-fold: false
testthat::describe(
  description = "Feature: Visualization
                   As a user
                   I want to see the changes in the plot
                   So that I can visualize the impact of my customizations",
  code = {
  
})
```

We can also nest `describe()` calls, which means we can include the `Background` (or other relevant information): 

```{r}
#| eval: false 
#| code-fold: false
describe( # <1>
  "Feature: Visualization
      As a user
      I want to see the changes in the graph
      So that I can visualize the impact of my customizations.", 
  code = { # <1>
    
  describe(# <2>
    "Background: Launching the application
        Given I have launched the application
        And it loads with movie review data from IMDB and Rotten Tomatoes", 
    code = {   
               
      })       # <2>
    
  }) # <1>
```
1. BDD Feature (**title and description**)    
2. Background (preexisting conditions **before each scenario**)  

### Confirm it with a test

Inside `describe()`, we can include multiple `it()` blocks, each of which "*functions as a test and is evaluated in its own environment.*" 

In the example below, we'll use an `it()` block to test the example scenario from above:[^tests-it-blocks]

```{r}
#| eval: false 
#| code-fold: false
testthat::describe( # <1>
  "Feature: Visualization
      As a user
      I want to see the changes in the graph
      So that I can visualize the impact of my customizations.", 
  code = { # <1>
  
  testthat::describe(# <2>
    "Background: Launching the application
        Given I have launched the application
        And it loads with movie review data from IMDB and Rotten Tomatoes",
      code = { # <2>
      
    testthat::it( # <3>
      "Scenario: Viewing the Data Visualization
         Given the data contains variables like 'Critics Score' and 'MPAA'
         When I interact with the controls in the sidebar panel
         Then the graph should update with the selected options", # <3>
        code = { # <3>
          # test code # <4>
        }) # <3>
      
    }) # <2>
  
}) # <1>
```
1. BDD Feature (**title and description**)    
2. Background (preexisting conditions **before each scenario**) 
3. Scenario (a **concrete example that illustrates a feature**)
4. Test code  

In the scenario above, `Then` contains the information required for the `testthat` expectation. This could be `expect_snapshot_file()` or `vdiffr::expect_doppelganger()`--whichever makes sense from the user's perspective.
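
To make the `# test code` placeholder concrete, here is a minimal sketch of the same structure with working expectations. `scatter_sketch()` is a hypothetical stand-in for the app's plotting utility, the built-in `iris` data stands in for the movie review data, and the expectations inspect the plotted layer with `ggplot2::layer_data()` rather than comparing snapshots:

```{r}
#| eval: false
#| code-fold: false
library(testthat)
library(ggplot2)

# hypothetical stand-in for the app's plotting utility (an assumption,
# not the function used in sap)
scatter_sketch <- function(df, x, y, color) {
  ggplot(df, aes(x = .data[[x]], y = .data[[y]], color = .data[[color]])) +
    geom_point()
}

describe(
  "Feature: Visualization
      As a user
      I want to see the changes in the graph
      So that I can visualize the impact of my customizations.",
  code = {
    it(
      "Scenario: Viewing the Data Visualization
         Given the data contains continuous and categorical variables
         When I select x, y, and color variables
         Then the graph should update with the selected options",
      code = {
        p <- scatter_sketch(iris,
          x = "Sepal.Length", y = "Sepal.Width", color = "Species")
        layer <- layer_data(p)
        # one point is drawn for each row of data
        expect_equal(nrow(layer), nrow(iris))
        # one color is used for each level of the categorical variable
        expect_equal(length(unique(layer$colour)), nlevels(iris$Species))
      })
  })
```

Once the graph's appearance is stable, a snapshot expectation such as `vdiffr::expect_doppelganger()` could replace these structural checks.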

These are generic examples, but hopefully the tests in the upcoming chapters convey how helpful and expressive BDD functions can be (or they inspire you to properly implement what I'm attempting to do in your own app-packages).[^tests-tdd-bdd]

[^tests-tdd-bdd]: For an excellent description of the relationships between behavior-driven development, test-driven development, and domain-driven design, I highly recommend [BDD in Action, 2ed](https://www.manning.com/books/bdd-in-action-second-edition) by John Ferguson Smart and Jan Molak.

[^tests-bdd-describe]: Read more about `describe()` and `it()` in the [`testthat` documentation](https://testthat.r-lib.org/reference/describe.html) and in the [appendix](bdd.qmd).

[^tests-it-blocks]: Each [`it()`](https://testthat.r-lib.org/reference/describe.html) block contains the expectations (or what you would traditionally include in `test_that()`).

## Traceability Matrix {#sec-tests-specs-traceability-matrix}

After translating the user needs into functional requirements, we can identify what needs to be tested by building a look-up table (i.e., a matrix).

I like to store early drafts of the requirements and traceability matrix in a vignette:[^tests-vignette-matrix]

```{r}
#| eval: false 
#| code-fold: false
usethis::use_vignette("specs")
```

Adding our first vignette to the `vignettes/` folder does the following:

1.   Adds the `knitr` and `rmarkdown` packages to the `Suggests` field in `DESCRIPTION`[^test-specs-suggests]

``` sh
Suggests: 
    knitr,
    rmarkdown
```

2.   Adds `knitr` to the `VignetteBuilder` field[^test-specs-vignette-builder]

``` sh
VignetteBuilder: knitr
```

3.   Adds `inst/doc` to `.gitignore` and `*.html`, `*.R` to `vignettes/.gitignore`

[^test-specs-suggests]: We briefly covered the `Suggests` field in [Dependencies](dependencies.qmd), but in this case it specifically applies to "*packages that are not necessarily needed. This includes packages used only in examples, tests or vignettes...*" - [Writing R Extensions, Package Dependencies](https://cran.r-project.org/doc/manuals/R-exts.html#Package-Dependencies)

[^test-specs-vignette-builder]: The [documentation](https://cran.r-project.org/doc/manuals/R-exts.html#The-DESCRIPTION-file) on `VignetteBuilder` gives a great example of why `knitr` and `rmarkdown` belong in `Suggests` and not `Imports`.

The first column in the traceability matrix contains the user specifications, which we can 'trace' over to the functional requirements and their relevant  tests.[^tests-visual-markdown]

[^tests-visual-markdown]: When building tables in vignettes, I highly recommend using the [Visual Markdown mode](https://rstudio.github.io/visual-markdown-editing/).

{{< include _trace_matrix.qmd >}}

Building a traceability matrix ensures:  

1. All user specifications have accompanying application features. 

2. Each feature has been broken down into precise, measurable, and testable functional requirements.

3. Tests have been written for each functional requirement.
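
The same checks can be run against the matrix itself. The sketch below stores a few rows in a data frame and lists the functional requirements that have no test yet. The rows, IDs, and test file names are placeholders made up for this illustration, not the contents of the actual matrix:

```{r}
#| eval: false
#| code-fold: false
# hypothetical traceability matrix: IDs follow the S / F / FR naming
# pattern, and the test file names are placeholders
trace_matrix <- data.frame(
  specification = c("S2", "S2", "S3", "S4"),
  feature = c("F1.2", "F1.2", "F1.3", "F1.4"),
  requirement = c("FR 1.2", "FR 1.3", "FR 1.6", "FR 1.8"),
  test = c("test-mod_var_input.R", "test-mod_scatter_display.R", NA, NA),
  stringsAsFactors = FALSE
)

# functional requirements that still need a test
untested <- trace_matrix[is.na(trace_matrix$test), "requirement"]
untested
```

Here `untested` contains `FR 1.6` and `FR 1.8`, the two rows with a missing test.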

[^tests-vignette-matrix]: Storing the traceability matrix in a vignette is great for developers, but using an issue-tracking system with version control is also a good idea, like GitHub Projects or Azure DevOps.

## Recap {.unnumbered}

```{r}
#| label: git_box_11_tests-specs_launch
#| echo: false
#| results: asis
#| eval: true
git_margin_box(contents = "launch",
  fig_pw = '65%', 
  branch = "11_tests-specs", 
  repo = 'sap')
```

Understanding the relationship between user specifications, features, and functional requirements gives us the information we need to build applications that satisfy the technical standards while addressing user needs. Documenting requirements in Gherkin-style features and scenarios allows us to capture the application's behavior without giving details on how the functionality is implemented.

In the next chapter, we're going to cover various tools to improve the tests in your app-package. The overarching goal of these tools is to reduce code executed *outside* of your tests (i.e., placed above the call to `test_that()` or `it()`).

```{r}
#| label: co_box_app_recap
#| echo: false
#| results: asis
#| eval: true
co_box(
  color = "g",
  look = "default", hsize = "1.10", size = "1.05",
  header = "Recap: Test Specifications",
  contents = "
  
<br>
  
**Specifications**
  
  - Scoping tests: user specifications outline software goals and needs, and the functional requirements provide the technical details to achieve them. 
  
    - User specifications: descriptions of what a user expects the application to do (i.e., the user 'wish list' of features they want in the application).
  
    - Features: detailed list of the main capabilities and functions the application needs to offer to users.
  
    - Functional requirements: testable, specific step-by-step instructions for ensuring the application does what it's supposed to do.
  
    - Traceability matrix: tracking tool for connecting the user's 'wish list' (i.e., specifications) to what's being tested.
  
  ", 
  fold = FALSE
)
```


```{r}
#| label: git_contrib_box
#| echo: false
#| results: asis
#| eval: true
git_contrib_box()
```

<!--
traceability matrix and behavior-driven development (BDD) to ensure the user's specifications are met (and the app's features are implemented correctly).
-->