-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathtest_specs.qmd
548 lines (381 loc) · 23.4 KB
/
test_specs.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
# Specifications {#sec-tests-specs}
<!---
https://jakubsob.github.io/blog/
https://www.freecodecamp.org/news/clean-coding-for-beginners/
-->
```{r}
#| eval: true
#| echo: false
#| include: false
source("_common.R")
library(testthat)
library(withr)
library(logger)
```
```{r}
#| label: co_box_tldr
#| echo: false
#| results: asis
#| eval: true
co_box(
color = "b",
look = "default", hsize = "1.10", size = "1.05",
header = "TLDR Test Specifications", fold = TRUE,
contents = "
<br>
<h5>**Specifications, Requirements and Features**</h5>
- **User Specifications**: Describe what the end-users expect the application to accomplish\n
- **Features**: User-focused descriptions of the high-level capabilities or characteristics of an application, and often represent a bundle of smaller functionalities\n
- **Functional Requirements**: The precise, measurable, and testable 'functions' an application should perform\n
- **BDD Features and Scenarios:** Descriptions of the expected behavior of the application features using a set of special keywords (`Feature`, `Scenario`, `Given`, `When`, `Then`, `And`, `But`).\n
- `describe()`: *'specifies a larger component or function and contains a set of specifications'*. Include feature descriptions and any relevant background information the `describe()` blocks\n
- `it()`: contains the test code or expectations. These are use to test each functional requirement (or `Then` statements from scenarios).\n
- **Traceability matrix:** A table that ‘traces’ the user specifications to features and functional requirements (and the tests they give rise to) to verify that the application has been developed correctly.\n
"
)
```
---
This chapter focuses on what to test--or specifically, ***how to figure out what to test***. This process usually involves converting a list of user needs into testable requirements. The following chapters in this section have examples of tests, but don't go into depth on *how* to write tests. Excellent resources exist for writing unit tests in R packages [^tests-intro-unit-tests] and testing shiny apps. [^tests-intro-testing] [^tests-intro-testserver] The goal of this chapter is to illustrate the connections between the user's needs, the code below `R/`, and developing tests.
[^tests-intro-unit-tests]: Unit tests are covered extensively in [R Packages, 2ed](https://r-pkgs.org/testing-basics.html) and the [`testthat` documentation](https://testthat.r-lib.org/index.html)
[^tests-intro-testing]: Mastering shiny dedicates an entire [Chapter to Testing](https://mastering-shiny.org/scaling-testing.html).) `shinytest2` also has [excellent documentation](https://rstudio.github.io/shinytest2/) (and [videos](https://www.youtube.com/watch?v=Gucwz865aqQ)), and I highly recommend reading through those resources.
[^tests-intro-testserver]: I *will* cover a few tips and tricks I've learned for testing module server functions with `testServer()` because they're not in the [documentation](https://shiny.posit.co/r/articles/improve/server-function-testing/).
:::: {.callout-tip collapse='true' appearance='default'}
## [Accessing the code examples]{style='font-weight: bold; font-size: 1.15em;'}
::: {style='font-size: 0.95em; color: #282b2d;'}
I've created the [`shinypak` R package](https://mjfrigaard.github.io/shinypak/) In an effort to make each section accessible and easy to follow:
Install `shinypak` using `pak` (or `remotes`):
```{r}
#| code-fold: false
#| message: false
#| warning: false
#| eval: false
# install.packages('pak')
pak::pak('mjfrigaard/shinypak')
```
Review the chapters in each section:
```{r}
#| code-fold: false
#| message: false
#| warning: false
#| collapse: true
library(shinypak)
list_apps(regex = '^11')
```
Launch an app:
```{r}
#| code-fold: false
#| eval: false
launch(app = "11_tests-specs")
```
:::
::::
```{r}
#| eval: false
#| code-fold: false
install.packages("testthat")
library(testthat)
```
(*If you're using `devtools`, you won't have to worry about installing `testthat` and `covr`*)
## Application requirements {#sec-tests-specs-app-reqs}
Information about the various tasks and activities an application is expected to perform in typically stored in some kind of software requirements specification (SRS) document.[^tests-srs] The SRS can include things like distinct design characteristics, budgetary, technological, or timeline restrictions, etc. This document breaks down an application's intended purpose (i.e., the problem it's designed to solve) into three general areas: **user specifications**, **features**, and **functional requirements**.
[^tests-srs]: Read more about what goes in the [Software Requirements Specification](https://en.wikipedia.org/wiki/Software_requirements_specification)
### User Specifications {#sec-tests-specs-user-specs}
```{r}
#| label: git_box_11_tests-specs
#| echo: false
#| results: asis
#| eval: true
git_margin_box(
contents = "launch",
fig_pw = '65%',
branch = "11_tests-specs",
repo = 'sap')
```
The user specifications are the goals and objectives stakeholders want to achieve with the application. They use terms like 'deliver value' and 'provide insight' and provide the basis for deriving the application's features. [^tests-specs-user-specs]
[^tests-specs-user-specs]: User Specifications are sometimes referred to as "user stories," "use cases," or "general requirements"
```{r}
#| label: co_box_user_specs
#| echo: false
#| results: asis
#| eval: false
co_box(
color = "b", fold = TRUE,
look = "default", hsize = "1.10", size = "1.05",
header = "Example Specifications",
contents = "
**S1**: The application should source movie review data from platforms like IMDB or Rotten Tomatoes and should include variables like audience and critic reviews, number of votes, ratings, genres, etc.\n
**S2**: Users should be able to select variables from the movie review data and explore relationships between reviews, ratings, etc., in a visualization.\n
**S3**: The application should have controls for users to customize to colors and visual attributes of the visualization.
**S4**: Finally, users should also have the option to enter text and provide a custom title.\n
")
```
### Features {#sec-tests-specs-features}
Features translate the user expectations into an application capabilities. Generally speaking, features capture the tasks and activities user should "be able to" accomplish with the application (i.e., explore data with a graph).
```{r}
#| label: co_box_feat_reqs
#| echo: false
#| results: asis
#| eval: false
co_box(
color = "b", fold = TRUE,
look = "default", hsize = "1.10", size = "1.05",
header = "Example Features",
contents =
"
**F1.1**: The movie review data should contain continuous variables (e.g., critic and audience ratings) and categorical variables (e.g., mpaa, genres) from IMDB and Rotten Tomatoes.\n
**F1.2**: The UI should display dropdown menus for selecting continuous variables on both the x-axis and y-axis, and a dropdown menu categorical variable for coloration.\n
**F1.3**: Incorporate a sliders or similar controls for adjusting graph aesthetics, such as size, shape, or opacity.\n
**F1.4**: Provide an input text field for users to enter an optional plot title.\n
")
```
### Functional Requirements {#sec-tests-specs-func-reqs}
Functional requirements are written for developers and provide the technical details on each feature. A single feature often gives rise to multiple functional requirements (these are where users needs come into direct contact with code)[^tests-specs-feat]
[^tests-specs-feat]: 'Features' and 'functional requirements' are sometimes used interchangeably, but they refer to different aspects of the application. **Features** are high-level capabilities an application *should* have, and often contain a collection of smaller functionalities (broken down into the specific **functional requirements**).
```{r}
#| label: co_box_fun_reqs
#| echo: false
#| results: asis
#| eval: false
co_box(
color = "b", fold = TRUE,
look = "default", hsize = "1.10", size = "1.05",
header = "Example functional requirements",
contents =
"
**FR 1.1**: The application data should be sourced from IMDB and Rotten Tomatoes (accessed from appropriate APIs) and contain both continuous and categorical variables stored in a tabular format.\n
**FR 1.2**: The application should display dropdown menus listing available continuous variables for x-axis and y-axis selection.\n
**FR 1.3**: Upon user selection of x-axis and y-axis variables, the application should update the graph to reflect the user's choices.\n
**FR 1.4**: The application should present a dropdown menu listing available categorical variables for coloration.\n
**FR 1.5**: The application should dynamically change the color of points in the graph based on the user's categorical variable selection.\n
**FR 1.6**: The app should provide a slider control that allows users to adjust size, ranging from a pre-defined minimum to a maximum size.\n
**FR 1.7**: The app should offer an opacity control, letting users adjust transparency from fully opaque to fully transparent.\n
**FR 1.8**: The app should display an input field where users can type in a custom title for the graph.
**FR 1.9**: Upon user entry or modification in the title input text field, the app should update the graphs's title in real-time.
")
```
#### In summary {.unnumbered}
The areas above help direct the development process, albeit from slightly different perspectives.
1. The user specifications capture the needs and expectations of the end-user.
2. The features describe the high-level capabilities of the application.
3. Functional requirements are the testable, specific actions, inputs, and outputs.
## Application developemnt
The Shiny application development process follows something like the figure below:
![General application development process](images/tests_standard_flow.png){width='80%'}
The figure above is an oversimplification, but it highlights a common separation (or 'hand-off') between users/stakeholders and developers. In the sections below, we'll look at two common development processes: test-driven and behavior-driven development.
### Test-driven development {.unnumbered}
If `sap` was built using test-driven development (TDD), the process might look something like this:
1. Gather user needs and translate into application features:
a. Document the application's capabilities for exploring movie review variables from IMDB and Rotten Tomatoes.
b. Include feature descriptions for displaying continuous variables (i.e., 'critics score' and 'audience score') categorical variables (i.e., 'MPAA' ), graph visual attributes (size, color, opacity), and an optional plot title.
2. Write Tests:
a. Write tests to ensure the graph displays relationships between a set of continuous and categorical variables when the app launches.
3. Run Tests:
a. Before writing any code, these tests will fail.
4. Develop Features:
a. Write UI, server, module, and utility functions for user inputs and graph outputs.
5. Rerun Tests:
a. If the graph has been correctly implemented in the application, the tests should pass.
6. Write more Tests:
a. Add more tests for additional functionalities (e.g., an option to remove missing values from graph).
Starting with tests and writing just enough code to get them to pass often results in developing less (but better) code. The drawback to this approach is a strict focus on the function being tested and not the overall objective of the application.
### Behavior-driven development {.unnumbered}
In behavior-driven development (BDD) (or behavior-driven testing), users and developers work together to understand, define and express application behaviors in non-technical language, [^tests-bdd-readmore]
> "*Using conversation and examples to specify how you expect a system to behave is a core part of BDD*" - [BDD in Action, 2ed](https://www.manning.com/books/bdd-in-action-second-edition)
Placing an emphasis on writing human-readable expectations for the application's behaviors makes it easier to develop tests that can focus on verifying each user need exists (and is functioning properly). In BDD, the application's expected capabilities are captured in `Feature`s and illustrated with concrete examples, or `Scenario`s.
[^tests-bdd-readmore]: Read more about [behavior-driven development](https://en.wikipedia.org/wiki/Behavior-driven_development)
#### [`Feature`]{style="font-size: 0.90em;"} {.unnumbered}
In BDD, a `Feature` describes an *implemented behavior or capability* in the application, from a user's perspective. Typically, these are written in the Gherkin format using specific keywords:[^tests-gherkin]
1. `As a ...`
2. `I want ...`
3. `So that ...`
[^tests-gherkin]: [Gherkin](https://en.wikipedia.org/wiki/Cucumber_(software)#Gherkin_language) is the domain-specific language format used for expressing software behaviors. Tools like [Cucumber](https://cucumber.io/docs/gherkin/reference/) or [SpecFlow](https://specflow.org/learn/gherkin/) maps and executes the Gherkin descriptions against the code to generate a pass/fail report status for each requirement.
Below is an example Gherkin `Feature` for the graph in `launch_app()`:
```{verbatim}
#| eval: false
#| code-fold: false
Feature: Visualization
As a user
I want to see the changes in the plot
So that I can visualize the impact of my customizations
```
As you can see, the feature uses plain language and the wording is user-centric, so it remains accessible to *both* developers and users (or other non-technical stakeholders).
#### [`Scenario`]{style="font-size: 0.90em;"} {.unnumbered}
A Gherkin `Scenario` provides a concrete example of how the `Feature` works and has the following general format:
1. `Given ...`
2. `When ...`
3. `Then ...`
An example `Scenario` for `launch_app()` might be:
```{verbatim}
#| eval: false
#| code-fold: false
Scenario: Viewing the Data Visualization
Given I have launched the application
And it contains movie review data from IMDB and Rotten Tomatoes
And the data contains variables like 'Critics Score' and 'MPAA'
When I interact with the controls in the sidebar panel
Then the graph should update with the selected options
```
#### [`Background`]{style="font-size: 0.90em;"} {.unnumbered}
Instead of repeating any pre-conditions in each `Scenario` (i.e., the steps contained in the "*`Given`*" and first "*`And`*" statement), we can establish the context with a `Background`:
```{verbatim}
#| eval: false
#| code-fold: false
Background: Launching the application
Given I have launched the application
And it loads with movie review data from IMDB and Rotten Tomatoes
Scenario: Viewing the Data Visualization
Given the data contains variables like 'Critics Score' and 'MPAA'
When I interact with the controls in the sidebar panel
Then the graph should update with the selected options
```
Adopting the Gherkin format (or something similar) provides a common language to express an application's behavior:
1. As developers, we can work with users and shareholders to write specifications that describe the expected behavior of each `Feature`
2. When developing tests, we can group the tests by their `Feature` and `Scenario`s
3. Each test can execute a step (i.e., the `Then` statements).
In the next section we'll cover how to map test code for each `Scenario` step with `testthat`.
## BDD and `testthat` {#sec-tests-specs-bdd-testthat}
`testthat`'s BDD functions (`describe()` and `it()`) allow us add Gherkin-style features and scenarios to our test files, ensuring the application remains user-centric while meeting the technical specifications.[^tests-bdd-describe]
### Describe the feature
We can use the language from our `Feature`, `Background`, and `Scenario` to in the `description` in the argument of `describe()`:
```{r}
#| eval: false
#| code-fold: false
testthat::describe(
description = "Feature: Visualization
As a user
I want to see the changes in the plot
So that I can visualize the impact of my customizations",
code = {
})
```
We can also nest `describe()` calls, which means we can include the `Background` (or other relevant information):
```{r}
#| eval: false
#| code-fold: false
describe( # <1>
"Feature: Visualization
As a user
I want to see the changes in the graph
So that I can visualize the impact of my customizations.",
code = { # <1>
describe(# <2>
"Background: Launching the application
Given I have launched the application
And it loads with movie review data from IMDB and Rotten Tomatoes",
code = {
}) # <2>
}) # <1>
```
1. BDD Feature (**title and description**)
2. Background (preexisting conditions **before each scenario**)
### Confirm it with a test
Inside `describe()`, we can include multiple `it()` blocks which "*functions as a test and is evaluated in its own environment.*"
In the example below, we'll use an `it()` block to test the example scenario from above:[^tests-it-blocks]
```{r}
#| eval: false
#| code-fold: false
testthat::describe( # <1>
"Feature: Visualization
As a user
I want to see the changes in the graph
So that I can visualize the impact of my customizations.",
code = { # <1>
testthat::describe(# <2>
"Background: Launching the application
Given I have launched the application
And it loads with movie review data from IMDB and Rotten Tomatoes",
code = { # <2>
testthat::it( # <3>
"Scenario: Viewing the Data Visualization
Given the data contains variables like 'Critics Score' and 'MPAA'
When I interact with the controls in the sidebar panel
Then the graph should update with the selected options", # <3>
code = { # <3>
# test code # <4>
}) # <3>
}) # <2>
}) # <1>
```
1. BDD Feature (**title and description**)
2. Background (preexisting conditions **before each scenario**)
3. Scenario (a **concrete examples that illustrates a feature**)
4. Test code
In the scenario above, `Then` contains the information required for the `testthat` expectation. This could be `expect_snapshot_file()` or `vdiffr::expect_doppelganger()`--whichever makes sense from the user's perspective.
These are generic examples, but hopefully the tests in the upcoming chapters convey how helpful and expressive BDD functions can be (or they inspire you to properly implement what I'm attempting to do in your own app-packages).[^tests-tdd-bdd]
[^tests-tdd-bdd]: For an excellent description on the relationships between behavior-driven development, test-driven development, and domain-driven design, I highly recommend [BDD in Action, 2ed](https://www.manning.com/books/bdd-in-action-second-edition) by John Ferguson Smart and Jan Molack.
[^tests-bdd-describe]: Read more about `describe()` and `it()` in the [`testthat` documentation.](https://testthat.r-lib.org/reference/describe.html) and in the [appendix.](bdd.qmd)
[^tests-it-blocks]: Each [`it()`](https://testthat.r-lib.org/reference/describe.html) block contains the expectations (or what you would traditionally include in `test_that()`).
## Traceability Matrix {#sec-tests-specs-traceability-matrix}
After translating the user needs into functional requirements, we can identify what needs to be tested by building a look-up table (i.e., a matrix).
I like to store early drafts of the requirements and traceability matrix in a vignette:[^tests-vignette-matrix]
```{r}
#| eval: false
#| code-fold: false
usethis::use_vignette("specs")
```
Adding our first vignette to the `vignettes/` folder does the following:
1. Adds the `knitr` and `rmarkdown` packages to the `Suggests` field in `DESCRIPTION`[^test-specs-suggests]
``` sh
Suggests:
knitr,
rmarkdown
```
2. Adds `knitr` to the `VignetteBuilder` field[^test-specs-vignette-builder]
``` sh
VignetteBuilder: knitr
```
3. Adds `inst/doc` to `.gitignore` and `*.html`, `*.R` to `vignettes/.gitignore`
[^test-specs-suggests]: We briefly covered the `Suggests` field in [Dependencies](dependencies.qmd), but in this case it specifically applies to "*packages that are not necessarily needed. This includes packages used only in examples, tests or vignettes...*" - [Writing R Extensions, Package Dependencies](https://cran.r-project.org/doc/manuals/R-exts.html#Package-Dependencies)
[^test-specs-vignette-builder]: The [documentation](https://cran.r-project.org/doc/manuals/R-exts.html#The-DESCRIPTION-file) on `VignetteBuilder` gives a great example of why `knitr` and `rmarkdown` belong in `Suggests` and not `Imports`.
The first column in the traceability matrix contains the user specifications, which we can 'trace' over to the functional requirements and their relevant tests.[^tests-visual-markdown]
[^tests-visual-markdown]: When building tables in vignettes, I highly recommend using the [Visual Markdown mode](https://rstudio.github.io/visual-markdown-editing/).
{{< include _trace_matrix.qmd >}}
Building a traceability matrix ensures:
1. All user specifications have accompanying application features.
2. Each feature has been broken down into precise, measurable, and testable functional requirements.
3. Tests have been written for each functional requirement.
[^tests-vignette-matrix]: Storing the traceability matrix in a vignette is great for developers, but using an issue-tracking system with version control is also a good idea, like GitHub Projects or Azure DevOps.
## Recap {.unnumbered}
```{r}
#| label: git_box_11_tests-specs_launch
#| echo: false
#| results: asis
#| eval: true
git_margin_box(contents = "launch",
fig_pw = '65%',
branch = "11_tests-specs",
repo = 'sap')
```
Understanding the relationship between user specifications, features, and functional requirements gives us the information we need to build applications that satisfy the technical standards while addressing user needs. Documenting requirements in Gherkin-style features and scenarios allows us to capture the application's behavior without giving details on how the functionality is implemented.
In the next chapter, we're going to cover various tools to improve the tests in your app-package. The overarching goal of these tools is to reduce code executed *outside* of your tests (i.e., placed above the call to `test_that()` or `it()`).
```{r}
#| label: co_box_app_recap
#| echo: false
#| results: asis
#| eval: true
co_box(
color = "g",
look = "default", hsize = "1.10", size = "1.05",
header = "Recap: Test Specifications",
contents = "
<br>
**Specifications**
- Scoping tests: user specifications outline software goals and needs, and the functional requirements provide the technical details to achieve them.
- User specifications: descriptions of what a user expects the application to do (i.e., the user 'wish list' of features they want in the application).
- Features: detailed list of the main capabilities and functions the application needs to offer to users.
- Functional requierments: testable, specific step-by-step instructions for ensuring the application does what it's supposed to do.
- Traceability matrix: tracking tool for connecting the users 'wish list' (i.e, specifications) to what's being tested.
",
fold = FALSE
)
```
```{r}
#| label: git_contrib_box
#| echo: false
#| results: asis
#| eval: true
git_contrib_box()
```
<!--
traceability matrix and behavior-driven development (BDD) to ensure the user's specifications are met (and the app's features are implemented correctly).
-->