---
output:
  md_document:
    variant: gfm
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```
# Synopsis
Shinyngs is an R package designed to facilitate downstream analysis of RNA-seq and similar expression data with various exploratory plots and data mining tools. It is unrelated to the recently published [Shiny Transcriptome Analysis Resource Tool](https://github.com/jminnier/STARTapp) (START), though it was probably developed at the same time as that work.
# Examples
## Data structure
A companion R package, [zhangneurons](https://github.com/pinin4fjords/zhangneurons), contains an example dataset to illustrate the features of Shinyngs, as well as the code required to produce it.
## Running application
A Shinyngs example is running at https://pinin4fjords.shinyapps.io/shinyngs_example/ and contains a subset of the example data (due to limited resources on shinyapps.io).
# Rationale
Shinyngs differs from START and other similar applications (see also [Degust](http://www.vicbioinformatics.com/degust/)) in that no effort is made to provide analysis capabilities. The envisaged process is:
* RNA-seq data is analysed, producing a set of matrices, and p/q values generated for a given set of comparisons.
* Matrix and comparison data is loaded into the modified [SummarizedExperiment](http://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) structure provided by Shinyngs, and serialised. This is easily automated.
* The serialised object is used as input to automatically produce the Shiny app using Shinyngs.
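The last two steps can be sketched roughly as follows. This assumes `saveRDS()`/`readRDS()` as the serialisation mechanism (any R serialisation would do), uses the `zhangneurons` example object as a stand-in for one built from your own analysis, and the file name `eselist.rds` is arbitrary:

```r
library(shinyngs)
library(zhangneurons)

# Stand-in for an object built by your own analysis pipeline
data("zhangneurons")

# Serialise the object (file name is an arbitrary choice)
saveRDS(zhangneurons, "eselist.rds")

# Later, e.g. in the app script: read it back and build the app
eselist <- readRDS("eselist.rds")
app <- prepareApp("rnaseq", eselist)
shiny::shinyApp(app$ui, app$server)
```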
There are a great many experimental designs and analysis methods, and in building Shinyngs I've taken the view that analysis is best left to the analyst. The envisaged use case is that of a bioinformatician attempting to convey results of analysis to non-experts.
Shinyngs provides a number of capabilities you may not find in other applications:
* Simple selection of gene sets by name/annotation to modify the plots and tables shown.
* Progressive filters for differential analysis: "Show me all genes differential in these contrasts but NOT in these other contrasts"
* Large variety of visualisations: row-wise clustering, UpSet-style intersection plots, gene set enrichment barcode plots etc.
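The progressive filter idea boils down to set logic over per-contrast results. A minimal base R sketch, using made-up contrast names and gene IDs (this illustrates the concept only, and is not how Shinyngs implements it internally):

```r
# Hypothetical sets of differential gene IDs, one per contrast
differential <- list(
  drug_vs_saline   = c("geneA", "geneB", "geneC", "geneD"),
  drug_vs_control  = c("geneB", "geneC", "geneD", "geneE"),
  batch2_vs_batch1 = c("geneC", "geneF")
)

# Genes differential in ALL of the 'include' contrasts...
include <- c("drug_vs_saline", "drug_vs_control")
in_all <- Reduce(intersect, differential[include])

# ...but NOT in any of the 'exclude' contrasts
exclude <- c("batch2_vs_batch1")
in_any_excluded <- Reduce(union, differential[exclude])

selected <- setdiff(in_all, in_any_excluded)
print(selected)
#> [1] "geneB" "geneD"
```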
# Screenshot
![Example: the gene page](screenshots/gene_page.png)
## Objectives
* Allow rapid exploration of data output more or less straight from RNA-seq pipelines etc.
* Where more parameters are provided, extend the exploratory tools available - e.g. for differential expression.
## Features
* A variety of single and multiple-panel Shiny applications, currently heatmap, PCA, boxplot, dendrogram, gene-wise barplot, various tables and an RNA-seq app combining all of these.
* Leveraging of libraries such as [DataTables](https://rstudio.github.io/DT/) and [Plotly](https://plot.ly/) for rich interactivity.
* Takes input in an extension of the commonly used `SummarizedExperiment` format, called `ExploratorySummarizedExperiment`
* Interface kept simple where possible, with complexity automatically added where required:
    * Input field clutter reduced with the use of collapses from [shinyBS](https://ebailey78.github.io/shinyBS/index.html) (when installed).
    * If a list of `ExploratorySummarizedExperiment`s is supplied (useful in situations where the features are different between matrices, e.g. from transcript- and gene-level analyses), a selection field will be provided.
    * If a selected experiment contains more than one assay, a selector will again be provided.
* For me: leveraging of [Shiny modules](http://shiny.rstudio.com/articles/modules.html). This makes re-using complex UI components much easier, and maintaining application code is orders of magnitude simpler as a result.
# Modularisation
Shinyngs is built on Shiny 'modules', most of which are in single files in the package code. As a consequence, code is highly re-usable. Documentation is forthcoming, but take a look at how the `selectmatrix` module is called by the PCA plots, boxplots etc.
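For readers new to the pattern, here is a generic sketch of a Shiny module being reused by two parts of an app. It is not taken from the Shinyngs code: `matrixPickerUI` and `matrixPickerServer` are hypothetical names, and it assumes shiny >= 1.5.0 for `moduleServer()`.

```r
library(shiny)

# Module UI: input IDs are wrapped with the namespace function
matrixPickerUI <- function(id, choices) {
  ns <- NS(id)
  selectInput(ns("matrix"), "Matrix", choices = choices)
}

# Module server: returns a reactive that callers can consume
matrixPickerServer <- function(id) {
  moduleServer(id, function(input, output, session) {
    reactive(input$matrix)
  })
}

# Two independent instances of the same module in one app
ui <- fluidPage(
  matrixPickerUI("pca", c("raw", "normalised")),
  matrixPickerUI("boxplot", c("raw", "normalised")),
  verbatimTextOutput("chosen")
)

server <- function(input, output, session) {
  pca_matrix <- matrixPickerServer("pca")
  boxplot_matrix <- matrixPickerServer("boxplot")
  output$chosen <- renderPrint(
    c(pca = pca_matrix(), boxplot = boxplot_matrix())
  )
}

app <- shinyApp(ui, server)
```

Each instance gets its own namespace (`"pca"`, `"boxplot"`), so the same UI and server code can be dropped into several panels without ID clashes.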
# Installation
## Prerequisites
`shinyngs` relies heavily on `SummarizedExperiment`. Formerly found in the `GenomicRanges` package, it now has its own package on Bioconductor: http://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html. This requires a recent version of R.
Graphical enhancements are provided by `shinyBS` and `shinyjs`.
### Browser
**Strong recommendation for Chrome over Firefox** - everything renders much more nicely.
## Conda
shinyngs is available as a Conda package in Bioconda. As always, it's recommended to use a clean environment. With the Bioconda channel [appropriately configured](https://bioconda.github.io/#usage) you can just do:
```{r, engine = 'bash', eval = FALSE}
conda create -n shinyngs r-shinyngs
conda activate shinyngs
```
(though I always recommend the `mamba` command in place of `conda`).
### Note on M1 Macs
At the time of writing the dependency tree for `arm64` was a bit problematic. So just make and use Conda envs specifying Intel architecture:
```{r, engine = 'bash', eval = FALSE}
CONDA_SUBDIR=osx-64 conda create -n shinyngs r-shinyngs
conda activate shinyngs
conda config --env --set subdir osx-64
```
## Docker container
Through the magic of the Bioconda and Biocontainers teams there is also a [Docker image](https://quay.io/repository/biocontainers/r-shinyngs) available.
## Install with devtools
```{r eval=FALSE}
devtools::install_github('pinin4fjords/shinyngs', upgrade = 'never')
```
# Example
An example `ExploratorySummarizedExperimentList` based on the Zhang et al. study of neurons and glia (http://www.jneurosci.org/content/34/36/11929.long) is available in a separate package, and this can be used to demonstrate available features.
Install the package like:
```{r, eval=FALSE}
library(devtools)
install_github('pinin4fjords/zhangneurons')
```
... and load and use the data like:
```{r eval=FALSE}
library(shinyngs)
library(zhangneurons)
data("zhangneurons")
app <- prepareApp("rnaseq", zhangneurons)
shiny::shinyApp(app$ui, app$server)
```
The function `eselistFromYAML()` is provided to help build your own objects given a config file.
# New: command-line interfaces
## App creation
A new feature (may be buggy) is the creation of Shiny apps from file complements:
```
make_app_from_files.R \
    --assay_files raw.tsv,normalised_counts.tsv \
    --sample_metadata samplesheet.csv \
    --feature_metadata gene_meta.tsv \
    --contrast_file contrasts.csv \
    --differential_results treatment-saline-drug.deseq2.results.tsv \
    --output_dir app \
    --contrast_stats_assay 2 \
    --unlog_foldchanges
```
(This script can be found under `exec`).
This is designed to take a regular file complement of
- Expression matrices
- Metadata (samples and features)
- Contrasts (which sample groups to compare)
- Differential results (e.g. from DESeq2) containing P values and fold changes
... and produce an `app.R`. This currently covers the basic use cases; I haven't got to gene sets etc. yet, which will be future work.
You can start the resulting app locally, by running the `app.R` resulting from the above command.
See `make_app_from_files.R --help` for more info.
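For instance, from an R session in the directory where the command was run, something like the following should work. This assumes `--output_dir app` was used as above, so that the generated script sits at `app/app.R`:

```r
# Assumes make_app_from_files.R was run with `--output_dir app`,
# so the generated script is at app/app.R
shiny::runApp("app")
```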
### shinyapps.io deployment
Specifying the following options to `make_app_from_files.R`, in addition to the above, will trigger a deployment to shinyapps.io where the app can be viewed:
```
--deploy_app \
--shinyapps_account ACCOUNT \
--shinyapps_name APP_NAME
```
You must derive your token and secret from your shinyapps.io account and set them in the environment variables `SHINYAPPS_TOKEN` and `SHINYAPPS_SECRET`, respectively.
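A quick way to confirm that both variables are visible to R before attempting a deployment is a check along these lines. This is just a convenience sketch, not something the script requires you to run:

```r
# Names of the environment variables, as given above
required <- c("SHINYAPPS_TOKEN", "SHINYAPPS_SECRET")

# Sys.getenv() returns "" for variables that are not set
values <- Sys.getenv(required)
missing_vars <- required[values == ""]

if (length(missing_vars) > 0) {
  message("Not set: ", paste(missing_vars, collapse = ", "))
} else {
  message("shinyapps.io credentials found in the environment")
}
```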
This is currently dependent on shinyngs having been installed via devtools, which doesn't happen in the Conda install, but I'm trying to address that.
## Static plot generation
I've found it useful to reuse some of the plotting components in shinyngs to produce non-Shiny plot outputs for use in static reporting.
### Exploratory analysis
A generic complement of exploratory plots can be generated like:
```
exploratory_plots.R \
    --assay_files salmon.merged.gene_counts.tsv,normalised_counts.tsv,variance_stabilised_counts.tsv \
    --assay_names raw,normalised,variance_stabilised \
    --sample_metadata samplesheet.csv \
    --contrast_variable treatment \
    --outdir plots \
    --feature_metadata gene_meta.tsv
```
See `exploratory_plots.R --help` for more info.
### Differential analysis
Differential analysis plots, currently just volcano plots, can be generated with `differential_plots.R`. See `differential_plots.R --help` for more info.
### Validation
shinyngs has some good validation when building objects, to make sure that matrices are consistent with sample and feature annotations, and that the specified contrasts make sense. Accessing that logic by itself can be useful when writing FOM (feature/observation matrix) workflows, so that is available separately like:
```
validate_fom_components.R \
    --sample_metadata=testdata/samplesheet.csv \
    --assay_files=testdata/SRP254919.salmon.merged.gene_counts.top1000cov.tsv \
    --contrasts_file testdata/contrasts.csv \
    --output_directory output
```
If `--output_directory` is specified, results are re-written (in a consistent format, TSV by default) to the specified location.
This script will error if there are inconsistencies between sample sheets, feature sets, matrices, and contrast specifications.
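To give a flavour of the kind of inconsistency meant here, the following base R sketch compares a toy sample sheet against the columns of a toy matrix. The data and variable names are made up, and this is not the validation code shinyngs itself uses:

```r
# Hypothetical toy inputs standing in for a sample sheet and an assay matrix
samples <- data.frame(
  sample = c("s1", "s2", "s3"),
  treatment = c("saline", "drug", "drug")
)
counts <- matrix(
  1:6,
  nrow = 2,
  dimnames = list(c("gene1", "gene2"), c("s1", "s2", "s4"))
)

# Samples with no matrix column, and matrix columns with no sample row
missing_from_matrix <- setdiff(samples$sample, colnames(counts))
missing_from_sheet <- setdiff(colnames(counts), samples$sample)

problems <- c(
  if (length(missing_from_matrix) > 0)
    paste("No matrix column for:", paste(missing_from_matrix, collapse = ", ")),
  if (length(missing_from_sheet) > 0)
    paste("No sample sheet row for:", paste(missing_from_sheet, collapse = ", "))
)

# A real validator would stop() at this point; this sketch only reports
print(problems)
#> [1] "No matrix column for: s3"    "No sample sheet row for: s4"
```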
# Documentation
Technical information can be accessed via the package documentation:
```{r eval = FALSE}
?shinyngs
```
More user-oriented documentation and examples of how to build your own apps are in the [vignette](https://rawgit.com/pinin4fjords/shinyngs/master/vignettes/shinyngs.html).
This is also accessible via the `vignette` command:
```{r eval = FALSE}
vignette('shinyngs')
```
# TODO
* More useful non-RNAseq functionality to be added
# Contributors
I can be reached on @pinin4fjords with any queries. Other contributors welcome.
# License
[GNU Affero General Public License v3.0](LICENSE.txt)