-
Notifications
You must be signed in to change notification settings - Fork 7
/
Copy pathREADME.Rmd
313 lines (272 loc) · 12.9 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
---
title:
output:
html_document:
keep_md: yes
---
```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
[![DOI](https://joss.theoj.org/papers/10.21105/joss.04439/status.svg)](https://doi.org/10.21105/joss.04439)
[![Twitter Follow](https://img.shields.io/twitter/follow/musclesyneRgies?style=social)](https://twitter.com/musclesyneRgies)
# musclesyneRgies
![](./images/musclesyneRgies_logo.png)
The package `musclesyneRgies` allows to extract muscle synergies from electromyographic (EMG) data through linear decomposition based on unsupervised machine learning. Specifically, here we adopted the non-negative matrix factorization (NMF) framework, due to the non-negative nature of EMG biosignals. However, this method can be applied to any other kind of data sets, from time series to images.
Muscle synergies are orchestrated activations of functionally-similar muscle groups. The theory stems from original thoughts by the neurophysiologist Nikolai Bernstein and it suggests that the central nervous system might take advantage of such a strategy to simplify the production and control of movement. Instead of commanding each muscle individually for executing a particular task, the central nervous system might send common (but individually weighted) commands to several muscles at the same time. An idea that can be modelled by linear decomposition algorithms such as NMF, the output of which consists of a set of time-independent (also called muscle weights) and a set of time dependent coefficients (also called activation patterns).
If you use this R package, please cite [Santuz, 2022](https://joss.theoj.org/papers/10.21105/joss.04439).
## Installation
- [Download R](https://cran.r-project.org/mirrors.html) and install (please have R >= `4.1.0`)
- [Download RStudio](https://www.rstudio.com/products/rstudio/download/) and install
- Open RStudio and install the package with `install.packages("musclesyneRgies")`.
Done! Now the package is installed on your computer.
## What this package does:
- Help you to prepare the raw data sets in the correct format
- Filter and normalise raw EMG
- Extract muscle synergies
- Classify the extracted muscle synergies
- Analyse muscle synergies with linear and nonlinear metrics
- Plot any data set involved in the process
- All the above tweakable, but with sensible defaults.
## What this package does not do:
- Run the statistics for you
- All that is not specified in the list above.
## A simple workflow
The most simplified workflow for synergy extraction could look as follows (please note that the next chunk of code does not refer to real data and is only intended as a mock example to help you write your own scripts).
```{r eval = FALSE}
# The simplest formulation, using the native (R >= `4.1.0`) pipe operator
# Here, the raw data set is already in the correct format and named `RAW_DATA`
SYNS_classified <- lapply(RAW_DATA, filtEMG) |>
lapply(function(x) normEMG(x, cycle_div = c(100, 100))) |>
lapply(synsNMF) |>
classify_kmeans()
```
Compact, isn't it? You can of course tweak and tune all the above to fit your scientific needs and more is explained below and in the [vignettes](https://github.com/alesantuz/musclesyneRgies/tree/master/vignettes).
To try the above code with real data, it is possible to download the test data set from Zenodo as follows:
```{r eval = FALSE}
# Download test data set containing EMG from human locomotion
url <- "https://zenodo.org/record/6645483/files/RAW_DATA.RData"
download.file(url, destfile = "RAW_DATA.RData", mode = "wb")
# Load test data set
load("RAW_DATA.RData")
# Subset data to allow for correct classification (TW = treadmill walking, TR = treadmill running)
RAW_DATA <- RAW_DATA[grep("_TW_", names(RAW_DATA))]
```
## How to prepare your data set
The data set must be in a specific format to fit the analysis framework. However, if you worked with versions <= `0.8.7-alpha` you will find that requirements are now much less stringent, to the benefit of usability. What you need (see also `?rawdata`) is a list of objects of class `EMG`, each of which is a list with two elements:
- `cycles` data frame containing cycle timings, with as many columns as many cycle subdivisions are wanted
- `emg` data frame containing raw EMG data in columns, first column must be time in the same units as in the cycle timings.
Here is an example of how those two elements should look like:
```{r}
library(musclesyneRgies)
data("RAW_DATA")
head(RAW_DATA[[1]]$cycles)
head(RAW_DATA[[1]]$emg)
```
In this example, cycle times are saved as foot touchdown (first column) and lift-off (second column) times, as the dataset describes locomotion. As you might have noticed, column names do not matter for the `cycles` data frame, but they do for `emg`: this is convenient for the subsequent analysis, since you don't want to loose track of which columns refer to which muscle. Also, the first column must always contain time information, in the same format as in the `cycles` data frame (preferably in seconds).
If you feel like this is too convoluted or you prefer to work directly with ASCII files such as tab-separated txt or comma-separated csv, you can proceed as follows:
- Put your cycle-timings and raw emg ASCII files in two separate folders; please note that the file names **must** be the same (ideally containing the trial codes but this is up to you)
- Run the function `rawdata` which will ask you where your files are and will build the R list in the correct format for you.
Here is an example of how to use the function `rawdata`, no data is needed since the code uses the package's built-in data set to create ASCII files that will then be re-imported through the function:
```{r}
# Load the package
library(musclesyneRgies)
# Load built-in data set
data("RAW_DATA")
# Get current working directory
data_path <- getwd()
data_path <- paste0(data_path, .Platform$file.sep)
# Create two conveniently-named subfolders if they don't already exist
# (if they exist, please make sure they're empty!)
dir.create("cycles", showWarnings = FALSE)
dir.create("emg", showWarnings = FALSE)
# Export ASCII data from built-in data set to the new subfolders
write.table(RAW_DATA[[1]]$cycles,
file = paste0(data_path, "cycles", .Platform$file.sep, names(RAW_DATA)[1], ".txt"),
sep = "\t", row.names = FALSE
)
write.table(RAW_DATA[[1]]$emg,
file = paste0(data_path, "emg", .Platform$file.sep, names(RAW_DATA)[1], ".txt"),
sep = "\t", row.names = FALSE
)
# Run the function to parse ASCII files into objects of class `EMG`
raw_data_from_files <- rawdata(
path_cycles = paste0(data_path, "/cycles/"),
path_emg = paste0(data_path, "/emg/"),
header_cycles = FALSE
)
# Check data in the new folders if needed before running the following (will delete!)
# Delete folders
unlink("cycles", recursive = TRUE)
unlink("emg", recursive = TRUE)
```
## Workflow example
All the code in this section will work as in the example if you copy and paste it in R or RStudio.
```{r message = FALSE, results = "hide", fig.width = 7, fig.asp = 0.9}
# Load the package
library(musclesyneRgies)
# Load the built-in example data set
data("RAW_DATA")
# Say you recorded more cycles than those you want to consider for the analysis
# You can subset the raw data (here we keep only 3 cycles, starting from the first)
RAW_DATA_subset <- lapply(
RAW_DATA,
function(x) {
subsetEMG(x,
cy_max = 3,
cy_start = 1
)
}
)
# Raw EMG can be plotted with the following (the first three seconds are plot by default)
# Now also in dark mode if you fancy it
pp <- plot_rawEMG(RAW_DATA[[1]],
trial = names(RAW_DATA)[1],
row_number = 4,
col_number = 4,
dark_mode = TRUE,
line_col = "tomato3"
)
```
```{r message = FALSE, results = "hide", fig.width = 7, fig.asp = 0.9}
# The raw EMG data set then needs to be filtered
# If you don't want to subset the data set, just filter it as it is
# Here we filter the whole data set with the default parameters for locomotion:
# - Demean EMG
# - High-pass IIR Butterworth 4th order filter (cut-off frequency 50 Hz)
# - Full-wave rectification (default)
# - Low-pass IIR Butterworth 4th order filter (cut-off frequency 20 Hz)
# - Minimum subtraction
# - Amplitude normalisation
filtered_EMG <- lapply(RAW_DATA, function(x) filtEMG(x))
# If you decide to change filtering parameters, just give them as arguments:
another_filtered_EMG <- lapply(
RAW_DATA,
function(x) {
filtEMG(x,
demean = FALSE,
rectif = "halfwave",
HPf = 30,
HPo = 2,
LPf = 10,
LPo = 2,
min_sub = FALSE,
ampl_norm = FALSE
)
}
)
# Now the filtered EMG needs some time normalisation so that cycles will be comparable
# Here we time-normalise the filtered EMG, including only three cycles and trimming first
# and last to remove unwanted filtering effects
# Each cycle is divided into two parts, each normalised to a length of 100 points
norm_EMG <- lapply(
filtered_EMG,
function(x) {
normEMG(x,
trim = TRUE,
cy_max = 3,
cycle_div = c(100, 100)
)
}
)
# If this cycle division does not work for you, it can be changed
# But please remember to have the same amount of columns in the cycle times as the number
# of phases you want your cycles to be divided into
# Here we divide each cycle with a ratio of 60%-40% and keep only two cycles (first and last
# are still trimmed, so to have two cycles you must start with at least four available)
another_norm_EMG <- lapply(
filtered_EMG,
function(x) {
normEMG(x,
trim = TRUE,
cy_max = 2,
cycle_div = c(120, 80)
)
}
)
# The filtered and time-normalised EMG can be plotted with the following
pp <- plot_meanEMG(
norm_EMG[[1]],
trial = names(norm_EMG)[1],
row_number = 4,
col_number = 4,
dark_mode = TRUE,
line_size = 0.8,
line_col = "tomato3"
)
```
```{r message = FALSE, results = "hide", fig.width = 6, fig.asp = 1}
# At this stage, synergies can be extracted
# This is the core function to extract synergies via NMF
SYNS <- lapply(norm_EMG, synsNMF)
# The extracted synergies can be plotted with the following
pp <- plot_syn_trials(
SYNS[[1]],
max_syns = max(unlist(lapply(SYNS, function(x) x$syns))),
trial = names(SYNS)[1],
dark_mode = TRUE,
line_size = 0.8,
line_col = "tomato1",
sd_col = "tomato4"
)
```
```{r message = FALSE, results = "hide", fig.width = 5, fig.asp = 0.7}
# Now synergies don't have a functional order and need classification
# Let's load the built-in data set to have some more trial to classify
# (clustering cannot be done on only one trial and having just a few,
# say less than 10, won't help)
data("SYNS")
# Classify with k-means and produce a plot that shows how the clustering went with:
# - Full width at half maximum on the x-axis
# - Centre of activity on the y-axis
# (both referred to the activation patterns of the classified muscle synergies)
SYNS_classified <- classify_kmeans(SYNS)
```
```{r message = FALSE, results = "hide", fig.width = 6, fig.asp = 1}
# Classified synergies can be finally plotted with
pp <- plot_classified_syns(
SYNS_classified,
dark_mode = TRUE,
line_col = "tomato1",
sd_col = "tomato4",
condition = "TW"
) # "TW" = Treadmill Walking, change with your own
```
```{r message = FALSE, results = "hide", fig.width = 5, fig.asp = 1.4}
# A 2D UMAP plot of the classified synergies can be obtained with
pp <- plot_classified_syns_UMAP(
SYNS_classified,
condition = "TW"
)
```
```{r results = "hide", fig.width = 4, fig.asp = 1}
# From now on, it's all about the analysis
# For example, one can measure the full width at half maximum (FWHM)
# of the activation patterns or their centre of activity (CoA)
# Load a typical activation pattern of 30 cycles (from locomotion)
data("act_pattern")
# Reduce activation pattern to the first cycle
act_sub <- act_pattern$signal[1:which(act_pattern$time == max(act_pattern$time))[1]]
# Calculate FWHM of the first cycle
act_sub_FWHM <- FWHM(act_sub)
# Calculate CoA of the first cycle
act_sub_CoA <- CoA(act_sub)
# Half maximum (for the plots)
hm <- min(act_sub) + (max(act_sub) - min(act_sub)) / 2
hm_plot <- act_sub
hm_plot[which(hm_plot > hm)] <- hm
hm_plot[which(hm_plot < hm)] <- NA
# Plots
plot(act_sub, ty = "l", xlab = "Time", ylab = "Amplitude")
lines(hm_plot, lwd = 3, col = 2) # FWHM (horizontal, in red)
graphics::abline(v = act_sub_CoA, lwd = 3, col = 4) # CoA (vertical, in blue)
# Or perhaps one might want to investigate the nonlinear behaviour of a long activation pattern
act <- act_pattern$signal
# Calculate the local complexity or Higuchi's fractal dimension (HFD)
nonlin_HFD <- HFD(act)$Higuchi
# Calculate the global complexity or Hurst exponent (H)
nonlin_H <- Hurst(act, min_win = max(act_pattern$time))$Hurst
message("Higuchi's fractal dimension: ", round(nonlin_HFD, 3))
message("Hurst exponent: ", round(nonlin_H, 3))
```
## How to contribute to `musclesyneRgies`
Thank you for taking the time to read this. Please refer to the [CONTRIBUTING](https://github.com/alesantuz/musclesyneRgies/blob/master/CONTRIBUTING.md) section for a guide on how to contribute to this package.