changed page style and added TIDE and CellFie pages
xavierbemo committed Dec 4, 2024

1 parent 34b3f18 commit fb16836
Showing 10 changed files with 255 additions and 111 deletions.
42 changes: 34 additions & 8 deletions _config.yml
Original file line number Diff line number Diff line change
@@ -1,9 +1,35 @@
title: MTEAPY's homepage
description: Python library for Metabolic Task Enrichment Analysis
theme: jekyll-theme-cayman
# remote_theme: zendesk/jekyll-theme-zendesk-garden@main
show_downloads: true
title: MTEApy
description: A Python library for Metabolic Task Enrichment Analysis
theme: just-the-docs
repository: bsc-life/mteapy

github:
  zip_url: https://github.com/bsc-life/mteapy/zipball/main
  tar_url: https://github.com/bsc-life/mteapy/tarball/main
aux_links_new_tab: true
aux_links:
  View MTEApy on GitHub:
    - https://github.com/bsc-life/mteapy
nav_links_new_tab: true
nav_external_links:
  - title: Download .tar.gz
    url: https://github.com/bsc-life/mteapy/tarball/main
  - title: View MTEApy on GitHub
    url: https://github.com/bsc-life/mteapy

footer_content: true
color_scheme: light

callouts_level: quiet # or loud
callouts:
  highlight:
    color: yellow
  important:
    title: Important
    color: blue
  new:
    title: New
    color: green
  note:
    title: Note
    color: purple
  warning:
    title: Warning
    color: red
3 changes: 3 additions & 0 deletions _includes/footer_custom.html
@@ -0,0 +1,3 @@
<footer class="site-footer">
<span class="site-footer-owner"></span>
</footer>
1 change: 1 addition & 0 deletions _includes/head_custom.html
@@ -0,0 +1 @@
<link rel="stylesheet" href="{{ '/assets/css/custom_style.css' | relative_url }}">
Empty file added _includes/header_custom.html
Empty file.
6 changes: 6 additions & 0 deletions _includes/nav_footer_custom.html
@@ -0,0 +1,6 @@
<footer class="site-footer">
{% if site.github.is_project_page %}
<span class="site-footer-owner">{{ site.description }}<p></p></span>
<span class="site-footer-owner"><a href="{{ site.github.repository_url }}">{{ site.github.repository_name }}</a> is maintained by <a href="{{ site.github.owner_url }}">{{ site.github.owner_name }}</a>.</span>
{% endif %}
</footer>
42 changes: 0 additions & 42 deletions _layouts/default.html

This file was deleted.

22 changes: 3 additions & 19 deletions assets/css/custom_style.css
@@ -1,19 +1,3 @@
body { padding: 0; margin: 0; font-family: "Ubuntu"; font-size: 16px; line-height: 1.5; color: #606c71; }


.page-header { color: #fff; text-align: center; background-color:orange; background-image: linear-gradient(120deg, orange, purple); }
@media screen and (min-width: 64em) { .page-header { padding: 5rem 6rem; } }
@media screen and (min-width: 42em) and (max-width: 64em) { .page-header { padding: 3rem 4rem; } }
@media screen and (max-width: 42em) { .page-header { padding: 2rem 1rem; } }


.project-name { margin-top: 0.1rem; margin-bottom: 0.1rem; }
@media screen and (min-width: 64em) { .project-name { font-size: 3.25rem; } }
@media screen and (min-width: 42em) and (max-width: 64em) { .project-name { font-size: 2.25rem; } }
@media screen and (max-width: 42em) { .project-name { font-size: 1.75rem; } }


.main-content h1, .main-content h2, .main-content h3, .main-content h4, .main-content h5, .main-content h6 { margin-top: 2rem; margin-bottom: 1rem; font-weight: bold; color: orange; }


.main-content code { padding: 2px 4px; font-family: Consolas, "Liberation Mono", Menlo, Courier, monospace; font-size: 0.9rem; color: #567482; background-color: #f3f6fa; border-radius: 0.5rem; }
a { color: rgb(255, 102, 0); text-decoration: none; }
h1, h2, h3, h4, h5, h6, #toctitle { margin-top: 0; margin-bottom: 1em; font-weight: 500; line-height: 1.25; color: rgb(255, 102, 0); }

95 changes: 95 additions & 0 deletions cellfie.md
@@ -0,0 +1,95 @@
---
title: CellFie
layout: default
nav_order: 3
---

# **The CellFie framework**
{: .no_toc }
***

## Table of contents
{: .no_toc .text-delta }
1. TOC
{:toc}

## Description

The **CellFie** framework is a constraint-based metabolic modeling framework that was originally published by [Richelle _et al._, 2021](https://doi.org/10.1016/j.crmeth.2021.100040). It combines mathematical descriptions of metabolic functions (metabolic tasks) with transcriptomics data to quantify the activity of those functions. As opposed to TIDE, the CellFie framework can process multiple samples at a time, making it suitable for large datasets and single-cell RNA sequencing.

## Command options

| Argument | Shortcut | Description | Default |
|:-------- |:-------- |:----------- |:------- |
| `expr_file` | | Filename for a normalized gene expression file (e.g., TPM). It should contain at least one column with gene names/symbols. | |
| `--delim` | `-d` | Field delimiter for the input file. | `\t` |
| `-out` | `-o` | Directory to store the analysis results. The result file(s) will be stored in the specified directory in a tab-separated format (`.tsv`). | `CellFie_results/` |
| `--gene_col` | | Name of the column in the input file containing gene names/symbols. | `geneID` |
| `--threshold_type` | | Determines the threshold approach to be used. A `global` approach uses the same threshold for all genes, whereas a `local` approach uses a different threshold for each gene when computing the gene activity levels. | `local` |
| `--global_threshold_type` | | Whether to use a `value` or a `percentile` of the distribution of all genes as the global threshold for all genes. | `percentile` |
| `--global_value` | | Value to use as global threshold according to the `global_threshold_type` option selected. Note that percentile values must be between 0 and 1. | `0.75` |
| `--local_threshold_type` | | Determines the threshold type to be used in a local approach. `minmaxmean`: the threshold for each gene is the mean of its expression values across all conditions/samples, but it must be greater than or equal to a lower bound and less than or equal to an upper bound. `mean`: the threshold of a gene is its mean expression across all conditions/samples. | `minmaxmean` |
| `--minmaxmean_threshold_type` | | Whether to use `value` or `percentile` of the distribution of all genes as upper and lower bounds. | `percentile` |
| `--upper_bound` | | Upper bound value to be used according to the `minmaxmean_threshold_type`. Note that percentile values must be between 0 and 1. | `0.75` |
| `--lower_bound` | | Lower bound value to be used according to the `minmaxmean_threshold_type`. Note that percentile values must be between 0 and 1. | `0.25` |
| `--binary_scores` | | Flag to indicate whether to also return the binary metabolic score matrix as a second result file. See the original publication for more details. | `False` |
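
The `minmaxmean` rule described in the table can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not MTEApy's actual implementation: the function names, the linear-interpolation percentile and the toy data are all hypothetical, and the bounds are assumed to be percentiles of every expression value in the matrix.

```python
from statistics import mean


def percentile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def minmaxmean_thresholds(expression, lower_q=0.25, upper_q=0.75):
    """Per-gene threshold: mean expression clipped to global percentile bounds.

    `expression` maps a gene ID to its expression values across samples.
    """
    # Assumption: the bounds come from the distribution of all values.
    all_values = [v for values in expression.values() for v in values]
    lower_bound = percentile(all_values, lower_q)
    upper_bound = percentile(all_values, upper_q)
    return {
        gene: min(max(mean(values), lower_bound), upper_bound)
        for gene, values in expression.items()
    }


# Toy data, made up for illustration only.
expression = {
    "geneA": [0.0, 1.0, 2.0],
    "geneB": [4.0, 5.0, 6.0],
    "geneC": [9.0, 10.0, 11.0],
}
thresholds = minmaxmean_thresholds(expression)
print(thresholds)
```

In this toy case the lowly expressed gene is raised to the lower bound and the highly expressed gene is capped at the upper bound, while the gene in between keeps its own mean as threshold.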

## Usage Example

### Transcriptomics Data

The first thing the CellFie framework requires is a normalized gene expression matrix (usually stored as TPMs). Normally, this type of data has genes as rows and samples as columns. For the command to run, one of the columns of the matrix must contain the gene names/symbols.

A typical normalized gene expression matrix will look like the following:

```
geneID S1 S2 S3 S4
0 ENSG00000000419 6.721972 7.768211 0.111999 0.561086
1 ENSG00000001036 5.880123 10.804611 4.273897 3.703098
2 ENSG00000001084 13.568022 11.912389 21.792070 4.126645
3 ENSG00000001630 9.830659 10.973878 16.052115 3.264040
4 ENSG00000002549 10.312642 10.373970 6.246490 0.597024
... ... ... ... ... ...
```
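
Before running the command, it can help to check that the file has the expected gene column and that every sample column is numeric. The following is a minimal sketch using only the Python standard library; `read_expression_table` is a hypothetical helper (not part of MTEApy), and the in-memory text stands in for the first two rows of the matrix above, written as tab-separated values.

```python
import csv
import io


def read_expression_table(handle, gene_col="geneID", delimiter="\t"):
    """Parse a delimited expression table into {gene: {sample: value}}."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    if reader.fieldnames is None or gene_col not in reader.fieldnames:
        raise ValueError(f"column {gene_col!r} not found in header")
    samples = [name for name in reader.fieldnames if name != gene_col]
    table = {}
    for row in reader:
        # float() raises ValueError on non-numeric cells, flagging bad input early.
        table[row[gene_col]] = {sample: float(row[sample]) for sample in samples}
    return table


# Stand-in for a real file; replace with open("expression_file.tsv").
example = io.StringIO(
    "geneID\tS1\tS2\tS3\tS4\n"
    "ENSG00000000419\t6.721972\t7.768211\t0.111999\t0.561086\n"
    "ENSG00000001036\t5.880123\t10.804611\t4.273897\t3.703098\n"
)
table = read_expression_table(example)
print(sorted(table))
```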

### Running CellFie

To run the CellFie framework from the command line, use the command `run-mtea CellFie` with the desired arguments. A typical CellFie analysis uses the `minmaxmean` local thresholding strategy, which is the command's default, with percentile upper and lower bounds of `0.75` and `0.25`.

{: .note}
Only the **Human-GEM** and its metabolic tasks are implemented, so the framework will only take in **EnsemblIDs** as valid genic nomenclature. We are working to allow for any metabolic model and metabolic tasks to be used for more customisable analyses!

```sh
run-mtea CellFie expression_file.tsv \
-o results/ \
--gene_col geneID \
--threshold_type local \
--local_threshold_type minmaxmean \
--minmaxmean_threshold_type percentile \
--upper_bound 0.75 \
--lower_bound 0.25 \
--binary_scores
```

### Understanding the CellFie results

Once the analysis is run, one or two results files will be stored in the specified directory.

| Result File | Description |
|:----------- |:----------- |
| `cellfie_scores.tsv` | Main result file containing the metabolic activity score values. Columns represent the samples in the original gene expression file, and rows represent all the different metabolic tasks (stored in the `task_id` column). |
| `cellfie_binary_scores.tsv` | Secondary result file that will only be generated if the flag `--binary_scores` is specified. It has the same structure as the main result file, but contains the binary interpretation of the activity of a metabolic task (`0` if the task is considered inactive, `1` if the task is considered active). |

A standard run of the CellFie framework should produce a `cellfie_scores.tsv` file similar to the following:

```
task_id S1 S2 S3 S4
0 X1 0.076515 0.671443 0.050100 0.733470
1 X2 0.863416 0.561653 1.204112 1.253820
2 X3 1.354543 0.889970 1.586738 2.489626
3 X4 1.195976 1.961420 1.423547 1.644596
4 X5 1.554831 1.785477 1.541452 1.704194
.. ... ... ... ... ...
```

Metabolic tasks are stored using their internal IDs, and their metadata can easily be retrieved from the [task_info/](https://github.com/bsc-life/mteapy/tree/main/task_info) folder in the MTEApy repository.
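
As a quick way to explore the scores, the sketch below finds the highest-scoring task for each sample using only the Python standard library. It assumes the file on disk is plain tab-separated text with a `task_id` column followed by one column per sample; the in-memory text is a shortened stand-in for `cellfie_scores.tsv`.

```python
import csv
import io

# Assumed file layout: a task_id column followed by one column per sample.
scores_tsv = io.StringIO(
    "task_id\tS1\tS2\n"
    "X1\t0.076515\t0.671443\n"
    "X2\t0.863416\t0.561653\n"
    "X3\t1.354543\t0.889970\n"
)

rows = list(csv.DictReader(scores_tsv, delimiter="\t"))
samples = [name for name in rows[0] if name != "task_id"]

# For every sample, keep the ID of the task with the largest score.
top_task = {
    sample: max(rows, key=lambda row: float(row[sample]))["task_id"]
    for sample in samples
}
print(top_task)
```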
59 changes: 25 additions & 34 deletions index.md
@@ -1,28 +1,18 @@
---
layout: default
id: home
title: Home
layout: home
nav_order: 1
---

# **MTEApy**

(ASCII-art MTEApy banner)

MTEApy is a Python library for **Metabolic Task Enrichment Analysis** (MTEA) that leverages the use of powerful constraint-based metabolic model frameworks. It uses metabolic tasks to infer the metabolic states or changes using transcriptomic data.
***
**MTEApy** is a Python library for **Metabolic Task Enrichment Analysis** (MTEA) that leverages the use of powerful constraint-based metabolic model frameworks. It uses metabolic tasks to infer the metabolic states or changes using transcriptomic data. This bundle of frameworks was created to give researchers without advanced bioinformatics skills access to constraint-based metabolic modeling approaches through a simple Python library.


## **Installation**
## Installation

To install MTEApy, you can install it using `pip`:
To install MTEApy, you can do so using `pip`:

```sh
pip install mteapy
@@ -35,26 +25,29 @@ git clone https://github.com/bsc-life/mteapy/
pip install -e mteapy/
```

## **Overview**
## Overview

MTEApy comprises two main constraint-based metabolic modeling frameworks, TIDE and CellFie, implemented in Python (the original source code is published in MATLAB in the respective repositories). Each framework runs using different types of input files.

| Framework | Original Code | Description |
| --------- | ------------- | ----------- |
| **CellFie** [[1](#references)] | [LewisLabUCSD/CellFie](https://github.com/LewisLabUCSD/CellFie) | Utilises a normalized expression matrix (e.g., TPMs) to compute a gene activity score using user-defined thresholds, and then projects it into metabolic reactions. Using the participating reactions for each metabolic task, a metabolic score is computed. |
| **TIDE** [[2](#references)] | [csbl/iCardio](https://github.com/csbl/iCardio) | Utilises a differential expression result and its log-FC values to project them into metabolic reactions. Using the participating reactions for each metabolic task, a metabolic score is computed. A p-value is assigned to each score after performing a permutation test. |
| **TIDE-essential** | [bsc-life/mteapy](https://github.com/bsc-life/mteapy) | Utilises a differential expression result, its log-FC and essential genes to metabolic tasks to compute a metabolic score. A p-value is assigned to each score after performing a permutation test. |
|:--------- |:------------- |:----------- |
| **CellFie** [[1](#references)] | [LewisLabUCSD/CellFie](https://github.com/LewisLabUCSD/CellFie) | Utilises a normalized expression matrix (e.g., TPMs) to compute a gene activity score using user-defined thresholds, and then projects it into metabolic reactions. Using the participating reactions for each metabolic task, a metabolic score is computed which indicates the metabolic activity of the metabolic tasks across samples. |
| **TIDE** [[2](#references)] | [csbl/iCardio](https://github.com/csbl/iCardio) | Utilises a differential expression result and its log-FC values to project them into metabolic reactions. Using the participating reactions for each metabolic task, a metabolic score is computed which indicates the change in metabolic activity for one control-sample. A p-value is assigned to each score after performing a permutation test. |
| **TIDE-essential** | [bsc-life/mteapy](https://github.com/bsc-life/mteapy) | Utilises a differential expression result, its log-FC and essential genes to metabolic tasks to compute a metabolic score which indicates the change in metabolic activity for one control-sample. A p-value is assigned to each score after performing a permutation test. |

MTEApy is designed to be used both as a command-line tool and as a Python module in a Jupyter Notebook or Python script. By default, the metabolic model used by the command is the Human-GEM [[3](#references)].

MTEApy is designed to be used both as a command-line tool and as a Python module in a Jupyter Notebook or Python script.
{: .note}
Only the **Human-GEM** and its metabolic tasks are implemented, so the framework will only take in **EnsemblIDs** as valid genic nomenclature. We are working to allow for any metabolic model and metabolic tasks to be used for more customisable analyses!

### Command-line

If used as a command-line tool, run the command `run-mtea` and specify the desired framework. By default, the metabolic model used by the command is the Human-GEM [[3](#references)] and, therefore, the metabolic tasks are also compatible with Human-GEM.
If used as a command-line tool, run the command `run-mtea` and specify the desired framework.

```sh
run-mtea [-h] [-v] [-c] [-t] [-s] {TIDE-essential,TIDE,CellFie}
```
For more details on the input parameters, add `-h` or `--help` after any of the commands.
For more details on the input parameters, add `-h` or `--help` after any of the commands or see the dedicated pages.

### Python module

@@ -65,22 +58,20 @@ from mteapy.tide import compute_TIDE, compute_TIDEe
from mteapy.cellfie import compute_CellFie
```

## **Tutorials**
## Citation

[TO DO]
> Coming soon!
- [TIDE/TIDE-essential]()
- [CellFie]()

## **References**
## References

1. Richelle, A.; Kellman, B.P.; Wenzel, A.T.; Chiang, A.W.; Reagan, T.; Gutierrez, J.M.; Joshi, C.; Li, S.; Liu, J.K.; Masson, H.; _et al._ Model-based assessment of mammalian cell metabolic functionalities using omics data. _Cell Reports Methods_ **2021**, 1, 100040. https://doi.org/10.1016/j.crmeth.2021.100040.
2. Dougherty, B.V.; Rawls, K.D.; Kolling, G.L.; Vinnakota, K.C.; Wallqvist, A.; Papin, J.A. Identifying functional metabolic shifts in heart failure with the integration of omics data and a heart-specific, genome-scale model. _Cell Reports_ **2021**, 34, 108836. https://doi.org/10.1016/j.celrep.2021.108836.
3. Robinson, J.L.; Kocabaş, P.; Wang, H.; Cholley, P.E.; Cook, D.; Nilsson, A.; Anton, M.; Ferreira, R.; Domenzain, I.; Billa, V.; _et al_. An atlas of human metabolism. _Science Signaling_ **2020**, 13, eaaz1482. https://doi.org/10.1126/scisignal.aaz1482.
1. Richelle, A.; Kellman, B.P.; Wenzel, A.T.; Chiang, A.W.; Reagan, T.; Gutierrez, J.M.; Joshi, C.; Li, S.; Liu, J.K.; Masson, H.; _et al._ Model-based assessment of mammalian cell metabolic functionalities using omics data. _Cell Reports Methods_ **2021**, 1, 100040. [https://doi.org/10.1016/j.crmeth.2021.100040](https://doi.org/10.1016/j.crmeth.2021.100040).
2. Dougherty, B.V.; Rawls, K.D.; Kolling, G.L.; Vinnakota, K.C.; Wallqvist, A.; Papin, J.A. Identifying functional metabolic shifts in heart failure with the integration of omics data and a heart-specific, genome-scale model. _Cell Reports_ **2021**, 34, 108836. [https://doi.org/10.1016/j.celrep.2021.108836](https://doi.org/10.1016/j.celrep.2021.108836).
3. Robinson, J.L.; Kocabaş, P.; Wang, H.; Cholley, P.E.; Cook, D.; Nilsson, A.; Anton, M.; Ferreira, R.; Domenzain, I.; Billa, V.; _et al_. An atlas of human metabolism. _Science Signaling_ **2020**, 13, eaaz1482. [https://doi.org/10.1126/scisignal.aaz1482](https://doi.org/10.1126/scisignal.aaz1482).


***
## **Contact**
## Contact

- Xavier Benedicto Molina ([xavier.benedicto@bsc.es](mailto:xavier.benedicto@bsc.es))
- Miguel Ponce-de-León ([miguel.ponce@bsc.es](mailto:miguel.ponce@bsc.es))
96 changes: 88 additions & 8 deletions tide.md
@@ -1,15 +1,95 @@
---
title: The TIDE Framework
title: TIDE
layout: default
id: tide
nav_order: 2
---

# **The TIDE Framework**
# **The TIDE framework**
{: .no_toc }
***

The **TIDE framework** was originally published by Dougherty _et al._ in 2021 (see [https://doi.org/10.1016/j.celrep.2021.108836](https://doi.org/10.1016/j.celrep.2021.108836)).
## Table of contents
{: .no_toc .text-delta }
1. TOC
{:toc}
***

## **Command-line arguments**
## Description

| Argument | Shortcut | Description |
| -------- | ------- | ----------- |
|
The **T**ask **I**nferred from **D**ifferential **E**xpression (TIDE) framework is a constraint-based metabolic modeling framework that was originally published by [Dougherty _et al._, 2021](https://doi.org/10.1016/j.celrep.2021.108836). It combines mathematical descriptions of metabolic functions (metabolic tasks) with the results of a Differential Expression Analysis to study metabolic perturbations in a case-control assay.

## Command options

| Argument | Shortcut | Description | Default |
|:-------- |:-------- |:----------- |:------- |
| `dea_file` | | Filename for a differential expression analysis results file. It should contain at least three columns: genic (string), log-FC (numeric) and significance (numeric, e.g.: p-value, adjusted p-value, FDR). | |
| `--delim` | `-d` | Field delimiter for the input file. | `\t` |
| `-out` | `-o` | Name (and location) of the file in which to store the analysis results. They will be stored in a tab-separated file, so filenames should have the `.tsv` or `.txt` extension. | `tide_results.tsv` |
| `--gene_col` | | Name of the column in the input file containing gene names/symbols. | `geneID` |
| `--lfc_col` | | Name of the column in the input file containing log-FC values. | `log2FoldChange` |
| `--pvalue_col` | | Name of the column in the input file containing significance values. Only required if the flag `--mask_lfc_values` is `True`. | `padj` |
| `--alpha` | `-a` | Significance threshold to mask log-FC. Only required if the flag `--mask_lfc_values` is `True`. | `0.05` |
| `--n_permutations` | `-n` | Number of permutations to infer p-values for the metabolic scores. The resolution of the computed p-values will depend on this number. | `1000` |
| `--n_cpus` | | Number of CPUs for parallel execution. | `1` |
| `--or_func` | | Name of the function that will be used to resolve OR relationships in gene-protein-reaction (GPR) rules. Possible values are `absmax`, which will return the absolute maximum value, and `max`, which will return the maximum value. | `absmax` |
| `--mask_lfc_values` | | Flag to indicate whether to mask log-FC values to 0 according to their significance. That is, if a log-FC value is non-significant (according to the user-defined `--alpha`), it will be masked to 0. | `False` |
| `--random_scores` | | Flag to indicate whether to return the null distribution of random scores used to infer significance along with the results file. | `False` |
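
Two of the options above, `--mask_lfc_values` and `--or_func`, are easier to picture with a small example. The sketch below is an illustration under stated assumptions, not MTEApy's code: the function names are hypothetical, masking is assumed to keep a log-FC only when its significance value is strictly below `alpha`, and `absmax` is assumed to return the value with the largest magnitude while keeping its sign.

```python
def mask_lfc(lfc, pvalue, alpha=0.05):
    """Return the log-FC, or 0.0 when the gene is not significant."""
    # Assumption: significance means pvalue strictly below alpha.
    return lfc if pvalue < alpha else 0.0


def or_max(values):
    """Resolve an OR rule by taking the largest value."""
    return max(values)


def or_absmax(values):
    """Resolve an OR rule by taking the value with the largest magnitude.

    Assumption: the sign of the selected value is preserved.
    """
    return max(values, key=abs)


# Made-up log-FC values for three isozymes joined by OR in a GPR rule.
isozyme_lfcs = [1.5, -3.0, 0.2]
print(or_max(isozyme_lfcs), or_absmax(isozyme_lfcs))

# A gene with padj = 0.051220 is masked at alpha = 0.05.
print(mask_lfc(-10.409959, 0.051220))
```

With `max`, a strongly down-regulated isozyme is ignored in favour of a mildly up-regulated one; with `absmax` under the assumption above, the strongest change drives the reaction value regardless of its direction.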

## Usage example

### Differential expression analysis

The first thing that the TIDE framework requires is a Differential Expression Analysis (DEA) result. Usually, this kind of data is stored in a tabular format and contains at least three columns: gene names/symbols, expression change values (log-FC) and significance (p-value).

A typical DEA result will look like the following:

```
geneID geneSymbol log2FoldChange padj
0 ENSG00000000003 TSPAN6 3.710229 0.259406
1 ENSG00000000005 TNMD -2.437056 0.485180
2 ENSG00000000419 DPM1 8.749658 0.802934
3 ENSG00000000457 SCYL3 -10.409959 0.051220
4 ENSG00000000460 FIRRM -0.977916 0.926198
... ... ... ... ...
```

### Running TIDE

To run the TIDE framework from the command line, use the command `run-mtea TIDE` with the desired arguments. A typical TIDE analysis uses between `1,000` and `10,000` permutations, the `absmax` function to evaluate OR GPR rules, and the `--mask_lfc_values` flag, which masks non-significant log-FC values to 0.

{: .note}
Only the **Human-GEM** and its metabolic tasks are implemented, so the framework will only take in **EnsemblIDs** as valid genic nomenclature. We are working to allow for any metabolic model and metabolic tasks to be used for more customisable analyses!

```sh
run-mtea TIDE dea_file.tsv \
-o results/tide_results.tsv \
-n 1000 \
--n_cpus 4 \
--or_func absmax \
--gene_col geneID \
--lfc_col log2FoldChange \
--pvalue_col padj \
-a 0.05 \
--mask_lfc_values
```
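
The number of permutations sets the resolution of the p-values: with `n` permutations, an empirical p-value can only be a multiple of `1/n`. The sketch below illustrates this with a generic two-sided permutation p-value. It is an assumption made for illustration, not a description of how TIDE computes its p-values, and the random scores are simulated rather than produced by a real analysis.

```python
import random


def permutation_pvalue(observed, random_scores):
    """Two-sided empirical p-value: share of random scores at least as extreme."""
    extreme = sum(1 for score in random_scores if abs(score) >= abs(observed))
    return extreme / len(random_scores)


# Simulated null distribution (hypothetical; a real one comes from permutations).
rng = random.Random(0)
n_permutations = 1000
random_scores = [rng.gauss(0.0, 0.3) for _ in range(n_permutations)]

pvalue = permutation_pvalue(1.15, random_scores)
# Whatever the value, it is a multiple of 1 / n_permutations.
print(pvalue)
```

Raising `-n` from 1,000 to 10,000 makes the smallest distinguishable non-zero p-value ten times smaller, at the cost of a longer run.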

### Understanding the TIDE results

Once the analysis is run, a tabular file containing the analysis results will be saved to the specified location. The results file will contain 7 columns: a task ID, the metabolic score, the mean random score obtained during the permutation test, its associated p-value, and three more columns detailing the metabolic task description, metabolic system and subsystem.

```
task_id score random_score pvalue task_description metabolic_system metabolic_subsystem
0 X159 1.150921 -0.201177 0.000 Linolenate degradation Lipids Metabolism Fatty Acid Metabolism
1 X164 1.216932 -0.185784 0.001 Arachidonate degradation Lipids Metabolism Fatty Acid Metabolism
2 X160 1.026041 -0.184708 0.001 Linoleate degradation Lipids Metabolism Fatty Acid Metabolism
3 X107 1.228656 -0.178454 0.001 Conversion of lysine to L-2-Aminoadipate Amino Acids Metabolism Lysine Metabolism
4 X162 0.857837 -0.182584 0.001 gamma-Linolenate degradation Lipids Metabolism Fatty Acid Metabolism
.. ... ... ... ... ... ... ...
```

The results can then be used to explore the metabolic changes of a case-control sample.

## The TIDE-essential framework

{: .highlight}
Under construction! Please, come back soon.
