mir-seek

Patch: updating steps for novel miR identification and quantification

About

By default, miRDeep2 assigns novel miRs identifiers that are uninformative and hard to read. Internally, miRDeep2 builds each identifier by combining the chromosome on which the miR was found with an internal miRDeep2 counter, so the first miR gets an identifier of the form chrN_1.

Here is an example identifier from the v0.3.0 version of the pipeline:

chr20__AC:CM000682.2__gi:568336004__LN:64444167__rl:Chromosome__M5:b18e6c531b0bd70e949a7fc20859cb01__AS:GRCh38_43554

Note that the extra metadata from the sequence identifier in the genomic FASTA file is also included. This patch renames the novel identifiers produced by miRDeep2 into a more human-readable format. The renamed identifiers contain the following information: chr, start, stop, strand. This patch also accounts for any one-to-many (1:M) relationships between novel mature and precursor miRs, in a similar manner to how we account for them for known miRs.
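To make the structure of the v0.3.0 identifier concrete, here is a minimal Python sketch that splits it into its parts. It is not code from the pipeline: the function name split_old_identifier is hypothetical, and it assumes that the counter is the text after the last underscore and that the FASTA header fields are joined by double underscores, as they appear in the example above.

```python
def split_old_identifier(identifier):
    """Split a v0.3.0-style novel miR identifier into its parts.

    Hypothetical helper for illustration only, not part of mir-seek.
    """
    # Assumption: the miRDeep2 counter follows the last single underscore.
    prefix, counter = identifier.rsplit("_", 1)
    # Assumption: FASTA header fields are joined by double underscores,
    # with the chromosome name as the first field.
    chrom, *metadata = prefix.split("__")
    return chrom, metadata, int(counter)


old_id = (
    "chr20__AC:CM000682.2__gi:568336004__LN:64444167__rl:Chromosome"
    "__M5:b18e6c531b0bd70e949a7fc20859cb01__AS:GRCh38_43554"
)
chrom, metadata, counter = split_old_identifier(old_id)
print(chrom, counter, len(metadata))
```

For the example identifier this yields the chromosome chr20, the counter 43554, and six metadata fields, none of which describe where the miR sits on the chromosome.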

Here is an example of a new identifier produced in v0.3.1:

novel_mir_chr19_29605226_29605283_reverse_strand
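The naming scheme can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline's actual implementation: the function name novel_mir_id is hypothetical, the "+"/"-" strand encoding is assumed, and the forward_strand label is a guess made by analogy with the reverse_strand label seen in the example above.

```python
def novel_mir_id(chrom, start, stop, strand):
    """Build a readable identifier from chr, start, stop, and strand.

    Hypothetical helper for illustration only, not part of mir-seek.
    """
    # Assumption: strands arrive as "+" or "-"; "forward_strand" is an
    # assumed label, only "reverse_strand" appears in the release notes.
    labels = {"+": "forward_strand", "-": "reverse_strand"}
    if strand not in labels:
        raise ValueError(f"unexpected strand: {strand!r}")
    return f"novel_mir_{chrom}_{start}_{stop}_{labels[strand]}"


new_id = novel_mir_id("chr19", 29605226, 29605283, "-")
print(new_id)
```

With the coordinates from the example, the sketch reproduces novel_mir_chr19_29605226_29605283_reverse_strand.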

Full Changelog: v0.3.0...v0.3.1
