Update vignette
adeschen committed Oct 21, 2024
1 parent 54b64ca commit 4d63292
Showing 1 changed file with 31 additions and 7 deletions.
38 changes: 31 additions & 7 deletions vignettes/RAIDS.Rmd
@@ -133,7 +133,7 @@ These steps are described in detail in the following.

### 1.1 Create a directory structure

First, a specific directory structure should be created. The structure must
First, a specific directory structure should be created. The structure must
correspond to this:

```
@@ -150,8 +150,8 @@ workingDirectory/

<br>

This following running example creates a temporary working directory structure
when the example will be run.
The following code creates a temporary working directory structure where the
example will be run.


```{r createDir, echo=TRUE, eval=TRUE, collapse=TRUE, warning=FALSE, message=FALSE}
@@ -306,14 +306,38 @@ data are used to optimize the inference parameters and, with these, the
ancestry of the input profile donor is inferred.

According to the type of input data (RNA or DNA), a specific function
is available.

The *inferAncestry()* function is used for DNA profiles while
is available. The *inferAncestry()* function is used for DNA profiles while
the *inferAncestryGeneAware()* function is RNA specific.

In this example, the profile comes from a DNA source and therefore requires
the *inferAncestry()* function.

The *inferAncestry()* function requires a specific profile input format. The
format is set by the *genoSource* parameter.

One of those formats is the VCF format (*genoSource=c("VCF")*).
This format follows the VCF standard
with at least these genotype fields: _GT_, _AD_ and _DP_.
The SNVs must be germline variants and should include the homozygous
wild-type genotypes at the selected positions in the reference. The VCF file
must be gzipped.
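
As a rough pre-flight check of the requirements above, the following sketch
tests whether a file is gzipped and whether every variant record lists the
_GT_, _AD_ and _DP_ fields. The helpers *isGzipped()* and *checkVcfFormat()*
are hypothetical, written in base R for illustration only, and are not part of
RAIDS. The toy VCF content is invented for the demonstration.

```r
## Hypothetical helper (not part of RAIDS): TRUE when the file starts with
## the gzip magic bytes 0x1f 0x8b
isGzipped <- function(vcfFile) {
    magic <- readBin(vcfFile, what="raw", n=2L)
    identical(magic, as.raw(c(0x1f, 0x8b)))
}

## Hypothetical helper (not part of RAIDS): TRUE when every variant record
## lists all required genotype fields in its FORMAT column
checkVcfFormat <- function(vcfFile, required=c("GT", "AD", "DP")) {
    con <- gzfile(vcfFile, open="rt")
    on.exit(close(con))
    lines <- readLines(con)
    ## Keep only the variant records (header lines start with '#')
    records <- lines[!startsWith(lines, "#") & nzchar(lines)]
    if (length(records) == 0) {
        return(FALSE)
    }
    ## FORMAT is the 9th tab-separated column in the VCF standard
    formatCol <- vapply(strsplit(records, "\t", fixed=TRUE),
                            function(x) x[9], character(1))
    all(vapply(strsplit(formatCol, ":", fixed=TRUE),
                    function(f) all(required %in% f), logical(1)))
}

## Toy gzipped VCF with one homozygous wild-type genotype (invented values)
toyVcf <- tempfile(fileext=".vcf.gz")
con <- gzfile(toyVcf, open="wt")
writeLines(c("##fileformat=VCFv4.2",
    paste(c("#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
                "FORMAT", "sample1"), collapse="\t"),
    paste(c("chr1", "3467333", ".", "A", "G", ".", "PASS", ".",
                "GT:AD:DP", "0/0:20,0:20"), collapse="\t")), con)
close(con)

vcfIsGzipped <- isGzipped(toyVcf)
vcfHasFields <- checkVcfFormat(toyVcf)
```

This check only inspects the file layout. It does not verify that the variants
are germline or that the positions match the reference.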

A generic SNP file can replace the VCF file (*genoSource=c("generic")*).
The format is comma-separated and the mandatory columns are:

* _Chromosome_: The name of the chromosome
* _Position_: The position on the chromosome
* _Ref_: The reference nucleotide
* _Alt_: The alternative nucleotide
* _Count_: The total count
* _File1R_: The count for the reference nucleotide
* _File1A_: The count for the alternative nucleotide

Beware that positions in the **population reference GDS file** are
zero-based (like BED files). Positions in the generic SNP file should also
be zero-based.
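
The generic format described above can be sketched in base R as follows. This
is a minimal illustration, not output from RAIDS: the SNV values are invented,
the "chr" prefix in the chromosome names is an assumption, and _Count_ is
assumed here to be the sum of the reference and alternative counts. The file
extension and compression expected by RAIDS are not stated in this section, so
a plain temporary file is used.

```r
## Toy SNV counts with 1-based positions (invented values)
snv <- data.frame(Chromosome=c("chr1", "chr1", "chr2"),
                    Position=c(3467333L, 3467428L, 4712655L),
                    Ref=c("A", "G", "C"),
                    Alt=c("G", "A", "T"),
                    File1R=c(20L, 11L, 0L),
                    File1A=c(0L, 9L, 15L))

## Shift the 1-based positions to zero-based positions
snv$Position <- snv$Position - 1L

## Assumption: the total count is the reference plus alternative counts
snv$Count <- snv$File1R + snv$File1A

## Order the mandatory columns as listed in the text
snv <- snv[, c("Chromosome", "Position", "Ref", "Alt", "Count",
                "File1R", "File1A")]

## Write a comma-separated file with a header and no row names
genericFile <- tempfile(fileext=".csv")
write.csv(snv, file=genericFile, row.names=FALSE, quote=FALSE)
```

Subtracting one from each position is only needed when the source coordinates
are 1-based, as in a VCF file.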


```{r infere, echo=TRUE, eval=TRUE, collapse=TRUE, warning=FALSE, message=FALSE}
###########################################################################
@@ -330,7 +354,7 @@ if (requireNamespace("GenomeInfoDb", quietly=TRUE) &&
chrInfo <- GenomeInfoDb::seqlengths(genome)[1:25]
#######################################################################
## The SNP VCF file of the DNA profile donor
## The demo SNP VCF file of the DNA profile donor
#######################################################################
fileDonorVCF <- file.path(dataDir, "example", "snpPileup", "ex1.vcf.gz")