Ready to merge: PopPUNK #118

Open
wants to merge 86 commits into
base: evfi_IGV
a5dfd37
create poppunk
evezeyl Mar 19, 2019
757e0e1
Create ResFinder.md
ajkarloss Mar 22, 2019
b39147b
Update ResFinder.md
ajkarloss Mar 22, 2019
a8183ef
Update ResFinder.md
ajkarloss Mar 22, 2019
964b0ca
Update ResFinder.md
ajkarloss Mar 22, 2019
3d2e6cc
Create PlasmidFinder.md
ajkarloss Mar 22, 2019
61965ae
Update ResFinder.md
ajkarloss Mar 22, 2019
2533007
Update PlasmidFinder.md
ajkarloss Mar 22, 2019
c510c2f
Update PlasmidFinder.md
ajkarloss Mar 22, 2019
2d510c4
Create VirulanceFiner.md
ajkarloss Mar 22, 2019
fd88b13
Update ResFinder.md
ajkarloss Mar 22, 2019
28f6f3e
Update PlasmidFinder.md
ajkarloss Mar 22, 2019
be982dc
Create VirulanceFinder.md
ajkarloss Mar 22, 2019
0f069c9
wrong file name no need anymore
ajkarloss Mar 22, 2019
ff383ec
Update ResFinder.md
ajkarloss Mar 25, 2019
eb34b82
Update ResFinder.md
ajkarloss Mar 25, 2019
cd0f713
Update VirulanceFinder.md
ajkarloss Mar 25, 2019
66171d7
Update VirulanceFinder.md
ajkarloss Mar 25, 2019
b7973c2
added PointFinder DB
ajkarloss Mar 26, 2019
1ea57af
backup
evezeyl Mar 26, 2019
dd3ef80
Create PointFinder.md
ajkarloss Mar 28, 2019
8c92573
Update ResFinder.md
ajkarloss Mar 28, 2019
2fce735
Create ShigaTyper.md
ajkarloss Mar 28, 2019
e0b63bd
Update ShigaTyper.md
ajkarloss Mar 28, 2019
2dae08b
update poppunk
evezeyl Mar 29, 2019
a204a90
Merge pull request #117 from NorwegianVeterinaryInstitute/evfi_IGV
karinlag Apr 1, 2019
62bc922
Update README.md
karinlag Apr 1, 2019
2f18b5f
Figure is not correctly formated in the document, for now I commented…
Thomieh73 Apr 1, 2019
2bf8bc8
Merge pull request #119 from Thomieh73/master
Thomieh73 Apr 1, 2019
fa7b688
corrected fig format to md
evezeyl Apr 1, 2019
07901cb
Create SeqSero.md
ajkarloss Apr 4, 2019
5cafa70
Create SeroTypeFinder.md
ajkarloss Apr 4, 2019
d0bfa22
Update PlasmidFinder.md
ajkarloss Apr 4, 2019
0a3a7a3
Update PointFinder.md
ajkarloss Apr 4, 2019
2571471
Update ResFinder.md
ajkarloss Apr 4, 2019
41495ef
Update ResFinder.md
ajkarloss Apr 4, 2019
81d59ab
Update ResFinder.md
ajkarloss Apr 4, 2019
d90cebe
Update PlasmidFinder.md
ajkarloss Apr 4, 2019
edce57f
Update PointFinder.md
ajkarloss Apr 4, 2019
646c192
Update SeqSero.md
ajkarloss Apr 4, 2019
07205b3
Update SeroTypeFinder.md
ajkarloss Apr 4, 2019
e35438b
Update ShigaTyper.md
ajkarloss Apr 4, 2019
14924b0
Update VirulanceFinder.md
ajkarloss Apr 4, 2019
e0b1823
Update PlasmidFinder.md
ajkarloss Apr 5, 2019
c98ca3e
Update PlasmidFinder.md
ajkarloss Apr 5, 2019
b17c912
Update PlasmidFinder.md
ajkarloss Apr 5, 2019
d111803
Update PlasmidFinder.md
ajkarloss Apr 5, 2019
312c3d7
Update PlasmidFinder.md
ajkarloss Apr 5, 2019
59a302f
Update PlasmidFinder.md
ajkarloss Apr 5, 2019
b07cc55
Update PointFinder.md
ajkarloss Apr 5, 2019
aaec97c
Update PointFinder.md
ajkarloss Apr 5, 2019
fe85ca8
Update ResFinder.md
ajkarloss Apr 5, 2019
65e768f
Update SeqSero.md
ajkarloss Apr 5, 2019
b9a3bed
Update SeroTypeFinder.md
ajkarloss Apr 5, 2019
2b5ff6b
Update ShigaTyper.md
ajkarloss Apr 5, 2019
6b13e47
Update VirulanceFinder.md
ajkarloss Apr 5, 2019
bbc088b
Update VirulanceFinder.md
ajkarloss Apr 5, 2019
440f74c
added the modified instruction for md5check
evezeyl Apr 5, 2019
3682d9e
Update SeroTypeFinder.md
ajkarloss Apr 5, 2019
5da7dbf
Update PlasmidFinder.md
ajkarloss Apr 5, 2019
e95f0d1
Update PlasmidFinder.md
ajkarloss Apr 5, 2019
e97f167
Update PointFinder.md
ajkarloss Apr 5, 2019
1ede767
Update ResFinder.md
ajkarloss Apr 5, 2019
fb47266
Update SeqSero.md
ajkarloss Apr 5, 2019
7cbbb55
Update SeroTypeFinder.md
ajkarloss Apr 5, 2019
7bb9c59
updated poppunk - backup
evezeyl Apr 5, 2019
e0f877c
Update ShigaTyper.md
ajkarloss Apr 5, 2019
49ce533
added workflow fig
evezeyl Apr 6, 2019
2bd364d
on the way to be clear
evezeyl Apr 6, 2019
25aed9c
backup
evezeyl Apr 7, 2019
8dca419
almost there with poppunk
evezeyl Apr 9, 2019
a8a8b3b
backup
evezeyl Apr 12, 2019
6b43ad1
backup
evezeyl Apr 17, 2019
2cb17ba
Create pMLST.md
ajkarloss Apr 23, 2019
4294a3b
create
evezeyl Apr 26, 2019
8bd6b48
Merge branch 'hk_ef_R_trees' of https://github.com/NorwegianVeterinar…
evezeyl Apr 26, 2019
4bd8771
backup
evezeyl Apr 26, 2019
4f5e369
reorganised visualisation
evezeyl Apr 30, 2019
90b9eae
nearly finished
evezeyl Apr 30, 2019
f65bee0
ok - need an external reviewer now
evezeyl Apr 30, 2019
86db8f2
improved
evezeyl Apr 30, 2019
2465a7a
clarified origin
evezeyl Apr 30, 2019
901a912
typo
evezeyl Apr 30, 2019
4023d2c
clarrified
evezeyl Apr 30, 2019
a7eb6e4
added training dataset
evezeyl May 1, 2019
e5bae8c
minor improvement
evezeyl May 1, 2019
40 changes: 40 additions & 0 deletions PlasmidFinder.md
@@ -0,0 +1,40 @@
**PlasmidFinder**
-------------------------
PlasmidFinder identifies plasmids in fully or partially sequenced bacterial isolates.
Contact Jeevan on Slack if you run into issues or need further assistance (e.g. running the tool on multiple isolates).

#### User Manual
https://bitbucket.org/genomicepidemiology/plasmidfinder/src

#### For further reading
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4068535/

#### Example SLURM script for executing the tool on Abel
Important rules to follow:
* Refer to the user manual for all of the tool's parameters
* Keep your data in /project/nn9305k/
* Store your results in /project/nn9305k/ as well
* Execute the script from your home directory

```
#!/bin/bash
#SBATCH --job-name=DontKillMe
#SBATCH --account=nn9305k
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=32G

## Set up job environment:
source /cluster/bin/jobsetup

# Activate Conda environment
conda activate PlasmidFinder

# Database location
DB="/work/projects/nn9305k/src/PlasmidFinder/PlasmidFinder_DB/plasmidfinder_db/"

# Note: there is no need to specify the BLAST location
python /work/projects/nn9305k/src/PlasmidFinder/plasmidfinder/plasmidfinder.py -p "$DB" -i input_file -o output_file

# deactivate Conda environment
conda deactivate
```
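
Running the tool on multiple isolates can be sketched as a loop over FASTA files. This is only an illustrative sketch: the helper name `run_all_isolates`, the `.fasta` extension and the per-sample output path are assumptions, while the script path, database path and the `-p`, `-i`, `-o` options are copied from the example above. Whether `-o` expects a file or a directory is described in the user manual. By default the helper only prints the commands (dry run).

```shell
#!/bin/bash
# Paths copied from the example script above.
DB="/work/projects/nn9305k/src/PlasmidFinder/PlasmidFinder_DB/plasmidfinder_db/"
SCRIPT="/work/projects/nn9305k/src/PlasmidFinder/plasmidfinder/plasmidfinder.py"

# Hypothetical helper: build one PlasmidFinder command per FASTA file.
#   $1 = directory with *.fasta files (assumed extension)
#   $2 = directory that collects the results
#   $3 = runner; defaults to "echo", which only prints the commands
run_all_isolates() {
    local in_dir="$1" out_dir="$2" runner="${3:-echo}"
    local fasta sample
    mkdir -p "$out_dir"
    for fasta in "$in_dir"/*.fasta; do
        [ -e "$fasta" ] || continue              # glob matched nothing
        sample="$(basename "$fasta" .fasta)"     # isolate name from file name
        $runner python "$SCRIPT" -p "$DB" -i "$fasta" -o "$out_dir/$sample"
    done
}
```

Calling `run_all_isolates my_assemblies my_results` lists the commands without running anything; passing `command` as the third argument would execute them inside the SLURM script.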
34 changes: 34 additions & 0 deletions PointFinder.md
@@ -0,0 +1,34 @@
**Executing PointFinder**
-------------------------
The tool detects chromosomal mutations predictive of drug resistance based on WGS data.

Contact Jeevan on Slack if you run into issues or need further assistance (e.g. running the tool on multiple isolates).

#### User Manual
https://bitbucket.org/genomicepidemiology/pointfinder

#### Example SLURM script for executing the tool on Abel
Important rules to follow:
* Refer to the user manual for all of the tool's parameters
* Keep your data in /project/nn9305k/
* Store your results in /project/nn9305k/ as well
* Execute the script from your home directory

```
#!/bin/bash
#SBATCH --job-name=DontKillMe
#SBATCH --account=nn9305k
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=32G

## Set up job environment:
source /cluster/bin/jobsetup

conda activate PointFinder

# Database location
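PointFinder_DB="/work/projects/nn9305k/src/PointFinder_DB/src/"
# PointFinder.py is called by relative path here; give its full path if it is not in the working directory
python PointFinder.py -p "$PointFinder_DB" -i input_file -o output_file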

conda deactivate
```
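
The rule about keeping data and results in /project/nn9305k/ can be checked before a job is submitted. The following is a minimal sketch, not part of the tool: the function names `in_project_area` and `check_paths` are hypothetical, and the check is a plain prefix match that does not resolve symlinks or `..` components.

```shell
#!/bin/bash
# Project area taken from the rules above.
PROJECT_ROOT="/project/nn9305k"

# Hypothetical helper: succeeds only if the path lies inside the project area.
in_project_area() {
    case "$1" in
        "$PROJECT_ROOT"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: warn about every argument outside the project area.
# Returns 1 if at least one path is outside, 0 otherwise.
check_paths() {
    local p status=0
    for p in "$@"; do
        if ! in_project_area "$p"; then
            echo "WARNING: $p is outside $PROJECT_ROOT" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_paths "$INPUT" "$OUTPUT" || exit 1` near the top of a job script would stop the job early when either path is misplaced (`INPUT` and `OUTPUT` are placeholder variable names).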
1 change: 1 addition & 0 deletions README.md
@@ -20,6 +20,7 @@ The instructors are:

* [Use of mash](mash.md)
* [Use of conda](conda.md)
* [Mapping and visualization](assembly_visualization.md)


## Course pages
16 changes: 16 additions & 0 deletions R_trees.md
@@ -0,0 +1,16 @@
start with data prep

# Data:


# Required R packages:
`distanceR`
`ggtree`


# make a simple tree with just labels

# Add decorations

Each step has to build on the previous one,
but you need to get something on screen very fast.
38 changes: 38 additions & 0 deletions ResFinder.md
@@ -0,0 +1,38 @@
**Executing ResFinder**
----------------------
Use the code below to execute ResFinder on Abel. There is no need to specify the BLAST location.

ResFinder identifies acquired antimicrobial resistance genes in fully or partially sequenced bacterial isolates.

Contact Jeevan on Slack if you run into issues or need further assistance (e.g. running the tool on multiple isolates).

#### User Manual
https://bitbucket.org/genomicepidemiology/resfinder

#### Example SLURM script for executing the tool on Abel
Important rules to follow:
* Refer to the user manual for all of the tool's parameters
* Keep your data in /project/nn9305k/
* Store your results in /project/nn9305k/ as well
* Execute the script from your home directory

```
#!/bin/bash
#SBATCH --job-name=DontKillMe
#SBATCH --account=nn9305k
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=32G

## Set up job environment:
source /cluster/bin/jobsetup

conda activate ResFinder

# Location of ResFinder DB
RF_DB="/work/projects/nn9305k/src/ResFinder/ResFinderDB/src/"

# Replace input_file with the path to your input file
python /work/projects/nn9305k/src/ResFinder/src/resfinder.py -i input_file -p "$RF_DB" -k /work/projects/nn9305k/src/kma/ -o Output

conda deactivate
```
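
One possible way to process several isolates is a SLURM job array, where each task picks its own input from a list file. This is a hedged sketch: it assumes the cluster's SLURM accepts `#SBATCH --array=...` (check the Abel documentation), and the helper `nth_input` and the list file name `inputs.txt` are hypothetical. `SLURM_ARRAY_TASK_ID` is the variable SLURM sets for each array task.

```shell
#!/bin/bash
# Hypothetical helper: print the N-th line (1-based) of a list file
# that holds one input path per line.
nth_input() {
    local list_file="$1" index="$2"
    sed -n "${index}p" "$list_file"
}

# SLURM sets SLURM_ARRAY_TASK_ID inside an array job; default to 1 elsewhere
# so the script can also be tried outside the queue system.
TASK_ID="${SLURM_ARRAY_TASK_ID:-1}"

# Example use inside the job script (inputs.txt is an assumed file name):
#   INPUT="$(nth_input inputs.txt "$TASK_ID")"
#   python /work/projects/nn9305k/src/ResFinder/src/resfinder.py -i "$INPUT" ...
```

With this layout the tool command itself stays unchanged; only the `-i` argument differs between array tasks.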
33 changes: 33 additions & 0 deletions SeqSero.md
@@ -0,0 +1,33 @@
**Executing SeqSero**
--------------------
SeqSero is a pipeline for Salmonella serotype determination from raw sequencing reads or genome assemblies.

Contact Jeevan on Slack if you run into issues or need further assistance (e.g. running the tool on multiple isolates).

#### User Manual
https://github.com/denglab/SeqSero

#### For further reading
http://jcm.asm.org/content/early/2015/03/05/JCM.00323-15

#### Example SLURM script for executing the tool on Abel
Important rules to follow:
* Refer to the user manual for all of the tool's parameters
* Keep your data in /project/nn9305k/
* Store your results in /project/nn9305k/ as well
* Execute the script from your home directory

```
#!/bin/bash
#SBATCH --job-name=DontKillMe
#SBATCH --account=nn9305k
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=32G

## Set up job environment:
source /cluster/bin/jobsetup

conda activate SeqSero_Shared
python /work/projects/nn9305k/src/SeqSero/SeqSero/SeqSero.py -m "1" -i input_file -b "mem"
conda deactivate
```
32 changes: 32 additions & 0 deletions SeroTypeFinder.md
@@ -0,0 +1,32 @@
**Executing SeroTypeFinder**
============================
SerotypeFinder identifies the serotype in fully or partially sequenced isolates of E. coli.

Contact Jeevan on Slack if you run into issues or need further assistance (e.g. running the tool on multiple isolates).

#### User Manual
https://bitbucket.org/genomicepidemiology/serotypefinder/src/master/


#### Example SLURM script for executing the tool on Abel
Important rules to follow:
* Refer to the user manual for all of the tool's parameters
* Keep your data in /project/nn9305k/
* Store your results in /project/nn9305k/ as well
* Execute the script from your home directory

```
#!/bin/bash
#SBATCH --job-name=DontKillMe
#SBATCH --account=nn9305k
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=32G

## Set up job environment:
source /cluster/bin/jobsetup

conda activate SeroTyperFinder
# Database location
DB="/work/projects/nn9305k/src/SeroTypeFinder/serotypefinder_db/"
python /work/projects/nn9305k/src/SeroTypeFinder/serotypefinder.py -i input_file -o output -p "$DB" -mp /work/projects/nn9305k/src/kma/kma
conda deactivate
```
33 changes: 33 additions & 0 deletions ShigaTyper.md
@@ -0,0 +1,33 @@
**ShigaTyper**
---------------
ShigaTyper is a quick and easy tool for determining the Shigella serotype from Illumina paired-end reads, with low computational requirements.

Contact Jeevan on Slack if you run into issues or need further assistance (e.g. running the tool on multiple isolates).

#### User Manual
https://github.com/CFSAN-Biostatistics/shigatyper

#### For further reading
https://aem.asm.org/content/85/7/e00165-19

#### Example SLURM script for executing the tool on Abel
Important rules to follow:
* Refer to the user manual for all of the tool's parameters
* Keep your data in /project/nn9305k/
* Store your results in /project/nn9305k/ as well
* Execute the script from your home directory

```
#!/bin/bash
#SBATCH --job-name=DontKillMe
#SBATCH --account=nn9305k
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=32G

## Set up job environment:
source /cluster/bin/jobsetup

conda activate ShigaTyper
python /work/projects/nn9305k/src/ShigaTyper/shigatyper/shigatyper.py Read1 Read2 -n sample_name
conda deactivate
```
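
Since ShigaTyper takes two read files and a sample name, the three arguments can be derived from the forward-read file alone. The sketch below assumes a naming convention that is not stated in the original (`_R1.fastq.gz` for forward reads and `_R2.fastq.gz` for reverse reads), and `shigatyper_cmd` is a hypothetical helper name. It only prints the command, using the script path and the `-n` option from the example above.

```shell
#!/bin/bash
# Script path copied from the example above.
SCRIPT="/work/projects/nn9305k/src/ShigaTyper/shigatyper/shigatyper.py"

# Hypothetical helper: print the ShigaTyper command for one forward-read file.
# Assumes reads are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
shigatyper_cmd() {
    local read1="$1"
    local read2="${read1%_R1.fastq.gz}_R2.fastq.gz"   # swap the R1 suffix for R2
    local sample
    sample="$(basename "$read1" _R1.fastq.gz)"        # sample name from file name
    echo "python $SCRIPT $read1 $read2 -n $sample"
}
```

If your files follow a different naming scheme, adjust the two suffixes accordingly before using the printed command.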
36 changes: 36 additions & 0 deletions VirulanceFinder.md
@@ -0,0 +1,36 @@
**Executing VirulenceFinder**
-----------------------------
VirulenceFinder identifies virulence genes in fully or partially sequenced bacterial isolates. At the moment only E. coli, Enterococcus, S. aureus and Listeria are available.

Contact Jeevan on Slack if you run into issues or need further assistance (e.g. running the tool on multiple isolates).

#### User Manual
https://bitbucket.org/genomicepidemiology/virulencefinder

#### For further reading
https://www.ncbi.nlm.nih.gov/pubmed/24574290

#### Example SLURM script for executing the tool on Abel
Important rules to follow:
* Keep your data in /project/nn9305k/
* Store your results in /project/nn9305k/ as well
* Execute the script from your home directory

```
#!/bin/bash
#SBATCH --job-name=DontKillMe
#SBATCH --account=nn9305k
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=32G

## Set up job environment:
source /cluster/bin/jobsetup

module load Miniconda3/4.4.10

# Activate Conda environment
conda activate VirulanceFinder
DB="/work/projects/nn9305k/src/VirulanceFinder/Virulance_DB/virulencefinder_db"
# Add the input and output options described in the user manual
python /work/projects/nn9305k/src/VirulanceFinder/src/virulencefinder.py -p "$DB"
conda deactivate
```
10 changes: 2 additions & 8 deletions assembly_visualization.md
@@ -16,10 +16,7 @@ Usually few mismatches are allowed (think about the consequences).

Reads can be mapped as paired or single. If paired mapping is used, the matching regions are defined by the insert size and the length of each read.

<p align="center">
<a href="https://commons.wikimedia.org/wiki/File:Mapping_Reads.png"> <img src="https://upload.wikimedia.org/wikipedia/commons/2/2e/Mapping_Reads.png" width="400">
<br>
</p>
![https://commons.wikimedia.org/wiki/File:Mapping_Reads.png](https://upload.wikimedia.org/wikipedia/commons/2/2e/Mapping_Reads.png)

### 1.2 Why mapping reads

@@ -170,10 +167,7 @@ In `Bifrost` we annotated the assembly with `Prokka` (using annotations derived

## 2.4 Loading files in [IGV](https://software.broadinstitute.org/software/igv/)

<p align="center">
<img src="./figures/IGV.png">
<br>
</p>
![IGV](./figures/IGV.png)

1. Create a `genome file`, which allows associating tracks with the assembly: `Genomes > create.genome file`. Use the menu to select your assembly file (`.fasta`) and the annotation-gene file (`.gff`).

Binary file added figures/poppunk/2DGMM_fit.png
Binary file added figures/poppunk/2DGMM_fit_DPGMM_fit.png
Binary file added figures/poppunk/2DGMM_refine_refined_fit.png
Binary file added figures/poppunk/DBSCAN_fit_dbscan.png
Binary file added figures/poppunk/poppunk_random_k.png
Binary file added figures/poppunk/refine_poppunk.png
38 changes: 38 additions & 0 deletions pMLST.md
@@ -0,0 +1,38 @@
**Executing pMLST**
----------------------
Use the code below to execute pMLST on Abel.

pMLST enables investigators to determine the plasmid sequence type (ST) based on WGS data.

Contact Jeevan on Slack if you run into issues or need further assistance (e.g. running the tool on multiple isolates).

#### User Manual
https://bitbucket.org/genomicepidemiology/pmlst/src

#### Example SLURM script for executing the tool on Abel
Important rules to follow:
* Refer to the user manual for all of the tool's parameters
* Keep your data in /project/nn9305k/
* Store your results in /project/nn9305k/ as well
* Execute the script from your home directory

```
#!/bin/bash
#SBATCH --job-name=DontKillMe
#SBATCH --account=nn9305k
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=32G

## Set up job environment:
source /cluster/bin/jobsetup

conda activate pMLST

# Location of pMLST DB
pMLST_DB="/work/projects/nn9305k/src/pMLST/pMLST_DB/pmlst_db/"

# Replace input_file with the path to your input file
python3 /work/projects/nn9305k/src/pMLST/src/pmlst.py -i input_file -p "$pMLST_DB" -o Output

conda deactivate
```
