-
Notifications
You must be signed in to change notification settings - Fork 9
/
Copy pathexample_samplesheet.yaml
67 lines (66 loc) · 2.64 KB
/
example_samplesheet.yaml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
# This is an exemplification of the accepted YAML syntax accepted by the bacannot pipeline
#
# The file must contain a header-line with the key "samplesheet:". All the
# input samples must be given nested to this key. Obs: The header-line and
# the nest indentation are REQUIRED.
#
# Each sample must initiate with the key "id:" listed with a "-". This value
# is what will be used as prefix for outputs. The input options (raw reads or
# assembled genomes) must be give right after this key, with the same indent
# without the "-".
#
# The possible inputs are:
#
# (obs: When using both paired and unpaired short reads for the same sample,
# the order: pair1, pair2, unpaired must be followed. Otherwise, just the
# unpaired read or the pairs 1 and 2)
# illumina: (illumina short reads)
# - pair_1.fq
# - pair_2.fq
# - unpaired.fq
#
# nanopore: ont_nanopore.fastq (Oxford nanopore long reads)
# fast5: path/to/fast5_pass (Path to dir containing ONT FAST5s for methylation calling)
# If using an already assembled genome, users may give the nanopore and fast5 data to call methylation
#
# pacbio: pacbio.fastq (Pacbio long reads)
#
# assembly: assembled_genome.fasta (assembled genome)
#
# resfinder: species panel (one of the available resfinder species panels)
#
# Users can give for each sample either the assembled genome or a combination
# of short and long reads (to perform a short reads only, long reads only or
# hybrid assembly of raw reads before annotation)
#
# Obs: the combination of ONT and pacbio reads for the same sample is not supported,
# it must either be one or another.
#
# Obs: The "illumina:" key is the only one the MUST be give in new idented lines
# started by "-". All the others (nanopore, pacbio, assembly, fast5, resfinder)
# must be given right after the ":", without newlines, quotes or signs.
#
# A template (with the correct fields, syntax and indentation) is given below:
samplesheet:
- id: sample_1
illumina:
- sample_1/1.fastq
- sample_1/2.fastq
nanopore: sample_1/ont.fastq
resfinder: Escherichia coli # this tells the pipeline that this specific sample should be treated as E. coli despite any other default set
- id: sample_2
assembly: sample_2/assembly.fasta
nanopore: sample_2/ont.fastq
fast5: sample_2/fast5_pass # will call methylation on assembled genome
- id: sample_3
nanopore: sample_3/ont.fastq
fast5: sample_3/fast5_pass # will also call methylation flye genome assembly
- id: sample_4
pacbio: sample_4/pacbio.fastq
illumina:
- sample_4/merged_unpaired.fastq
- id: sample_5
illumina:
- sample_5/1.fastq
- sample_5/2.fastq
- sample_5/merged.fastq