First commit of SCGPM Bcl2fastq App on DNAnexus
pbilling committed Mar 27, 2017
1 parent 5833c30 commit b8d186a
Showing 14 changed files with 2,320 additions and 0 deletions.
44 changes: 44 additions & 0 deletions scgpm_bcl2fastq/README.md
@@ -0,0 +1,44 @@
# SCGPM Bcl2fastq

## What does this app do?
This app converts BCL files generated by Illumina sequencing platforms into FASTQ files. It is part of the standard sequencing pipeline at the Stanford Center for Genomics and Personalized Medicine (http://med.stanford.edu/gssc.html).
Each run of the app processes data from a single Illumina flowcell lane.

## What are typical use cases for this app?
After sequencing your library on an Illumina platform, use this app to convert the output into FASTQ files.

## What data are required for this app to run?

This app requires:
- A tar archive with sequencing lane files.
- A tar archive with sequencing metadata files.
- (optional) A text file with a list of barcodes.


The tarball files are generated automatically as part of the GSSC pipeline. If your data has already been sequenced through GSSC, you can use the tarballs found in the /raw_data directory of our DNAnexus project.

If not, you can generate the required tarballs with the following commands:

### Generate sequencing archive
$ cd <sequencing_run_directory>
$ tar -cf lane_archive.tar Data/Intensities/BaseCalls/L00N

### Generate metadata archive
$ cd <sequencing_run_directory>
$ tar -cvf metadata_archive.tar runParameters.xml RunInfo.xml RTAConfiguration.xml
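
The two `tar` commands above can also be scripted. The following is a minimal sketch using Python's standard `tarfile` module; the function names `make_lane_archive` and `make_metadata_archive` are hypothetical and not part of this app. It assumes the run directory has the layout shown in the commands, with lane directories named `L001` through `L008`.

```python
import os
import tarfile


def make_lane_archive(run_dir, lane_index, out_path="lane_archive.tar"):
    """Archive one lane's BaseCalls directory (hypothetical helper)."""
    # Lane directories are assumed to be named L001..L008.
    lane_rel = os.path.join(
        "Data", "Intensities", "BaseCalls", "L%03d" % lane_index)
    with tarfile.open(out_path, "w") as archive:
        # arcname keeps member paths relative to the run directory,
        # matching `cd <sequencing_run_directory>` followed by `tar -cf`.
        archive.add(os.path.join(run_dir, lane_rel), arcname=lane_rel)
    return out_path


def make_metadata_archive(run_dir, out_path="metadata_archive.tar"):
    """Archive the three run-level XML files (hypothetical helper)."""
    names = ("runParameters.xml", "RunInfo.xml", "RTAConfiguration.xml")
    with tarfile.open(out_path, "w") as archive:
        for name in names:
            archive.add(os.path.join(run_dir, name), arcname=name)
    return out_path
```

Calling `make_lane_archive("/path/to/run", 3)` would archive `Data/Intensities/BaseCalls/L003`. Both helpers write uncompressed archives, as the shell commands do.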

## What does this app output?

- An array of FASTQ files.
- A lane.html file with basic library read statistics.
- A tools-used file that describes the executables run to generate this data.
- (optional) A sample sheet describing the barcodes used for demultiplexing.

## How does this app work?

This app is essentially a wrapper for Illumina's bcl2fastq program. It does the extra work of automatically generating the sample sheet and base mask patterns used for demultiplexing.

For more information, consult the manual at:

https://support.illumina.com/content/dam/illumina-support/documents/documentation/software_documentation/bcl2fastq/bcl2fastq2_guide_15051736_v2.pdf
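
To illustrate the sample sheet and base mask generation described above, here is a rough sketch. It is not the code shipped in this app: the helper names (`parse_barcodes`, `sample_sheet`, `bases_mask`) are hypothetical. It assumes each barcode line looks like `ATCGT-GTACT SampleA` (or `ATCGT SampleA` for a single index), and that the read layout has already been read from RunInfo.xml as `(num_cycles, is_index_read)` pairs.

```python
def parse_barcodes(lines):
    """Parse lines such as 'ATCGT-GTACT SampleA' into tuples."""
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # A single-index barcode has no hyphen, so index2 is empty.
        index1, _, index2 = fields[0].partition("-")
        # Assumption: fall back to the barcode itself if no name is given.
        sample = fields[1] if len(fields) > 1 else fields[0]
        records.append((index1, index2, sample))
    return records


def sample_sheet(records, lane_index):
    """Render a minimal [Data] section in bcl2fastq2 sample sheet style."""
    rows = ["[Data]", "Lane,Sample_ID,Sample_Name,index,index2"]
    for index1, index2, sample in records:
        rows.append(",".join(
            [str(lane_index), sample, sample, index1, index2]))
    return "\n".join(rows) + "\n"


def bases_mask(read_cycles, index1_len, index2_len=0):
    """Build a --use-bases-mask string.

    read_cycles: list of (num_cycles, is_index_read) in run order.
    Index cycles beyond the barcode length are masked with N.
    """
    barcode_lengths = [index1_len, index2_len]
    parts = []
    index_seen = 0
    for num_cycles, is_index in read_cycles:
        if not is_index:
            parts.append("Y%d" % num_cycles)
            continue
        wanted = barcode_lengths[index_seen] if index_seen < 2 else 0
        used = min(wanted, num_cycles)
        index_seen += 1
        part = ""
        if used:
            part += "I%d" % used
        if num_cycles - used:
            part += "N%d" % (num_cycles - used)
        parts.append(part)
    return ",".join(parts)
```

For a paired-end, dual-index run with 8-cycle index reads and 5-base barcodes, `bases_mask([(101, False), (8, True), (8, True), (101, False)], 5, 5)` gives `Y101,I5N3,I5N3,Y101`.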

191 changes: 191 additions & 0 deletions scgpm_bcl2fastq/dxapp.json
@@ -0,0 +1,191 @@
{
"name": "scgpm_bcl2fastq",
"title": "SCGPM Bcl2fastq",
"summary": "Convert raw Illumina sequencing data to FASTQ files",
"dxapi": "1.0.0",
"version": "1.0.0",
"openSource": true,
"inputSpec": [
{
"name": "run_name",
"label": "Sequencing run name",
"help": "Used for naming auxiliary output files; e.g. 170320_GADGET_0105_BHG5CWBBXX",
"class": "string",
"default": "RunX",
"optional": false
},
{
"name": "lane_index",
"label": "Flowcell lane index",
"help": "Lane index is 1-8 on most platforms. MiSeqs have only one lane, so the index is always 1.",
"class": "int",
"choices": [1,2,3,4,5,6,7,8],
"optional": false
},
{
"name": "project_folder",
"label": "Output location",
"help": "Project and folder location for output files: '/folder'",
"class": "string",
"optional": false,
"default": "/stage0_bcl2fastq"
},
{
"name": "lane_data_tar",
"label": "Lane tar file",
"class": "file",
"patterns": ["*.tar", "*.tar.gz"],
"optional": false
},
{
"name": "metadata_tar",
"label": "Metadata tar file",
"class": "file",
"patterns": ["*metadata.tar", "*metadata.tar.gz"],
"optional": false
},
{
"name": "barcodes_file",
"label": "Barcodes file",
"help": "Text file with one barcode per line: 'ATCGT-GTACT SampleA'",
"class": "file",
"patterns": ["*barcodes.txt"],
"optional": true
},
{
"name": "library_name",
"label": "Library name",
"help": "Used for naming fastq files",
"class": "string",
"default": "LibX",
"optional": false
},
{
"name": "properties",
"label": "Properties",
"help": "Dictionary of additional properties to be attached to all jobs & files",
"class": "hash",
"optional": true
},
{
"name": "tags",
"label": "Tags",
"help": "List of tags to be attached to jobs & files",
"class": "array:string",
"optional": true
},
{
"name": "use_bases_mask",
"label": "--use-bases-mask",
"class": "string",
"optional": true
},
{
"name": "barcode_mismatches",
"label": "--barcode-mismatches",
"help": "Number of barcode mismatches allowed",
"choices": [0, 1, 2],
"class": "int",
"optional": true,
"default": 1
},
{
"name": "fastq_for_index_reads",
"label": "--create-fastq-for-index-reads",
"class": "boolean",
"optional": true,
"default": true
},
{
"name": "ignore_missing_bcls",
"label": "--ignore-missing-bcls",
"help": "Missing or corrupt BCL files are ignored. Assumes 'N'/'#' for missing calls.",
"class": "boolean",
"optional": true,
"default": true
},
{
"name": "ignore_missing_positions",
"label": "--ignore-missing-positions",
"help": "(bcl2fastq2) Missing or corrupt positions files are ignored",
"class": "boolean",
"optional": true,
"default": true
},
{
"name": "ignore_missing_filter",
"label": "--ignore-missing-filter",
"help": "Missing or corrupt filter files are ignored. Assumes Passing Filter for all clusters in tiles where filter files are missing.",
"class": "boolean",
"optional": true,
"default": true
},
{
"name": "with_failed_reads",
"label": "--with-failed-reads",
"class": "boolean",
"optional": true,
"default": false
},
{
"name": "tiles",
"label": "--tiles",
"help": "(1112) Only used in test mode. See --tiles argument of the Illumina tool bcl2fastq. Takes a comma-separated list of regular expressions to select only a subset of the tiles available in the flow-cell.",
"class": "string",
"optional": true
}
],
"outputSpec": [
{
"name": "fastqs",
"label": "Fastq files",
"class": "array:file",
"optional": false
},
{
"name": "lane_html",
"label": "Lane html file",
"class": "file",
"optional": false
},
{
"name": "tools_used",
"label": "Tools used",
"class": "file",
"optional": false
},
{
"name": "sample_sheet",
"label": "Sample sheet",
"class": "file",
"optional": true
}
],
"runSpec": {
"interpreter": "python2.7",
"file": "src/code.py",
"bundledDepends": [],
"execDepends": [],
"systemRequirements": {
"*": {
"instanceType": "mem1_ssd1_x16"
}
},
"distribution": "Ubuntu",
"release": "12.04"
},
"access": {
"allProjects": "ADMINISTER",
"network": ["*"]
},
"authorizedUsers": ["org-scgpm","org-cescg","org-snyder_encode","org-scgpm_pacbio"],
"details": {
"contactEmail": "pbilling@stanford.edu",
"upstreamUrl": "http://support.illumina.com/downloads/bcl2fastq-conversion-software-v217.html",
"upstreamVersion": "2.17.1.14",
"upstreamLicenses": [],
"whatsNew": "* 1.0.0: Initial version"
},
"categories": ["Read Manipulation"],
"developers": ["org-scgpm"]
}
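
Several inputs above are labeled with the bcl2fastq flag they correspond to. As a sketch of how such inputs might be turned into a command line (the function `build_bcl2fastq_command` and its tables are hypothetical, not taken from `src/code.py`), the labels and defaults from `inputSpec` can be mapped like this. The `--runfolder-dir`, `--output-dir` and `--sample-sheet` options come from the bcl2fastq2 guide rather than from this file.

```python
# Inputs that carry a value, mapped to the flag named in their label.
VALUE_FLAGS = (
    ("use_bases_mask", "--use-bases-mask"),
    ("barcode_mismatches", "--barcode-mismatches"),
    ("tiles", "--tiles"),
)

# Boolean inputs that become bare switches when true.
SWITCH_FLAGS = (
    ("fastq_for_index_reads", "--create-fastq-for-index-reads"),
    ("ignore_missing_bcls", "--ignore-missing-bcls"),
    ("ignore_missing_positions", "--ignore-missing-positions"),
    ("ignore_missing_filter", "--ignore-missing-filter"),
    ("with_failed_reads", "--with-failed-reads"),
)

# Defaults copied from the inputSpec entries above.
DEFAULTS = {
    "barcode_mismatches": 1,
    "fastq_for_index_reads": True,
    "ignore_missing_bcls": True,
    "ignore_missing_positions": True,
    "ignore_missing_filter": True,
    "with_failed_reads": False,
}


def build_bcl2fastq_command(inputs, run_dir, output_dir, sample_sheet=None):
    """Return an argument list for bcl2fastq (hypothetical helper)."""
    # User-supplied inputs override the inputSpec defaults.
    params = dict(DEFAULTS)
    params.update(inputs)

    command = ["bcl2fastq",
               "--runfolder-dir", run_dir,
               "--output-dir", output_dir]
    if sample_sheet is not None:
        command += ["--sample-sheet", sample_sheet]
    for name, flag in VALUE_FLAGS:
        if params.get(name) is not None:
            command += [flag, str(params[name])]
    for name, flag in SWITCH_FLAGS:
        if params.get(name):
            command.append(flag)
    return command
```

Returning a list rather than a single string lets the result be passed directly to `subprocess.check_call` without shell quoting.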
Binary file not shown.
Binary file added scgpm_bcl2fastq/resources/usr/local/bin/bcl2fastq
Binary file not shown.
38 changes: 38 additions & 0 deletions scgpm_bcl2fastq/resources/usr/local/share/COPYRIGHT
@@ -0,0 +1,38 @@
BCL to FASTQ file converter
Copyright (c) 2007-2015 Illumina, Inc.

This software is covered by the accompanying EULA, and certain third party copyright/licenses, and any user of this
source file is bound by the terms therein.

The bcl2fastq distribution includes the following code libraries,
and are distributed according to the licensing terms governing each
library.

******************************************************************

CMake - Cross Platform Makefile Generator
Copyright 2000-2009 Kitware, Inc., Insight Software Consortium
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the names of Kitware, Inc., the Insight Software Consortium, nor the names of their contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
(C)2008-09 Copyright Kitware, Inc.

******************************************************************

Boost Software License - Version 1.0 - August 17th, 2003

Permission is hereby granted, free of charge, to any person or organization
obtaining a copy of the software and accompanying documentation covered by
this license (the "Software") to use, reproduce, display, distribute, execute, and transmit the Software, and to prepare derivative works of the Software, and to permit third-parties to whom the Software is furnished to do so, all subject to the following:

The copyright notices in the Software and this entire statement, including the above license grant, this restriction and the following disclaimer, must be included in all copies of the Software, in whole or in part, and all derivative works of the Software, unless such copies or derivative works are solely in the form of machine-executable object code generated by a source language processor.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

******************************************************************

