Releases: pdimens/harpy
1.16.1
1.16.0
New
- turns out LEVIATHAN doesn't do any kind of internal deconvolution, so a new shim script was added to the leviathan workflow to deconvolve the `BX` tags in the input alignments based on the [already deconvolved] `MI` tags.
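As a rough illustration of the idea (not the shim script itself), the sketch below rewrites `BX` values using the `MI` values that share a barcode. It works on plain dictionaries instead of real alignments, and the `barcode-N` output format, the record structure, and the function name `deconvolve_bx` are all assumptions made for the example.

```python
def deconvolve_bx(records):
    """Return copies of records whose BX value carries a per-barcode suffix.

    Each record is a dict with optional "BX" and "MI" keys. This is a
    hypothetical stand-in for alignment tags, not Harpy's data model.
    Reads sharing a barcode but assigned to different MI values get
    different suffixes (assumed format: "<barcode>-<n>").
    """
    seen = {}  # barcode -> {MI value -> 1-based suffix}
    out = []
    for rec in records:
        new = dict(rec)
        bx, mi = rec.get("BX"), rec.get("MI")
        if bx is not None and mi is not None:
            suffixes = seen.setdefault(bx, {})
            # first MI seen for this barcode gets 1, the next gets 2, ...
            suffix = suffixes.setdefault(mi, len(suffixes) + 1)
            new["BX"] = f"{bx}-{suffix}"
        out.append(new)
    return out


example = [
    {"BX": "A01C01B01D01", "MI": 7},
    {"BX": "A01C01B01D01", "MI": 9},
    {"BX": "A01C01B01D01", "MI": 7},
    {"name": "read_without_tags"},
]
deconvolved = deconvolve_bx(example)
```

Records without both tags pass through unchanged, which is one reasonable choice for unbarcoded reads.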
Breaking
- this has been a long time coming, but `--conda` was swapped for `--container`, meaning `conda` workflow dependency handling is the default now and you can opt-in to using the container-based method
- `.conda` has been renamed `.environments` and will now house the conda environments and/or singularity container to simplify where harpy stores the software deps
1.15.0
New
- Quarto has replaced RMarkdown/Flexdashboard
- no changes for the user to worry about, but the reports will look a little different
- NXX plots for phasing report
- Introduced new scripts for development installation using Conda and Pixi
- Harpy's printing to console during runtime is sleeker now
Internal
- Streamlined Snakemake command execution for the different workflows
- Improved logging and error handling in various modules
- `molecule_coverage.py` now uses a sqlite3 backend, which dramatically reduces the amount of RAM required
- Refactored a few Snakemake workflow files
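To illustrate why a sqlite3 backend can lower memory use, here is a minimal sketch. It is not the actual `molecule_coverage.py` code; the table name, column names, and the function `molecule_spans` are hypothetical. Rows are streamed into a database and aggregated with SQL, so Python never holds every alignment in memory at once.

```python
import sqlite3


def molecule_spans(rows, db_path=":memory:"):
    """Collapse (contig, molecule_id, start, end) rows into one span per molecule.

    Pass a file path as db_path to keep the table on disk; ":memory:" is
    only the default so that this example leaves no files behind.
    """
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS aln "
        "(contig TEXT, mi INTEGER, aln_start INTEGER, aln_end INTEGER)"
    )
    # executemany consumes any iterable, so rows can be a generator
    con.executemany("INSERT INTO aln VALUES (?, ?, ?, ?)", rows)
    con.commit()
    query = """
        SELECT contig, mi, MIN(aln_start), MAX(aln_end)
        FROM aln
        GROUP BY contig, mi
        ORDER BY contig, mi
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


spans = molecule_spans([
    ("chr1", 1, 100, 200),
    ("chr1", 1, 150, 400),
    ("chr1", 2, 50, 80),
])
```

The returned list has one row per molecule, which is typically far smaller than the per-alignment input.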
Bug Fixes
- Small bug reporting the wrong value for one of the valueboxes
1.14.3
Bugs fixed
- restored the haplotag barcode script that went missing after squashing commits, which broke demuxing
Changed
- added rule priority for some workflows so they prioritize creating the output files over calculating metrics and writing reports
- this means that, for example, `align bwa` will prioritize creating all the output bam files, rather than running a single sample through everything
Full Changelog: 1.14.2...1.14.3
1.14.2
1.14.1
Never too proud to admit I was wrong. I didn't want `downsample` to be a Snakemake workflow, but with the increased complexity of what I wanted it to do, I found myself writing an increasingly complex Python script that was essentially doing all the stuff Snakemake was doing. So:
New
- Introduced a command-line utility for extracting barcodes from SAM/BAM files
- Enhanced phasing statistics reporting with new metrics (N50, N75, N90)
- `LRez` is now part of the main Harpy installation and accessible to the user
- adapter removal in the `qc` module accepts an argument now, one of:
  - `auto` for automatic adapter detection
  - a FASTA file of adapters
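For readers unfamiliar with the N50, N75, and N90 metrics mentioned in the list above, the sketch below computes them using the usual definition: the largest length L such that items of length at least L cover the given percentage of the total. That the report applies this definition to phase block lengths is an assumption, and `nxx` is a hypothetical helper name.

```python
def nxx(lengths, xx=50):
    """Return the NXX value for a collection of lengths.

    Sort lengths from longest to shortest and walk down until the running
    total reaches xx percent of the overall sum; the length at which that
    happens is the NXX.
    """
    if not lengths:
        return 0
    threshold = sum(lengths) * xx / 100
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return 0


block_lengths = [2, 2, 2, 3, 3, 4, 8, 8]  # made-up example values
n50 = nxx(block_lengths, 50)
n75 = nxx(block_lengths, 75)
n90 = nxx(block_lengths, 90)
```

With these made-up lengths (total 32), the two blocks of length 8 already cover half the total, so N50 is 8, while N75 and N90 fall to 3 and 2.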
Changed
- Downsampling is now a snakemake workflow
- `downsample` handles invalids in a much more intuitive (and sensible) way
Full Changelog: 1.14...1.14.1
1.14
New
- added a convenience script `separate_singletons` to split a bam file into singletons and nonsingletons
- `harpy downsample` module to downsample FASTQ/BAM by barcodes
Breaking changes
- singletons are now calculated such that both reads of a read pair count as only "one read" for a barcode
- which means unpaired reads now contribute properly to this value
- overall, this is a more accurate way of calculating this metric
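The counting rule above can be sketched as follows. This is a simplified illustration, not Harpy's implementation: it assumes each record is reduced to a `(read_name, barcode)` tuple and that both mates of a pair share a read name, and `count_singletons` is a hypothetical function name.

```python
from collections import defaultdict


def count_singletons(records):
    """Count barcodes that are represented by exactly one read (or read pair).

    records: iterable of (read_name, barcode) tuples, one per sequenced read.
    Mates of a pair share a read name, so collecting names in a set makes a
    pair count once, and an unpaired read also counts once.
    """
    names_per_barcode = defaultdict(set)
    for read_name, barcode in records:
        names_per_barcode[barcode].add(read_name)
    return sum(1 for names in names_per_barcode.values() if len(names) == 1)


records = [
    ("read1", "BX_a"), ("read1", "BX_a"),  # one pair: singleton
    ("read2", "BX_b"), ("read3", "BX_b"),  # two different reads: not a singleton
    ("read4", "BX_c"),                     # unpaired read: singleton
]
n_singletons = count_singletons(records)
```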
Fixes
- `separate_validbx` has a usage change, which is breaking; however, this script is not used by any of the workflows, so there should be no appreciable difference
- alignment reports have text that clarifies which math is for non-singletons
- `multiplex` reads (aka reads that aren't linked-read singletons) are now just referred to as `non-singletons`
1.13
New Features
- new `view` command to view workflow log, snakefile, or configuration file.
- conda environment recipes are now stored in `outdir/workflow/envs` for more self-contained workflow directories
  - also improves workflow-specific troubleshooting
Breaking Changes
- `stitchparams` has been renamed `imputeparams`
Internal
- improved handling of conda environments across various commands, allowing for better configuration and dependency management.
- Updated environment directory paths for better organization and clarity across all workflows
- local simuG replaced with conda installation
  - Removed dependency on the `simuG.pl` script for several simulation workflows, streamlining the execution process
- rename rules and better directory structure for `simulate variants`
Bug Fixes
- Improved regular expression handling in file processing to enhance clarity and prevent issues.
- Corrected typos in `align_stats.Rmd` and routines for handling no valid barcodes
Issues and PRs
- add harpy view by @pdimens in #166
- rebase with harpy view by @pdimens in #167
- swap simuG to conda-based install by @pdimens in #168
Full Changelog: 1.12...1.13
1.12
What's new (and important)
- `simulate linkedreads` now supports and defaults to haplotagging barcodes
- 84 million barcode options instead of 14m
- support for barcodes of any length, not just the 10X 16bp
- barcode sequencing error has been removed because you're ultimately interested in the linked read data, not the sequencing nuances
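As a quick check on the "84 million" figure, the sketch below enumerates barcode labels under the assumption that haplotagging barcodes combine four segments (written here in `AxxCxxBxxDxx` order) with 96 options each. The segment layout and the generator name are assumptions for illustration; this is not what Harpy itself runs.

```python
from itertools import islice, product

SEGMENTS = ("A", "C", "B", "D")  # assumed segment order in the label
N_PER_SEGMENT = 96               # assumed number of options per segment


def haplotag_barcode_labels():
    """Lazily yield labels such as 'A01C01B01D01' without storing them all."""
    numbers = [f"{i:02d}" for i in range(1, N_PER_SEGMENT + 1)]
    for combo in product(numbers, repeat=len(SEGMENTS)):
        yield "".join(seg + num for seg, num in zip(SEGMENTS, combo))


total_barcodes = N_PER_SEGMENT ** len(SEGMENTS)
first_three = list(islice(haplotag_barcode_labels(), 3))
```

Under these assumptions the total is 96 to the fourth power, or 84,934,656 combinations, consistent with the figure quoted above.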
Internal
- `HaploSim.pl` (formerly LRSIM_harpy.pl) focuses solely on creating linked reads from provided haplotypes
- output names for `simulate linkedreads` more flexible now
- leveraged parameters better in `HaploSim.pl`
- Added `haplotag_barcodes.py` to auto-generate haplotag barcodes
- inline to haplotagging conversion uses memory-efficient in-memory sqlite3 database
- barcode validations for `align ema` and `simulate linkedreads`
Bugs fixed
- [simulate linkedreads] barcode key generated as a fixed keymap, ensuring barcodes have same haplotag code between different haplotypes
What's Changed
- haplotagging barcodes as default by @pdimens in #162
- better sim demux support by @pdimens in #163
- fix param call by @pdimens in #164
Full Changelog: 1.11...1.12
1.11
New Features
- [sv leviathan] now also makes BX tags unique when concatenating population groups
  - provided as `--bx` option to `concatenate_bam.py`
- new standalone script `deconvolve_alignments.py` that does the same thing as `assign_mi.py`, but also deconvolves the `BX` tag into hyphenated form
Fixes
- R logic for properly parsing new `--contigs` option #160
Improvements
- LOTS more guardrails with respect to validations and error handling
- Simplified logic for file type validation and tag management in scripts
- Enhanced error handling for missing input files across multiple scripts
- Updated argument parser configurations for improved user guidance and error handling
- Streamlined output methods across multiple scripts for consistency
PRs
Full Changelog: 1.10.1...1.11