truncLenKeep #2020

Open
LukeLikesDirt opened this issue Sep 15, 2024 · 3 comments

Comments

@LukeLikesDirt

Hi Benjamin

Could you please consider implementing a truncLenKeep option in a future version, similar to --fastq_trunclen_keep in VSEARCH? This would be beneficial for quality filtering in the ITS pipeline, where one could remove the reverse complement of the primers to address read-through before filtering, and also truncate poor-quality distal ends for long ITS regions without losing shorter reads.
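
To make the requested behaviour concrete, here is a minimal Python sketch contrasting the two truncation semantics on plain sequence strings. It is an illustration only, not DADA2 or VSEARCH code: the function names `trunc_discard` and `trunc_keep` and the toy reads are hypothetical. The first function mirrors how `truncLen` in DADA2's `filterAndTrim` is documented to behave (reads shorter than the truncation length are discarded); the second mirrors what `--fastq_trunclen_keep` does in VSEARCH (shorter reads are kept as is).

```python
from typing import List


def trunc_discard(reads: List[str], trunc_len: int) -> List[str]:
    """Truncate to trunc_len and drop reads shorter than trunc_len."""
    return [r[:trunc_len] for r in reads if len(r) >= trunc_len]


def trunc_keep(reads: List[str], trunc_len: int) -> List[str]:
    """Truncate reads longer than trunc_len; keep shorter reads unchanged."""
    return [r[:trunc_len] for r in reads]


# Toy data (assumption): one short 80 nt read and one long 240 nt read.
reads = ["ACGT" * 20, "ACGT" * 60]

print([len(r) for r in trunc_discard(reads, 200)])  # [200]
print([len(r) for r in trunc_keep(reads, 200)])     # [80, 200]
```

With a truncation length of 200, the discard variant loses the 80 nt read entirely, whereas the keep variant retains it and only shortens the 240 nt read.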

Have you considered this previously? Am I missing a quality filtering option that could achieve this?

Kind regards
Luke

@benjjneb
Owner

Not missing anything, we don't have an option like this implemented.

Can you clarify the use case for ITS? Is it that you would use an external tool (e.g. cutadapt) to remove reverse primers, and then use truncLenKeep on the cutadapted reads to truncate only the longer remaining reads?

@LukeLikesDirt
Author

Thanks for your reply.

Yes exactly.

Because the ITS1 or ITS2 regions can be as short as 50 bp in some groups, I generally remove the reverse complement of primers to address read-through. However, some groups have ITS1 or ITS2 regions greater than 300 bp in length. For these cases, particularly when distal ends are noisy, it can be beneficial to truncate forward and reverse reads to a length that allows merging after quality filtering and denoising. I find this approach allows more reads to pass quality filtering and denoising. Currently, I use VSEARCH to quality filter and truncate noisy ends while retaining reads representing short ITS1 or ITS2, before denoising with DADA2. To be fair, this isn’t a major issue, but I would prefer the option to do this directly in DADA2 if it were available.
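
As a rough illustration of the workflow described above, the following Python sketch first trims read-through (the reverse complement of the opposite primer appearing near the 3' end of a short amplicon) and then applies a truncate-but-keep step. Everything here is an assumption made for illustration: the primer `ACCGTTGA`, the toy reads, and the helper names are hypothetical, and the primer search is an exact string match, whereas a real tool such as cutadapt tolerates mismatches.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def trim_read_through(read: str, opposite_primer: str) -> str:
    """Cut the read at the first exact match of the opposite primer's
    reverse complement; return the read unchanged if there is no match."""
    idx = read.find(reverse_complement(opposite_primer))
    return read if idx == -1 else read[:idx]


def trunc_keep(read: str, trunc_len: int) -> str:
    """Truncate reads longer than trunc_len; shorter reads pass through."""
    return read[:trunc_len]


REV_PRIMER = "ACCGTTGA"  # hypothetical toy primer, not a real ITS primer

# Short amplicon (60 nt insert) where the read runs into the opposite primer.
short_read = "TTAGGC" * 10 + reverse_complement(REV_PRIMER) + "AGATCGGA"
# Long amplicon (300 nt of insert) with no read-through.
long_read = "TTAGGC" * 50

processed = [
    trunc_keep(trim_read_through(r, REV_PRIMER), 220)
    for r in (short_read, long_read)
]
print([len(r) for r in processed])  # [60, 220]
```

The short read keeps its full 60 nt insert after primer removal, while the long read is cut back to 220 nt so that a noisy tail would not count against it during quality filtering.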

Does this generally seem like a reasonable approach? Am I missing a similar or better way to achieve my goal in DADA2?

Cheers

Luke

@benjjneb
Owner

Linking this to a related issue raised in the GitHub repo for the QIIME2 DADA2 plugin: qiime2/q2-dada2#129 (comment)
