-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Error tolerance of cutadapt should also be a parameter #6
Comments
Hi Clara - one of the graduate students in my lab, Glade, found that one of the keys for him was setting the TruncQ parameter higher than the default. While this may not seem critical, his reasoning was that with ITS seqs there is a long, low quality tail that can cause reads to get rejected later. If you are more aggressive with trimming early that actually saves more down the line. Here are the settings he found worked best:
out <- filterAndTrim(cutFs, filtFs, cutRs, filtRs, maxN = 0, maxEE = c(2, 2),
truncQ = 9, minLen = 50, rm.phix = TRUE, compress = TRUE, multithread = TRUE)
Best,
Kabir
On Mar 26, 2020, at 9:00 PM, Clara Qin <notifications@github.com<mailto:notifications@github.com>> wrote:
Currently, error tolerance is assumed to be default -e 0.1. Changing the error tolerance could allow more primer sequences to be identified (and thus removed) by cutadapt.
See https://cutadapt.readthedocs.io/en/stable/guide.html#error-tolerance
To be fair, this might not be the most consequential parameter. Error tolerance of 0.1 corresponds to maximum mismatch of 2 bases. Allowing up to 3 mismatched bases would have removed an additional 138 reverse-primer matches from the reverse reads – approximately 10% of all the matches (with max.mismatch=3) in the pre-cutadapt reverse reads.
runB69PP, first R1 sample
Forward Complement Reverse RevComp
FWDPrimer.R1.reads 423 0 0 3
FWDPrimer.R2.reads 26 0 0 13
REVPrimer.R1.reads 647 0 0 298
REVPrimer.R2.reads 737 0 0 4
runB69PP, first R1 sample, setting max.mismatch = 2
Forward Complement Reverse RevComp
FWDPrimer.R1.reads 731 0 0 4
FWDPrimer.R2.reads 31 0 0 314
REVPrimer.R1.reads 1518 0 0 807
REVPrimer.R2.reads 1196 0 0 13
runB69PP, first R1 sample, setting max.mismatch = 3
Forward Complement Reverse RevComp
FWDPrimer.R1.reads 757 0 0 4
FWDPrimer.R2.reads 31 0 0 343
REVPrimer.R1.reads 1528 0 0 944
REVPrimer.R2.reads 1342 0 0 19
runB69PP, post cutadapt (1 of 2), first R1 sample
Forward Complement Reverse RevComp
FWDPrimer.R1.reads 0 0 0 3
FWDPrimer.R2.reads 26 0 0 0
REVPrimer.R1.reads 646 0 0 0
REVPrimer.R2.reads 0 0 0 4
runB69PP, post cutadapt (1 of 2), first R1 sample, setting max.mismatch = 2
Forward Complement Reverse RevComp
FWDPrimer.R1.reads 0 0 0 4
FWDPrimer.R2.reads 31 0 0 0
REVPrimer.R1.reads 1509 0 0 0
REVPrimer.R2.reads 0 0 0 13
runB69PP, post cutadapt (1 of 2), first R1 sample, setting max.mismatch = 3
Forward Complement Reverse RevComp
FWDPrimer.R1.reads 1 0 0 4
FWDPrimer.R2.reads 31 0 0 5
REVPrimer.R1.reads 1519 0 0 132
REVPrimer.R2.reads 138 0 0 19
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub<#6>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AC5DJWWSQ6BBRMX2JX7C6QLRJQQFFANCNFSM4LUW33NQ>.
Kabir Peay
Associate Professor
Dept. of Biology
Stanford University
(650) 723-0552
|
Hi Kabir, If I'm not mistaken, I think you're bringing up a separate issue, so I've just opened up a new issue (#7) and tagged you in it. Best, |
Just a note that while I haven't made the cutadapt error tolerance into another parameter, I did modify I'm actually not sure whether it would be a good idea to make this into a parameter, so I'll leave this issue open until we can reach more of a consensus. |
Currently, error tolerance is assumed to be default
-e 0.1
. Changing the error tolerance could allow more primer sequences to be identified (and thus removed) bycutadapt
.See https://cutadapt.readthedocs.io/en/stable/guide.html#error-tolerance
To be fair, this might not be the most consequential parameter. Error tolerance of 0.1 corresponds to maximum mismatch of 2 bases. Allowing up to 3 mismatched bases would have removed an additional 138 reverse-primer matches from the reverse reads – approximately 10% of all the matches (with
max.mismatch=3
) in the pre-cutadapt reverse reads.The text was updated successfully, but these errors were encountered: