nbases argument ignored? #2054
Original issue:

The nbases argument in learnErrors is set to 1e8 by default, i.e. 100 million (100,000,000) bases to use for error rate learning. Still, the messages printed during error learning show that the nbases cutoff is never applied as-is: it's always some odd number of bases being used, way over the nbases limit. How is this actually supposed to work?
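For reference, the learnErrors documentation describes nbases as the minimum number of total bases to use for error rate learning: whole samples are read into memory until at least that many bases have been accumulated, which is why the reported total overshoots the cutoff. A minimal call, with hypothetical file paths:

```r
library(dada2)

# Hypothetical paths to per-sample filtered forward reads
filtFs <- list.files("filtered", pattern = "_F_filt.fastq.gz$", full.names = TRUE)

# nbases (default 1e8) acts as a floor, not a cap: whole samples are
# read in until the running total reaches it, so the reported total
# typically lands somewhat above 1e8.
errF <- learnErrors(filtFs, nbases = 1e8, multithread = TRUE)
```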
Comment: Ok, I see. But wouldn't it be desirable if it did? Break samples up, that is, and actually apply the limit it sets out to apply. Otherwise, what's the recommended way of dealing with really large samples? It seems you would have to split the fastq files manually, just for the sake of error rate learning.
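If a single sample alone is far larger than the target, one workaround is to subsample the fastq before error learning. A sketch using Bioconductor's ShortRead package (file names and the read count are hypothetical choices, not a recommendation from this thread):

```r
library(ShortRead)

# Draw a random subset of reads from one very large sample
sampler <- FastqSampler("big_sample_F_filt.fastq.gz", n = 5e5)
subset_reads <- yield(sampler)
close(sampler)

# Write the subsample; error learning can then run on this smaller file
writeFastq(subset_reads, "big_sample_F_filt.sub.fastq.gz")
```

FastqSampler draws a random sample in a single pass over the file, so the subset is not biased toward reads at the start of the file.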
Reply: In some cases yes, in others no. If single samples are larger than the nbases limit …
Comment: Ok. It would be good to have a limit, yes :) I'm curious, though, why it's not desirable when multiple samples are read in. Does the error-learning algorithm take into account which sample each read belongs to? I thought it pooled all the reads it used to learn the error rates.
Reply: The samples themselves are not pooled. The error rates learned from each sample (individually) are averaged across samples, but only in that sense is there "pooling" by default in dada2.
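A conceptual sketch of that scheme, in which per-sample error estimates are combined rather than the raw reads being pooled (estimate_errors and the file names are hypothetical stand-ins, not dada2 internals):

```r
# Hypothetical illustration of "pooling" at the level of error rates:
# each sample yields its own matrix of transition error rates, and the
# matrices are averaged across samples; the reads themselves never mix.
estimate_errors <- function(f) {
  # Dummy stand-in for a per-sample error fit; dada2's real error
  # matrices are 16 transitions x (number of quality scores), e.g. 16 x 41.
  matrix(runif(16 * 41, min = 1e-7, max = 1e-2), nrow = 16)
}

sample_files <- c("s1_filt.fastq.gz", "s2_filt.fastq.gz")  # hypothetical inputs
per_sample_err <- lapply(sample_files, estimate_errors)
avg_err <- Reduce(`+`, per_sample_err) / length(per_sample_err)
```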