KeyError in format_pileup.py #110

Open
dfornika opened this issue Feb 13, 2024 · 3 comments

@dfornika
Contributor

We've seen an error in format_pileup.py where a KeyError can be triggered here:

Traceback (most recent call last):
  File "ncov-tools/workflow/rules/../scripts/format_pileup.py", line 23, in <module>
    freqs[b] += 1
KeyError: 'N'

...because the dict isn't initialized with an N key:

freqs = {'A': 0, 'T': 0, 'G': 0, 'C': 0, '-': 0, 'R': 0 }
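
For anyone who wants to reproduce this without running the whole workflow, here is a minimal sketch. Only the dict literal and the `freqs[b] += 1` line come from the script; the `count_bases` wrapper and the input list are assumptions made up for illustration, not the script's actual structure.

```python
def count_bases(bases):
    """Tally bases the way the failing line does (hypothetical wrapper)."""
    # Dict literal as it appears in the script: there is no 'N' key.
    freqs = {'A': 0, 'T': 0, 'G': 0, 'C': 0, '-': 0, 'R': 0}
    for b in bases:
        # Raises KeyError for any symbol that is not a key in freqs.
        freqs[b] += 1
    return freqs


missing_key = None
try:
    # Made-up input: two ordinary bases followed by an N.
    count_bases(['A', 'C', 'N'])
except KeyError as err:
    missing_key = err.args[0]
    print(f"KeyError: {missing_key!r}")
```

This should print `KeyError: 'N'`, matching the last line of the traceback above.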

@dfornika
Contributor Author

dfornika commented Feb 13, 2024

The error appears to occur when a sample includes reads that have N bases. We ran into this issue when setting up some test data for a CI pipeline.

https://github.com/BCCDC-PHL/ncov-tools-nf/tree/9d10033fee0fb0fa75d4cc8d8c7ddedc51522e24/.github/data/fastqs

Sample SRR27503680-2-25x from the link above (derived from SRA sample SRR27503680) includes reads like this:

ATAACAACTTCTGTGGCCCTGATGGCTACCCTCTTGAGTGCATTAAAGACCTTCTAGCACGTGCTGGTANNNN

...which trigger the error.
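
To check whether other test data contains reads like this, something along these lines might help. It is a rough sketch that assumes plain four-line FASTQ records already read into a list of lines; `reads_with_n`, the read header, and the quality string are made up for illustration, and only the sequence is taken from the sample above.

```python
def reads_with_n(fastq_lines):
    """Yield sequences that contain N, assuming 4-line FASTQ records."""
    for i, line in enumerate(fastq_lines):
        # In a 4-line record the sequence is the second line (index 1).
        if i % 4 == 1 and 'N' in line.upper():
            yield line.strip()


# Sequence copied from the read shown above.
seq = "ATAACAACTTCTGTGGCCCTGATGGCTACCCTCTTGAGTGCATTAAAGACCTTCTAGCACGTGCTGGTANNNN"
record = [
    "@example_read_1\n",        # hypothetical header
    seq + "\n",
    "+\n",
    "I" * len(seq) + "\n",      # placeholder quality string
]

flagged = list(reads_with_n(record))
print(len(flagged), "read(s) containing N")
```

Gzipped or multi-line FASTQ files would need to be decompressed or normalized first; this sketch does not handle them.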

@rdeborja
Collaborator

Added N key in dictionary:

freqs = {'A': 0, 'T': 0, 'G': 0, 'C': 0, '-': 0, 'R': 0, 'N': 0 }
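
Adding the key covers N, though any other symbol missing from the dict would still raise in the same way. As a possible alternative (not what the branch does), a `collections.Counter` seeded with the same keys would tolerate unexpected symbols. `count_bases_safe`, `EXPECTED_KEYS`, and the input list below are hypothetical names and data used only for this sketch.

```python
from collections import Counter

# Keys from the updated dict literal, including the new 'N'.
EXPECTED_KEYS = ('A', 'T', 'G', 'C', '-', 'R', 'N')


def count_bases_safe(bases):
    """Tally bases without raising on symbols outside EXPECTED_KEYS."""
    # Seed with zeros so the expected keys are always present in the result.
    freqs = Counter({key: 0 for key in EXPECTED_KEYS})
    for b in bases:
        # Counter returns 0 for a missing key, so += never raises KeyError.
        freqs[b] += 1
    return freqs


# Made-up input with an N and one symbol ('Y') that is not in the dict.
result = count_bases_safe(['A', 'A', 'N', 'Y'])
print(result['A'], result['N'], result['Y'], result['T'])
```

Whether silently counting unexpected symbols is desirable depends on how the downstream code reads `freqs`, so the explicit-key fix may well be the better fit here.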

See branch fix/metadata-na. If you're happy with the results I'll merge the code and do a release.

@hgibling

For what it's worth, the team I work with encountered this issue a while back, made the same change, and was able to run ncov-tools on the problematic sample with this solution.
