do not use query_table for variant id queries #4518

Merged
hanars merged 2 commits into dev on Dec 6, 2024

hanars
Collaborator

@hanars hanars commented Dec 5, 2024

When building the VLM, I discovered that the approach we were using to look up variants in the Hail backend is substantially slower than it needs to be. That approach creates a key table of locus-alleles and then uses query_table to read in only the relevant rows. Reading the table in directly, with the intervals set to the one-base-pair interval of the variant itself, and then filtering for matching alleles is considerably faster. Testing on a GRCh37 lookup, the time went from 32 seconds to 16 seconds, and GRCh38 is actually a bit slower than GRCh37.

The query_table approach is still needed for SV lookup, as SV tables are not keyed by locus.
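
For readers unfamiliar with the pattern, here is a minimal, library-free sketch of the "narrow to a one-base-pair interval, then filter on alleles" lookup described above. It is not the code in this PR and does not call Hail: the `Row` type, the `ROWS` data, and the `lookup_variant` helper are all hypothetical, and a sorted Python list stands in for a table keyed by locus.

```python
from bisect import bisect_left, bisect_right
from typing import NamedTuple, Optional, Tuple


class Row(NamedTuple):
    """Hypothetical stand-in for one row of a locus-keyed variant table."""
    contig: str
    position: int
    alleles: Tuple[str, str]
    info: str


# Made-up rows, sorted by (contig, position) the way a locus-keyed table is ordered.
ROWS = [
    Row("1", 100, ("A", "T"), "first"),
    Row("1", 100, ("A", "G"), "second"),
    Row("1", 101, ("C", "T"), "third"),
    Row("2", 100, ("A", "T"), "fourth"),
]
# Parallel list of keys so bisect can locate the interval without scanning.
LOCI = [(row.contig, row.position) for row in ROWS]


def lookup_variant(contig: str, position: int, ref: str, alt: str) -> Optional[Row]:
    # Step 1: restrict the read to the one-base-pair interval [position, position].
    start = bisect_left(LOCI, (contig, position))
    end = bisect_right(LOCI, (contig, position))
    # Step 2: among the few rows at that locus, keep the one whose alleles match.
    for row in ROWS[start:end]:
        if row.alleles == (ref, alt):
            return row
    return None
```

Only the rows at the requested locus are ever inspected, which is the property the interval-restricted read relies on. This also shows why the same trick does not carry over to SV tables: without a locus key there is no sorted position to narrow on.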

@hanars hanars merged commit 2e6cf15 into dev Dec 6, 2024
6 checks passed