-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
show RNAseq type in sample hover #4171
Conversation
jklugherz
commented
Jun 17, 2024
seqr/views/apis/family_api.py
Outdated
@@ -43,7 +43,20 @@ def family_page_data(request, family_guid): | |||
has_case_review_perm = has_case_review_permissions(project, request.user) | |||
|
|||
sample_models = Sample.objects.filter(individual__family=family) | |||
samples = get_json_for_samples(sample_models, project_guid=project.guid, family_guid=family_guid, skip_nested=True, is_analyst=is_analyst) | |||
additional_values = { | |||
'rnaSeqTpm': Case(When(Exists(RnaSeqTpm.objects.filter(sample_id=OuterRef('pk'))), then=Value('TPM')), default=None), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you should be able to do this using the related field on the sample model without having to write query objects with outer refs - something like When(rnaseqtpm__isnull=False)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally had the __isnull
syntax but the query was timing out with more than 1 rnaseq type. I could add indexes on the foreign keys to the rnaseq tables on the sample model?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be indices on the foreign keys already, so that sounds like a case where what django is doing is not what we want. You could try using the debug toolbar to see the actual executed SQL and figure out why its so inefficient and adjust it, but honestly that feels like overkill and its probably better not to try to use those fields in this case
That said, if you are going to make 3 distinct queries to each of the subtypes and then iterate through all the sample models and add annotations to them, using the additional_values
does not really save us any compute and just makes the whole thing less readable in the end. In that case, you are better off with getting lists of sample Ids for each type and then adding to rnaSeqTypes
based on that, something like
rna_type_samples = {
'TPM': set(RnaSeqTpm.objects.filter(sample__in= sample_models).values_list('sample', flat=True),
'Expression Outlier': set(RnaSeqOutlier.objects.filter(...
'Splice Outlier': set(RnaSeqSpliceOutlier.objects.filter(...
}
...
sample['rnaSeqTypes'] = [type for type, sample_ids in rna_type_samples.items() if sample['id'] in sample_ids]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented it this way ^ which does not require changes to the JS code as you suggested here.
Is there a general rule for what belongs in additional_values and when it's better to query related tables separately?
@@ -169,7 +169,7 @@ const DataDetails = React.memo(({ loadedSamples, individual, mmeSubmission }) => | |||
/> | |||
) : <MmeStatusLabel title="Submitted to MME" dateField="lastModifiedDate" color="violet" individual={individual} mmeSubmission={mmeSubmission} /> | |||
)} | |||
{individual.hasRnaOutlierData && ( | |||
{loadedSamples.some(sample => sample.rnaSeqTypes?.length > 0) && ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hasRnaOutlierData
was based on whether there was rnaseqoutlier
or rnaseqspliceoutlier
data, for samples with only tpm data this would have been false, so we should update the behavior here to reflect that. Also it only considered active samples
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I adjusted this condition check to match the behavior to show the link only if the sample is active and has either rnaseqoutlier or rnaseqspliceoutlier data
`data was${isOutdated ? ' previously ' : ''} ${hoverDetails ? `${hoverDetails} on ${new Date(loadedSample.loadedDate).toLocaleDateString()}` : 'loaded'}` : | ||
'no data available'} | ||
<VerticalSpacer height={5} /> | ||
{loadedSample.sampleType && loadedSample.sampleType === SAMPLE_TYPE_RNA && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rnaSeqTypes
will only have values for RNA samples so you can remove this part of the condition check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this part of the check some old(?) rna samples which do not have any matches in the 3 rnaseqtype tables will display like this:
There are 76 of these samples all created on 2016-12-05
WITH sample_rnaseq as (
select s.guid, s.id, s.created_date,
seqr_rnaseqtpm.id as tpm_id, seqr_rnaseqoutlier.id as outlier_id, sr.id as splice_outlier_id
from seqr_sample s
left join seqr_rnaseqtpm on s.id = seqr_rnaseqtpm.sample_id
left join seqr_rnaseqoutlier on s.id = seqr_rnaseqoutlier.sample_id
left join public.seqr_rnaseqspliceoutlier sr on s.id = sr.sample_id
where s.sample_type in ('RNA')
)
SELECT * FROM sample_rnaseq
WHERE tpm_id is null
and outlier_id is null
and splice_outlier_id is null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they're all probably from the R0246_rna_seq_bc project, I've been using that for testing since I was certain it would have rna samples given the name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh we should only be showing this for active samples, not inactive sample