Add sylph/profile module #7118

Open
wants to merge 28 commits into
base: master

sofstam (Contributor) commented Nov 28, 2024

PR checklist

Moves the local module sylph/profile to nf-core modules. Closes nf-core/seqinspector#65
Removes unnecessary pattern from sylph/sketch meta file.

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

famosab (Contributor) left a comment

Suggested to be a bit more specific about the created csv :)

"versions.yml:md5,7b5a545483277cc0ff9189f8891e737f"
],
[
false

Contributor

Is it expected that the contains check always leads to the string not being contained?

Contributor Author

The string should be contained. Looking into it.
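
One possible explanation, sketched in plain Groovy: `readLines()` returns a list of lines, and `contains` on a list checks whether a whole element equals the argument, not whether the argument occurs inside a line. The sample lines below are invented for illustration and are not real sylph output; whether this is the actual cause in this PR is an assumption.

```groovy
// Hypothetical lines standing in for the profile output (not real sylph results).
def lines = [
    "Sample_file\tGenome_file\tContig_name",
    "reads.fastq.gz\tgenome.fasta\tExample virus isolate X, complete genome"
]

// List.contains compares whole elements, so this is false unless some line
// is exactly the string "complete genome".
def wholeLineMatch = lines.contains("complete genome")

// String.contains applied to each line performs the substring check.
def substringMatch = lines.any { it.contains("complete genome") }

println "whole-line match: ${wholeLineMatch}, substring match: ${substringMatch}"
```

Under that assumption, checking each line with `any { ... }` (or checking the file text as a single string) would give the intended result.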

modules/nf-core/sylph/profile/tests/main.nf.test (outdated)
{ assert process.success },
{ assert snapshot(
process.out.versions,
process.out.profile_out.collect { file(it[1]).readLines().contains("complete genome") },

Contributor

Maybe you can assert on the file a bit more precisely using nft-csv: https://github.com/lukfor/nft-csv

Contributor Author

Do you mean to count the columns of the output file?

Contributor

Anything that makes sense: counting columns or asserting the column header both seem fine to me.

Contributor Author

Updated this now.
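
The two checks discussed in this thread (column header and column count) can be sketched in plain Groovy as follows. This does not use the nft-csv plugin or show its API; the TSV content and the shortened header are hypothetical and serve only to make the sketch self-contained.

```groovy
// Hypothetical TSV content; the header is shortened for illustration.
def tsv = "Sample_file\tGenome_file\tTaxonomic_abundance\n" +
          "reads.fastq.gz\tgenome.fasta\t100.0\n"

// Split every line into its tab-separated fields (-1 keeps trailing empty fields).
def rows = tsv.readLines().collect { it.split("\t", -1) as List }
def header = rows[0]
def records = rows.drop(1)

// Assert on structure rather than on exact numeric values.
assert header == ["Sample_file", "Genome_file", "Taxonomic_abundance"]
assert records.size() >= 1
assert records.every { it.size() == header.size() }

println "columns: ${header.size()}, records: ${records.size()}"
```

Structural assertions like these stay stable even when abundance values vary slightly between runs or tool versions.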

SPPearce (Contributor) commented Jan 8, 2025

@sofstam, do you need a hand to get this finished off?

sofstam (Contributor, Author) commented Jan 8, 2025

@SPPearce Was on holidays, looking at the comments now :)

sofstam requested a review from famosab on January 16, 2025 15:30
Comment on lines 27 to 30
{ assert snapshot(process.out.versions).match("versions_single") },
{ assert output_content.size() > 1 }, // Ensure there's at least a header and one data line
{ assert output_content[0] == "Sample_file\tGenome_file\tTaxonomic_abundance\tSequence_abundance\tAdjusted_ANI\tEff_cov\tANI_5-95_percentile\tEff_lambda\tLambda_5-95_percentile\tMedian_cov\tMean_cov_geq1\tContainment_ind\tNaive_ANI\tkmers_reassigned\tContig_name" },
{ assert snapshot(output_content.take(5).join("\n")).match("profile_out_content_single") } // Snapshot first 5 lines

Contributor

Can probably just do this. You don't need to name the snapshots if you only have one per test.

Suggested change
- { assert snapshot(process.out.versions).match("versions_single") },
- { assert output_content.size() > 1 }, // Ensure there's at least a header and one data line
- { assert output_content[0] == "Sample_file\tGenome_file\tTaxonomic_abundance\tSequence_abundance\tAdjusted_ANI\tEff_cov\tANI_5-95_percentile\tEff_lambda\tLambda_5-95_percentile\tMedian_cov\tMean_cov_geq1\tContainment_ind\tNaive_ANI\tkmers_reassigned\tContig_name" },
- { assert snapshot(output_content.take(5).join("\n")).match("profile_out_content_single") } // Snapshot first 5 lines
+ { assert snapshot(
+     process.out.versions,
+     output_content.take(5).join("\n")
+ ).match() },
+ { assert output_content.size() > 1 }, // Ensure there's at least a header and one data line
+ { assert output_content[0] == "Sample_file\tGenome_file\tTaxonomic_abundance\tSequence_abundance\tAdjusted_ANI\tEff_cov\tANI_5-95_percentile\tEff_lambda\tLambda_5-95_percentile\tMedian_cov\tMean_cov_geq1\tContainment_ind\tNaive_ANI\tkmers_reassigned\tContig_name" }

Or you could probably do:

Suggested change
- { assert snapshot(process.out.versions).match("versions_single") },
- { assert output_content.size() > 1 }, // Ensure there's at least a header and one data line
- { assert output_content[0] == "Sample_file\tGenome_file\tTaxonomic_abundance\tSequence_abundance\tAdjusted_ANI\tEff_cov\tANI_5-95_percentile\tEff_lambda\tLambda_5-95_percentile\tMedian_cov\tMean_cov_geq1\tContainment_ind\tNaive_ANI\tkmers_reassigned\tContig_name" },
- { assert snapshot(output_content.take(5).join("\n")).match("profile_out_content_single") } // Snapshot first 5 lines
+ { assert snapshot(
+     process.out.versions,
+     file(output_content).readLines()[0..4]
+ ).match() }
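
The two suggestions select the first five lines in different ways, `take(5)` versus the range `[0..4]`. A minimal plain-Groovy sketch of the difference follows; the list contents are hypothetical. To my understanding, an inclusive range such as `[0..4]` can fail when the list holds fewer than five elements, which is why the range is guarded by the list size here.

```groovy
// Hypothetical output lines: a header plus two records.
def outputLines = ["header", "record 1", "record 2"]

// take(5) returns at most five elements and is safe on shorter lists.
def firstFive = outputLines.take(5)

// An inclusive range needs its indices to exist, so cap the upper bound
// at the last valid index (this assumes a non-empty list).
def lastIndex = Math.min(4, outputLines.size() - 1)
def firstFiveByRange = outputLines[0..lastIndex]

println firstFive.join("\n")
```

If the output is guaranteed to have at least five lines, both forms select the same lines.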

process {
"""
input[0] = [ [ id:'test', single_end:true ], // meta map
[ file(params.test_data['sarscov2']['illumina']['test_1_fastq_gz'], checkIfExists: true) ]

Contributor

Can you swap these to the more recent file paths, please?
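
As a rough sketch of what the newer path style may look like: it is assumed here that "more recent file paths" refers to building the path from a `modules_testdata_base_path` parameter plus a relative path into the test-datasets repository. The base URL and relative path below are assumptions for illustration, defined locally so the snippet runs on its own.

```groovy
// Assumption: in nf-core/modules the base path comes from the test
// configuration; it is defined here only to keep the sketch self-contained.
def params = [
    modules_testdata_base_path: 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/'
]

// Older style, as in the snippet under review:
//   params.test_data['sarscov2']['illumina']['test_1_fastq_gz']
// Newer style (assumed): base path plus a relative path to the file.
def fastq = params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz'

println fastq
```

Inside an nf-test `process` block, the resulting string would be wrapped as `file(..., checkIfExists: true)`, as in the original input definition.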

SPPearce (Contributor)

One of my comments may have been related to some previous code; I can't see the comment on my phone.

  - - pre_sketched_files:
        type: file
        description: Pre-sketched *.syldb/*.sylsp files. Raw single-end fastq/fasta are allowed and will be automatically sketched to .sylsp/.syldb.
        pattern: "*.{syldb,sylsp}"

Member

Update this pattern too.

        description: |
          List of input FastQ/FASTA files of size 1 and 2 for single-end and paired-end data,
          respectively. They are automatically sketched to .sylsp/.syldb
  - - pre_sketched_files:

Member

I would suggest making this variable name more general, given that it could be FASTA. Something like database (since the manual uses the same term too).

assertAll(
{ assert process.success },
{ assert snapshot(process.out.versions).match("versions_single") },
{ assert output_content.size() > 1 }

Member

Is there not a specific file or contents to test for here?

sofstam and others added 3 commits January 17, 2025 10:06
Co-authored-by: Simon Pearce <24893913+SPPearce@users.noreply.github.com>
Co-authored-by: Mahesh Binzer-Panchal <mahesh.binzer-panchal@nbis.se>