rnaseq_cgl_pipeline.py check that unzip is installed #203

Closed
Jeltje opened this issue Mar 21, 2016 · 16 comments

@Jeltje
Contributor

Jeltje commented Mar 21, 2016

It appears that the code does not check whether the unzip program is installed, which leads to a fairly incomprehensible "No such file or directory" error from mapsplice.
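A minimal sketch of the kind of preflight check being asked for, not code from the pipeline itself. The function names `find_missing_programs` and `require_programs` are hypothetical, and the list of programs to check is an assumption. It relies on `shutil.which`, which needs Python 3.3 or later; on Python 2, `distutils.spawn.find_executable` plays a similar role.

```python
import shutil


def find_missing_programs(programs):
    """Return the names in `programs` that cannot be found on PATH."""
    return [name for name in programs if shutil.which(name) is None]


def require_programs(programs):
    """Raise a readable error if any required external program is missing.

    Hypothetical helper: which tools a given pipeline needs is an
    assumption made by the caller, not something read from the pipeline.
    """
    missing = find_missing_programs(programs)
    if missing:
        raise RuntimeError(
            "Required program(s) not found on PATH: " + ", ".join(missing)
        )
```

Calling `require_programs(["unzip"])` at the start of a run would fail early with a message that names the missing tool. On a cluster the check would presumably have to run on the workers, since that is where the external tools are invoked.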

@jvivian
Collaborator

jvivian commented Mar 21, 2016

This pipeline does not use unzip or mapsplice.

@Jeltje
Contributor Author

Jeltje commented Mar 21, 2016

Well, it crashes with that error.

@jvivian
Collaborator

jvivian commented Mar 21, 2016

Can you post a stacktrace?

@Jeltje
Contributor Author

Jeltje commented Mar 21, 2016

Yeah of course, I'll rerun it. Had to terminate the cluster last night because it turns out you can't stop a spot cluster...

@Jeltje
Contributor Author

Jeltje commented Mar 21, 2016

Can't reproduce the error!
However, the pipeline now runs in full with no WARNING or ERROR messages, but the expected output file (SRR1559172.tar.gz) does not appear in s3:cgl-driver-projects/test/rna-test

Log file here

@jvivian
Collaborator

jvivian commented Mar 21, 2016

Can you post the contents of the launch script?

@Jeltje
Contributor Author

Jeltje commented Mar 21, 2016

export PYTHONPATH=$(python -c 'from os.path import abspath as a, dirname as d;import sys;print d(d(d(a(sys.argv[1]))))' $0)
python -m toil_scripts.rnaseq_unc.rnaseq_unc_pipeline \
aws:us-west-2:unc-pipeline-run-1 \
--retryCount 1 \
--config toil_rnaseq_config.csv \
--ssec "/home/mesosbox/shared/master.key" \
--output_dir "/home/mesosbox/rnaseq_output" \
--s3_dir "cgl-driver-projects/test/rna-test/" \
--sseKey=/home/mesosbox/shared/master.key \
--batchSystem="mesos" \
--mesosMaster mesos-master:5050 \
--workDir=/var/lib/toil \
#--restart

I do realize I should have probably removed the output_dir.
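For anyone checking where the tarball should have landed, here is a small sketch that turns an `--s3_dir` value and a file name into the S3 URL to look for. It assumes the first path component of `--s3_dir` is the bucket and the rest is a key prefix; that is an assumption about how the pipeline interprets the flag, and the helper names `split_s3_dir` and `expected_output_url` are hypothetical.

```python
def split_s3_dir(s3_dir):
    """Split 'bucket/some/prefix/' into ('bucket', 'some/prefix').

    Assumption: the first path component is the bucket name and the
    remainder is a key prefix.
    """
    bucket, _, prefix = s3_dir.strip("/").partition("/")
    return bucket, prefix


def expected_output_url(s3_dir, filename):
    """Build the s3:// URL where `filename` would be expected under `s3_dir`."""
    bucket, prefix = split_s3_dir(s3_dir)
    # Skip the prefix when s3_dir names only a bucket.
    key = "/".join(part for part in (prefix, filename) if part)
    return "s3://{}/{}".format(bucket, key)
```

With the values from the launch script above, `expected_output_url("cgl-driver-projects/test/rna-test/", "SRR1559172.tar.gz")` gives `s3://cgl-driver-projects/test/rna-test/SRR1559172.tar.gz`, which can then be checked with `aws s3 ls`.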

@jvivian
Collaborator

jvivian commented Mar 21, 2016

Any chance you know whether the output showed up in the output dir?

@Jeltje
Contributor Author

Jeltje commented Mar 21, 2016

I did not create a /home/mesosbox/rnaseq_output and there is none on the master after the run. But also no error about that.

@jvivian
Collaborator

jvivian commented Mar 21, 2016

It would've been on the worker. I'm running this pipeline (and the UNC one) as part of a test, so I'll see if I can replicate.

@Jeltje
Contributor Author

Jeltje commented Mar 21, 2016

okay I can have a look there

@Jeltje
Contributor Author

Jeltje commented Mar 21, 2016

Nope, not on either worker.

@jvivian
Collaborator

jvivian commented Mar 21, 2016

Ok, thanks for checking.

@jvivian
Collaborator

jvivian commented Mar 21, 2016

Could not reproduce. My run just finished and my file showed up in S3.

s3://cgl-driver-projects/test/test_SRR1559177_Sample_NB2318tumor1_T_C3WFVACXX.tar.gz

#!/usr/bin/env bash
# John Vivian
#
# Please read the associated README.md before attempting to use.
#
# Execution of pipeline
export PYTHONPATH=$(python -c 'from os.path import abspath as a, dirname as d;import sys;print d(d(d(a(sys.argv[1]))))' $0)
python -m toil_scripts.rnaseq_cgl.rnaseq_cgl_pipeline \
aws:us-west-2:jvivian-releases-run-1 \
--config /home/mesosbox/shared/config.txt \
--retryCount 1 \
--ssec /home/mesosbox/shared/master.key \
--s3_dir cgl-driver-projects/test/ \
--sseKey=/home/mesosbox/shared/master.key \
--batchSystem="mesos" \
--mesosMaster mesos-master:5050 \
--workDir=/var/lib/toil \
#--restart

@jvivian jvivian closed this as completed Mar 21, 2016
@hannes-ucsc
Contributor

I see unzip mentioned here. What's the relationship between this issue and BD2KGenomics/cgcloud#134? Is the latter invalid? What triggered Jeltje to believe that a missing unzip was the culprit?

@hannes-ucsc hannes-ucsc reopened this Mar 21, 2016
@jvivian
Collaborator

jvivian commented Mar 21, 2016

> What's the relationship between this issue and BD2KGenomics/cgcloud#134?

Unzip is needed for the UNC rna-seq pipeline, not this one.

> What triggered Jeltje to believe that a missing unzip was the culprit?

An error that she wasn't able to reproduce.
