Wrong CDS annotation #28

Open
agsanbel opened this issue Sep 22, 2022 · 6 comments

Comments

@agsanbel

Hello,

I ran:

ribotish predict -t TIS_bams_new/R3_ProAligned.sortedByCoord.out.bam -b CHX_bams_new/R3_ProAligned.sortedByCoord.out.bam -g gencode.v19.annotation.gtf -f GRCh37.p13.genome.fa -o pred.txt --framebest

This is the output so far (the run has not finished):

No offset parameter file found for CHX_bams_new/R3_ProAligned.sortedByCoord.out.bam. Using default offset (12).
Thu Sep 22 10:43:03 2022 Estimating TIS background parameters...
Thu Sep 22 12:05:04 2022 Predicting...
Wrong CDS annotation: ENSG00000189409.8 ENST00000472264.1 56 556 556
Wrong CDS annotation: ENSG00000116649.5 ENST00000487300.1 252 611 611
Wrong CDS annotation: ENSG00000219073.3 ENST00000374666.1 3 497 497
Wrong CDS annotation: ENSG00000158062.16 ENST00000374215.1 196 938 938
Wrong CDS annotation: ENSG00000090020.6 ENST00000374084.2 289 695 695
Wrong CDS annotation: ENSG00000142765.13 ENST00000473280.1 83 312 312
Wrong CDS annotation: ENSG00000116497.13 ENST00000482212.1 267 706 706
Wrong CDS annotation: ENSG00000116922.10 ENST00000486637.1 511 852 852
Wrong CDS annotation: ENSG00000066136.15 ENST00000531464.1 344 525 525
Wrong CDS annotation: ENSG00000117385.11 ENST00000372526.2 31 654 654
Wrong CDS annotation: ENSG00000186973.6 ENST00000409396.1 29 448 448
Wrong CDS annotation: ENSG00000079277.15 ENST00000496619.1 202 738 738
Wrong CDS annotation: ENSG00000132122.7 ENST00000371841.1 69 818 818
Wrong CDS annotation: ENSG00000085831.11 ENST00000411642.2 92 931 931
Wrong CDS annotation: ENSG00000134744.9 ENST00000484723.2 191 2910 2910
Wrong CDS annotation: ENSG00000116212.10 ENST00000371368.1 261 838 838
Wrong CDS annotation: ENSG00000125703.10 ENST00000371118.1 96 665 665
Wrong CDS annotation: ENSG00000177414.9 ENST00000371077.5 424 1193 1193
Wrong CDS annotation: ENSG00000116791.9 ENST00000370870.1 158 888 888
Wrong CDS annotation: ENSG00000137944.12 ENST00000370486.1 232 1019 1019
Wrong CDS annotation: ENSG00000184371.9 ENST00000357302.4 239 597 597
Wrong CDS annotation: ENSG00000007341.14 ENST00000369664.1 174 862 862
Wrong CDS annotation: ENSG00000134262.8 ENST00000369564.1 336 1225 1225
Wrong CDS annotation: ENSG00000163349.17 ENST00000503968.1 251 570 570
Wrong CDS annotation: ENSG00000163399.11 ENST00000369494.1 264 698 698
Wrong CDS annotation: ENSG00000143452.11 ENST00000368987.1 219 791 791
Wrong CDS annotation: ENSG00000143442.17 ENST00000533351.1 167 586 586
Wrong CDS annotation: ENSG00000143569.14 ENST00000368504.1 69 1120 1120
Wrong CDS annotation: ENSG00000132676.11 ENST00000471214.1 388 1287 1287
Wrong CDS annotation: ENSG00000143320.4 ENST00000368220.1 208 456 456
Wrong CDS annotation: ENSG00000249730.1 ENST00000504970.1 0 935 935
Wrong CDS annotation: ENSG00000116191.13 ENST00000324778.5 107 906 906
Wrong CDS annotation: ENSG00000162779.16 ENST00000509175.1 310 765 765
Wrong CDS annotation: ENSG00000135837.11 ENST00000357434.2 0 392 392
Wrong CDS annotation: ENSG00000116747.8 ENST00000506303.1 488 613 613
Wrong CDS annotation: ENSG00000081237.14 ENST00000367379.1 71 517 517
Wrong CDS annotation: ENSG00000117153.11 ENST00000367258.1 87 1033 1033
Wrong CDS annotation: ENSG00000143842.10 ENST00000525442.1 365 538 538
Wrong CDS annotation: ENSG00000117625.9 ENST00000533469.1 91 540 540
Wrong CDS annotation: ENSG00000162931.7 ENST00000479800.1 123 925 925
Wrong CDS annotation: ENSG00000086619.9 ENST00000366589.1 390 605 605
Wrong CDS annotation: ENSG00000116977.14 ENST00000481485.1 480 799 799
Wrong CDS annotation: ENSG00000172059.6 ENST00000401510.1 232 614 614
Wrong CDS annotation: ENSG00000138074.10 ENST00000401463.1 290 732 732
Wrong CDS annotation: ENSG00000189350.8 ENST00000401723.1 233 720 720
Wrong CDS annotation: ENSG00000055332.12 ENST00000390013.3 257 564 564
Wrong CDS annotation: ENSG00000115828.11 ENST00000404976.1 138 992 992
Wrong CDS annotation: ENSG00000162994.11 ENST00000403506.1 290 430 430
....

I don't know why it takes so long or why that message appears. I think it is due to the annotation file, but then which file should I use?

thank you so much!!

@zhpn1024
Owner

The quality control step was not performed, so the default offset was used.
The TIS background estimation takes some additional time. You can use the multiprocessing parameters to speed it up, and the '-v' option to see progress.
The annotation file has some incomplete annotations. You can use a newer version, or simply ignore these transcripts.
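
Note: if you would rather drop the flagged transcripts from the annotation than ignore the warnings, one option is to collect their IDs from the messages and filter the GTF. This is a minimal sketch, not part of ribotish. It assumes the messages were saved to a text file and that the fifth whitespace-separated field is the transcript ID, as in the log above; the function names are hypothetical.

```python
import re

# Matches the transcript_id attribute in a GTF attribute column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def flagged_transcripts(log_lines):
    """Collect transcript IDs from 'Wrong CDS annotation' messages."""
    ids = set()
    for line in log_lines:
        if line.startswith("Wrong CDS annotation:"):
            fields = line.split()
            # Assumed layout: Wrong CDS annotation: <gene_id> <transcript_id> ...
            if len(fields) >= 5:
                ids.add(fields[4])
    return ids


def filter_gtf(gtf_lines, skip_ids):
    """Yield GTF lines whose transcript_id is not in skip_ids.

    Lines without a transcript_id (comments, gene records) are kept.
    """
    for line in gtf_lines:
        match = TRANSCRIPT_ID.search(line)
        if match and match.group(1) in skip_ids:
            continue
        yield line
```

To use it, pass the open log file to `flagged_transcripts`, then write the lines from `filter_gtf` to a new GTF file and give that file to `-g`.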

@agsanbel
Author

The problem was that I didn't have the quality control files in the same path. Thank you very much!

Now I am trying to use multiprocessing (-p), but I get this error:

'AssertionError: group argument must be None for now'

With a single process it works, but it takes very long.

Thank you so much!

@zhpn1024
Owner

Please provide the details of this error.

@agsanbel
Author

ribotish predict -t TIS_bams_new/R3_ProAligned.sortedByCoord.out.bam -b CHX_bams_new/R3_ProAligned.sortedByCoord.out.bam -g gencode.v19.annotation.gtf -f GRCh37.p13.genome.fa -o pred.txt -p 4 -v

Fri Sep 23 10:14:15 2022 Loading genome...
Fri Sep 23 10:14:15 2022 Estimating TIS background parameters...
TIS background estimation result will be saved to tisBackground.txt
Traceback (most recent call last):
File "/Users/asanchezb/miniconda3/envs/ribotish/bin/ribotish", line 56, in <module>
main()
File "/Users/asanchezb/miniconda3/envs/ribotish/bin/ribotish", line 34, in main
commands[cmd].run(args)
File "/Users/asanchezb/miniconda3/envs/ribotish/lib/python3.10/site-packages/ribotish/run/predict.py", line 154, in run
pool = MyPool(1) # This is for memory efficiency
File "/Users/asanchezb/miniconda3/envs/ribotish/lib/python3.10/multiprocessing/pool.py", line 215, in __init__
self._repopulate_pool()
File "/Users/asanchezb/miniconda3/envs/ribotish/lib/python3.10/multiprocessing/pool.py", line 306, in _repopulate_pool
return self._repopulate_pool_static(self._ctx, self.Process,
File "/Users/asanchezb/miniconda3/envs/ribotish/lib/python3.10/multiprocessing/pool.py", line 322, in _repopulate_pool_static
w = Process(ctx, target=worker,
File "/Users/asanchezb/miniconda3/envs/ribotish/lib/python3.10/multiprocessing/process.py", line 82, in __init__
assert group is None, 'group argument must be None for now'
AssertionError: group argument must be None for now

@zhpn1024
Owner

Thank you. I think this error is related to the new version of multiprocessing. The code is different in my python-3.7.4 version.
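
Note: the traceback is consistent with a common pattern for non-daemonic pools, where a Process subclass is assigned to the `Process` class attribute of a Pool subclass. Since Python 3.8, `Pool.Process` is called with the context as its first argument, so that pattern passes the context as `group` and triggers this assertion. Whether ribotish's `MyPool` is written this way is an assumption, not something confirmed in this thread. A minimal sketch of a variant that should work on newer Python versions, with hypothetical class names:

```python
import multiprocessing
import multiprocessing.pool


class NoDaemonProcess(multiprocessing.Process):
    """Process whose daemon flag is always False, so it may start children."""

    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        # Ignore attempts by Pool to mark workers as daemonic.
        pass


class NoDaemonContext(type(multiprocessing.get_context())):
    """Default context, but creating NoDaemonProcess workers."""

    Process = NoDaemonProcess


class NestablePool(multiprocessing.pool.Pool):
    """Pool that takes its worker class from the context (Python 3.8+)."""

    def __init__(self, *args, **kwargs):
        kwargs["context"] = NoDaemonContext()
        super().__init__(*args, **kwargs)


# Usage (inside an `if __name__ == "__main__":` guard):
#     with NestablePool(1) as pool:
#         results = pool.map(abs, [-1, 2])
```

The worker class is supplied through the context rather than by overriding `Pool.Process`, which is the part that changed between Python 3.7 and 3.8.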

@agsanbel
Author

OK! I will try changing the Python version. Thank you so much!
