Get error in MSA process and find StopIteration error #244

Open
233xzl opened this issue Jan 7, 2025 · 2 comments
Labels
bug Something isn't working

Comments

@233xzl

233xzl commented Jan 7, 2025

Hi, when I try to predict the structure of the protein sequence YPGKRDEYTR, the run stops after Hmmbuild and I don't know why.

Processing chain A
I0107 10:48:16.433073 140348520649856 pipeline.py:40] Getting protein MSAs for sequence YPGKRDEYTR
I0107 10:48:16.442079 140342601832000 jackhmmer.py:78] Query sequence: YPGKRDEYTR
I0107 10:48:16.442653 140342601832000 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/jackhmmer -o /dev/null -A /tmp/tmpxshlimqm/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --cpu 8 -N 1 -E 0.0001 --incE 0.0001 /tmp/tmpxshlimqm/query.fasta /root/public_databases/uniref90_2022_05.fa"
I0107 10:48:16.443206 140342593439296 jackhmmer.py:78] Query sequence: YPGKRDEYTR
I0107 10:48:16.443528 140342593439296 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/jackhmmer -o /dev/null -A /tmp/tmpomhb2cdy/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --cpu 8 -N 1 -E 0.0001 --incE 0.0001 /tmp/tmpomhb2cdy/query.fasta /root/public_databases/mgy_clusters_2022_05.fa"
I0107 10:48:16.443967 140342585046592 jackhmmer.py:78] Query sequence: YPGKRDEYTR
I0107 10:48:16.444274 140342585046592 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/jackhmmer -o /dev/null -A /tmp/tmpbljt598r/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --cpu 8 -N 1 -E 0.0001 --incE 0.0001 /tmp/tmpbljt598r/query.fasta /root/public_databases/bfd-first_non_consensus_sequences.fasta"
I0107 10:48:16.444640 140342576653888 jackhmmer.py:78] Query sequence: YPGKRDEYTR
I0107 10:48:16.444914 140342576653888 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/jackhmmer -o /dev/null -A /tmp/tmpixho4k20/output.sto --noali --F1 0.0005 --F2 5e-05 --F3 5e-07 --cpu 8 -N 1 -E 0.0001 --incE 0.0001 /tmp/tmpixho4k20/query.fasta /root/public_databases/uniprot_all_2021_04.fa"
I0107 10:49:25.173050 140342585046592 subprocess_utils.py:97] Finished Jackhmmer in 68.729 seconds
I0107 10:53:01.553389 140342601832000 subprocess_utils.py:97] Finished Jackhmmer in 285.111 seconds
I0107 10:55:09.169230 140342576653888 subprocess_utils.py:97] Finished Jackhmmer in 412.724 seconds
I0107 10:55:53.378418 140342593439296 subprocess_utils.py:97] Finished Jackhmmer in 456.935 seconds
I0107 10:55:53.379953 140348520649856 pipeline.py:73] Getting protein MSAs took 456.95 seconds for sequence YPGKRDEYTR
I0107 10:55:53.380220 140348520649856 pipeline.py:79] Deduplicating MSAs and getting protein templates for sequence YPGKRDEYTR
I0107 10:55:53.382555 140342593439296 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/hmmbuild --informat stockholm --hand --amino /tmp/tmp8kivu13d/output.hmm /tmp/tmp8kivu13d/query.msa"
I0107 10:55:53.392291 140342593439296 subprocess_utils.py:97] Finished Hmmbuild in 0.010 seconds
I0107 10:55:53.392671 140342593439296 subprocess_utils.py:68] Launching subprocess "/hmmer/bin/hmmsearch --noali --cpu 8 --F1 0.1 --F2 0.1 --F3 0.1 -E 100 --incE 100 --domE 100 --incdomE 100 -A /tmp/tmphv4lr89t/output.sto /tmp/tmphv4lr89t/query.hmm /root/public_databases/pdb_seqres_2022_09_28.fasta"
I0107 10:55:54.745479 140342593439296 subprocess_utils.py:97] Finished Hmmsearch in 1.353 seconds
Traceback (most recent call last):
  File "/app/alphafold/run_alphafold.py", line 699, in <module>
    app.run(main)
  File "/alphafold3_venv/lib/python3.11/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/alphafold3_venv/lib/python3.11/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
             ^^^^^^^^^^
  File "/app/alphafold/run_alphafold.py", line 684, in main
    process_fold_input(
  File "/app/alphafold/run_alphafold.py", line 543, in process_fold_input
    fold_input = pipeline.DataPipeline(data_pipeline_config).process(fold_input)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/alphafold3_venv/lib/python3.11/site-packages/alphafold3/data/pipeline.py", line 462, in process
    processed_chains.append(self.process_protein_chain(chain))
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/alphafold3_venv/lib/python3.11/site-packages/alphafold3/data/pipeline.py", line 411, in process_protein_chain
    unpaired_msa, paired_msa, template_hits = _get_protein_msa_and_templates(
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/alphafold3_venv/lib/python3.11/site-packages/alphafold3/data/pipeline.py", line 107, in _get_protein_msa_and_templates
    protein_templates = templates_future.result()
                        ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/alphafold3_venv/lib/python3.11/site-packages/alphafold3/data/templates.py", line 372, in from_seq_and_a3m
    hmmsearch_a3m = run_hmmsearch_with_a3m(
                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/alphafold3_venv/lib/python3.11/site-packages/alphafold3/data/templates.py", line 888, in run_hmmsearch_with_a3m
    return searcher.query_with_sto(sto, model_construction='hand')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/alphafold3_venv/lib/python3.11/site-packages/alphafold3/data/tools/hmmsearch.py", line 149, in query_with_sto
    return self.query_with_hmm(hmm)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/alphafold3_venv/lib/python3.11/site-packages/alphafold3/data/tools/hmmsearch.py", line 128, in query_with_hmm
    a3m_out = parsers.convert_stockholm_to_a3m(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/alphafold3_venv/lib/python3.11/site-packages/alphafold3/data/parsers.py", line 154, in convert_stockholm_to_a3m
    query_sequence = next(iter(sequences.values()))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
StopIteration
@Augustin-Zidek
Collaborator

Could you post the full input JSON? Could you also update to the latest commit (in particular f22873d should make the log messages a bit more verbose and helpful).

Augustin-Zidek added the bug label Jan 7, 2025
@zhangyumeng1sjtu

I ran into a similar problem when running the MSA step for a short peptide. My guess is that the variable sequences (alphafold3/data/parsers.py, line 154) was an empty dictionary here because the MSA output contained no sequences.
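
For anyone who wants to check that guess, here is a minimal sketch of the failure mode in plain Python. It assumes the alignment that reached the parser had no sequence lines, which the log above does not confirm. The helper names first_sequence_unsafe, first_sequence_or_none and parse_stockholm_sequences are hypothetical and are not AlphaFold 3 functions; only the next(iter(sequences.values())) expression is taken from the traceback.

```python
from typing import Mapping, Optional


def parse_stockholm_sequences(stockholm: str) -> dict[str, str]:
    """Collects aligned sequences by name from Stockholm text (simplified).

    Hypothetical helper for illustration only. It skips markup lines
    (starting with '#') and the '//' terminator.
    """
    sequences: dict[str, str] = {}
    for line in stockholm.splitlines():
        line = line.strip()
        if not line or line.startswith(('#', '//')):
            continue
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # Not a "name  aligned_sequence" line.
        name, aligned = parts
        sequences[name] = sequences.get(name, '') + aligned
    return sequences


def first_sequence_unsafe(sequences: Mapping[str, str]) -> str:
    # Same expression as in the traceback: next() without a default
    # raises StopIteration when the mapping is empty.
    return next(iter(sequences.values()))


def first_sequence_or_none(sequences: Mapping[str, str]) -> Optional[str]:
    # Hypothetical guard: passing a default to next() returns None for an
    # empty mapping instead of raising.
    return next(iter(sequences.values()), None)


if __name__ == '__main__':
    # Assumed shape of an alignment file that contains no hits.
    empty_alignment = '# STOCKHOLM 1.0\n//\n'
    sequences = parse_stockholm_sequences(empty_alignment)
    print(sequences)
    print(first_sequence_or_none(sequences))
    try:
        first_sequence_unsafe(sequences)
    except StopIteration:
        print('StopIteration raised for an empty mapping')
```

If the guess is right, a caller would need to treat the empty case explicitly (for example, as "no template hits") rather than read the first entry unconditionally. How the pipeline should actually handle it is up to the maintainers.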
