Possibility to enforce cis/trans isomer using a user-defined CCD input? #237

Open
kls93 opened this issue Dec 27, 2024 · 6 comments
Labels
question Further information is requested

kls93 commented Dec 27, 2024

Hello,

I am currently running AlphaFold3 to predict the structure of a protein containing a non-canonical amino acid that can occupy two different isomer states (cis and trans).

The predictions look very good. However, regardless of whether I provide coordinates for the cis or the trans isomer in my input CCD, AlphaFold3 always predicts the trans isomer. Based on the first response to a previous issue (#212), I made the following edits to the get_reference function in features.py (line 1502 onwards).

  ccd_cif = ccd.get(res_name)

  mol = None
  if res_name in ["YYT", "YYC"]:  # The 3-letter id codes I specify for the trans and cis isomers in my input ccd files
    mol = None
  elif ccd_cif:
    try:
      mol = rdkit_utils.mol_from_ccd_cif(ccd_cif, remove_hydrogens=False)
    except rdkit_utils.MolFromMmcifError:
      logging.warning('Failed to construct mol from ccd_cif for: %s', res_name)

I was hoping that this would ensure AF3 uses the reference coordinates in the CCD file string I define rather than using RDKit to build the molecule. However, although the printed output indicates that AF3 is now using the reference coordinates I have provided, the output predictions still show the trans isomer.

I was therefore wondering if it is possible to instruct AF3 to predict the cis isomer of a small molecule, or whether it will always predict the more stable isomer state.

Thanks very much for your help!

@Augustin-Zidek Augustin-Zidek added the question Further information is requested label Dec 27, 2024
@joshabramson
Collaborator

Hi, one thing to check is that you are putting the coordinates in pdbx_model_Cartn_x_ideal (and y/z) rather than model_Cartn_x. If you are already doing that, then I'm afraid it seems the model is not paying attention to the new coordinates. You can try running more seeds to see if the model sometimes produces the desired variation, but that may not work either. Alternatively, try running from SMILES directly, with a different SMILES string for each case (you should give the inputs new unique names when doing this).
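
As a rough illustration of the SMILES suggestion, here is a minimal sketch that builds one input JSON per isomer, each with its own job name and several seeds. The protein sequence, the job names, and the two SMILES strings (cis and trans but-2-ene) are hypothetical stand-ins and not the molecule from this issue. The field names follow the AlphaFold 3 input format as I understand it, and this applies to a free ligand rather than a covalently bonded one.

```python
import json

# Hypothetical stand-ins: replace with your own sequence and isomer SMILES.
PROTEIN_SEQUENCE = "MKTAYIAKQR"
ISOMER_SMILES = {
    "trans": "C/C=C/C",   # (E)-but-2-ene, only to illustrate the notation
    "cis": "C/C=C\\C",    # (Z)-but-2-ene; the SMILES itself is C/C=C\C
}


def build_input(job_name, smiles, seeds):
    """Return an input dict for one protein chain plus one SMILES ligand."""
    return {
        "name": job_name,
        "modelSeeds": list(seeds),
        "sequences": [
            {"protein": {"id": "A", "sequence": PROTEIN_SEQUENCE}},
            {"ligand": {"id": "L", "smiles": smiles}},
        ],
        "dialect": "alphafold3",
        "version": 1,
    }


# One job per isomer, each with a unique name and five seeds.
inputs = {
    isomer: build_input(f"my_protein_{isomer}_ligand", smiles, range(1, 6))
    for isomer, smiles in ISOMER_SMILES.items()
}

for payload in inputs.values():
    # json.dumps escapes the backslash in the cis SMILES for us.
    print(json.dumps(payload, indent=2))
```

Serialising with json.dumps rather than writing the file by hand avoids a common mistake: a backslash in a SMILES string has to be escaped in JSON.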

@kls93
Author

kls93 commented Jan 8, 2025

Hi Josh, thanks for your reply! I tried these options and did not succeed in getting the model to predict the cis isomer of my side chain (although I was unable to use a SMILES string as the ligand is covalently bonded to my protein). Thanks for your suggestions though, they were very helpful - I suspected it might not be possible to get AF3 to use the new coordinates, but it's really useful to have that confirmed as a possibility by the authors.

@joshabramson
Collaborator

Ah, if it is covalently bonded, one can define the combined residue+ligand as a single custom non-standard residue (pTM) in the input. It can be defined by a CIF or SMILES, so it is worth trying.

@kls93
Author

kls93 commented Jan 8, 2025

Would you mind providing an example of how I would specify the non-standard residue as a SMILES string in the input JSON? I can't work out where in the input JSON I should specify the SMILES string for a pTM. Thank you!

@Augustin-Zidek
Collaborator

Hello,

the following JSON illustrates the concept -- you will have to replace the sequence, pTM position and the user CCD with your data:

{
  "name": "Protein with custom pTM",
  "modelSeeds": [1],
  "sequences": [
    {
      "protein": {
        "id": "A",
        "sequence": "...",
        "modifications": [
          {
            "ptmType": "YYT",  # Or YYC.
            "ptmPosition": <position of your pTM residue>
          }
        ]
      }
    }
  ],
  "userCCD": "data_YYT\n...",  # Fill this in with the exact data for YYT/YYC.
  "dialect": "alphafold3",
  "version": 1
}

Please refer to https://github.com/google-deepmind/alphafold3/blob/main/docs/input.md#user-provided-ccd which documents how to provide user CCD.
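
Because the userCCD value is a multi-line CIF held in a single JSON string, it may be easier to generate the file than to escape it by hand. The following is a minimal sketch of that idea; the helper name build_ptm_input, the example sequence, the position 5, and the truncated CCD text are all hypothetical placeholders, and a real CCD entry needs the full set of fields described in the linked documentation.

```python
import json


def build_ptm_input(user_ccd_text, ptm_type, ptm_position, sequence, seeds=(1,)):
    """Assemble an input dict with one modified residue and a user CCD."""
    return {
        "name": "Protein with custom pTM",
        "modelSeeds": list(seeds),
        "sequences": [
            {
                "protein": {
                    "id": "A",
                    "sequence": sequence,
                    "modifications": [
                        {"ptmType": ptm_type, "ptmPosition": ptm_position}
                    ],
                }
            }
        ],
        "userCCD": user_ccd_text,
        "dialect": "alphafold3",
        "version": 1,
    }


# Placeholder only: a real entry must contain the complete CCD mmCIF data.
example_ccd = "data_YYT\n#\n_chem_comp.id YYT\n"

payload = build_ptm_input(example_ccd, "YYT", 5, "MKTAYIAKQR")
text = json.dumps(payload, indent=2)  # newlines in the CIF become \n escapes
print(text)
```

In practice the CIF text would be read from a file, for example with pathlib.Path("YYT.cif").read_text(), where YYT.cif is an assumed file name.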

@kls93
Author

kls93 commented Jan 9, 2025

Apologies, I'm still uncertain how to provide the modified residue as a SMILES string in the "userCCD" entry. When I try, I get the following error, and I'm not sure how to switch it from expecting a CIF-format description of my modified residue to a SMILES string:

ValueError: User-defined CCD is missing these keys: {'_chem_comp_atom.pdbx_model_Cartn_y_ideal', 
'_chem_comp_atom.comp_id', '_chem_comp.formula', '_chem_comp.formula_weight', '_chem_comp_atom.type_symbol', 
'_chem_comp_bond.pdbx_aromatic_flag', '_chem_comp_atom.pdbx_model_Cartn_z_ideal', '_chem_comp_atom.charge', 
'_chem_comp_atom.atom_id', '_chem_comp_bond.atom_id_2', '_chem_comp.mon_nstd_parent_comp_id', 
'_chem_comp.type', '_chem_comp_bond.value_order', '_chem_comp_atom.pdbx_leaving_atom_flag', 
'_chem_comp.pdbx_synonyms', '_chem_comp.id', '_chem_comp_atom.pdbx_model_Cartn_x_ideal', 
'_chem_comp_bond.atom_id_1', '_chem_comp.name'}

Thanks very much for your help!
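
Every key listed in the error above is an mmCIF data name, which suggests that the userCCD field is read as mmCIF text and that a SMILES string placed there will fail this check. A minimal sketch of a pre-flight check is shown below. The key set is copied from the error message; the whitespace-token scan is my own simplification and not how AlphaFold 3 parses the field.

```python
# Keys copied from the ValueError above.
REQUIRED_KEYS = {
    "_chem_comp.id",
    "_chem_comp.name",
    "_chem_comp.type",
    "_chem_comp.formula",
    "_chem_comp.formula_weight",
    "_chem_comp.mon_nstd_parent_comp_id",
    "_chem_comp.pdbx_synonyms",
    "_chem_comp_atom.comp_id",
    "_chem_comp_atom.atom_id",
    "_chem_comp_atom.type_symbol",
    "_chem_comp_atom.charge",
    "_chem_comp_atom.pdbx_leaving_atom_flag",
    "_chem_comp_atom.pdbx_model_Cartn_x_ideal",
    "_chem_comp_atom.pdbx_model_Cartn_y_ideal",
    "_chem_comp_atom.pdbx_model_Cartn_z_ideal",
    "_chem_comp_bond.atom_id_1",
    "_chem_comp_bond.atom_id_2",
    "_chem_comp_bond.value_order",
    "_chem_comp_bond.pdbx_aromatic_flag",
}


def missing_ccd_keys(user_ccd_text):
    """Return the required keys that do not appear as tokens in the text.

    Simplified assumption: every data name shows up as a whitespace-separated
    token starting with an underscore.
    """
    present = {tok for tok in user_ccd_text.split() if tok.startswith("_")}
    return REQUIRED_KEYS - present


# A SMILES string contains none of the data names, so all keys are reported.
smiles_like = "C/C=C\\C"
print(sorted(missing_ccd_keys(smiles_like)))
```

Running this on a candidate userCCD string before launching a prediction shows which fields still need to be added to the CIF.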

3 participants