To install the package, run:
poetry install
We assume you start from a JSON file containing synthesis trees, where each tree is described by its reaction SMILES listed in pre-order traversal. A sample is provided here.
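To make the pre-order representation concrete, the sketch below flattens a small synthesis tree into a list of reactions, visiting each reaction before the reactions that produce its precursors. The nested `{"reaction": ..., "children": [...]}` layout, the `preorder_reactions` helper, and the symbolic reaction strings are hypothetical and chosen only for illustration; the actual schema of the JSON file may differ, so check the provided sample.

```python
from typing import Any, Dict, List

# Hypothetical tree layout (an assumption, not the package's schema):
# each node holds one reaction and the sub-trees producing its precursors.
# The reaction strings are symbolic placeholders, not real chemistry.
example_tree: Dict[str, Any] = {
    "reaction": "A.B>>P",
    "children": [
        {"reaction": "C.D>>A", "children": []},
        {
            "reaction": "E>>B",
            "children": [{"reaction": "F.G>>E", "children": []}],
        },
    ],
}


def preorder_reactions(node: Dict[str, Any]) -> List[str]:
    """Return the reactions of a tree in pre-order: node first, then children."""
    reactions = [node["reaction"]]
    for child in node.get("children", []):
        reactions.extend(preorder_reactions(child))
    return reactions


print(preorder_reactions(example_tree))
# → ['A.B>>P', 'C.D>>A', 'E>>B', 'F.G>>E']
```

The root reaction (the one producing the target product) comes first, followed by the full sub-tree of its first precursor, then the sub-tree of the next precursor.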
Additionally, we assume that a reaction fingerprint model compatible with rxnfp is available (see the repo for instructions on how to train your own on public or proprietary data).
To get the default model used in RXN for Chemistry, simply clone the repo:
git clone https://github.com/rxn4chemistry/rxnfp.git
You can directly use the default model, available at ./rxnfp/rxnfp/models/transformers/bert_ft.
Prepare the fingerprints from available synthesis trees:
generate-fingerprints --reaction_trees_path "./sample-data/reaction_trees.json" \
--fingerprints_model_path "./rxnfp/rxnfp/models/transformers/bert_ft" \
--generated_fingerprints_path "./sandbox/generated_fingerprints.csv"
Prepare the PCA model for fingerprint compression and the related indices:
generate-pca-compression-and-indices --reaction_trees_path "./sample-data/reaction_trees.json" \
--fingerprints_path "./sandbox/generated_fingerprints.csv" \
--pca_model_filename "./sandbox/pca.pkl" \
--tree_data_dict_pca_filename "./sandbox/tree_data_dict_pca.pkl"
NOTE: these examples create a sandbox folder in which all outputs are stored.
We assume you have a pair of single-step forward and backward models trained using rxn-onmt-models (see the repo for a detailed guide on how to train them on public or proprietary data). Then run the retrosynthesis:
run-neb-retrosynthesis --product "NS(=O)(=O)c1nn(-c2ccccn2)cc1Br" \
--forward_model_path "/path/to/forward_model.pt" \
--backward_model_path "/path/to/backward_model.pt" \
--fingerprints_model_path "./rxnfp/rxnfp/models/transformers/bert_ft" \
--pca_model_filename "./sandbox/pca.pkl" \
--tree_data_dict_pca_filename "./sandbox/tree_data_dict_pca.pkl" \
--output_path ./test_retro.json
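Since this command depends on files produced by the earlier steps, it can help to check that they exist before launching it. The following is a minimal sketch, not part of the package: the helpers `build_retrosynthesis_command` and `missing_inputs` are hypothetical, and only the command name and flags shown above are taken from this document.

```python
import shlex
from pathlib import Path
from typing import Dict, List


def build_retrosynthesis_command(
    product: str, paths: Dict[str, str], output_path: str
) -> List[str]:
    """Assemble the run-neb-retrosynthesis argument list (hypothetical helper)."""
    command = ["run-neb-retrosynthesis", "--product", product]
    for flag, value in paths.items():
        command += [f"--{flag}", value]
    command += ["--output_path", output_path]
    return command


def missing_inputs(paths: Dict[str, str]) -> List[str]:
    """Return the input paths that do not exist on disk."""
    return [value for value in paths.values() if not Path(value).exists()]


# Flags and values copied from the command above; the model paths are
# placeholders to replace with your own trained models.
paths = {
    "forward_model_path": "/path/to/forward_model.pt",
    "backward_model_path": "/path/to/backward_model.pt",
    "fingerprints_model_path": "./rxnfp/rxnfp/models/transformers/bert_ft",
    "pca_model_filename": "./sandbox/pca.pkl",
    "tree_data_dict_pca_filename": "./sandbox/tree_data_dict_pca.pkl",
}

command = build_retrosynthesis_command(
    "NS(=O)(=O)c1nn(-c2ccccn2)cc1Br", paths, "./test_retro.json"
)
print(shlex.join(command))  # shell-quoted form of the command

missing = missing_inputs(paths)
if missing:
    print("Missing inputs:", missing)
# Otherwise the command could be launched with, for example:
# subprocess.run(command, check=True)
```

Keeping the command as a list of arguments avoids shell-quoting problems with the product SMILES, which contains parentheses and `=` characters.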