APBIO is a tool developed to predict compound-target interactions, with a specific focus on air pollutants and their bioactivity. It computes bioactivity signatures for compounds starting from the SMILES representation (e.g., C1=CC=CC=C1) and FASTA sequence features for targets starting from the UniProtKB identifier (e.g., Q9UHW9).
Bioactivity signatures are computed via the signaturizer package. For further details, see the Chemical Checker (CC) paper, the CC signaturizers paper, and the related repositories.
Sequence descriptors are calculated via the iFeature toolkit. Specifically, we use the main iFeature.py program and the required files in the codes and data folders. Additional information is provided in the iFeature paper and the related repository.
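To give a rough idea of what a sequence descriptor looks like, the sketch below computes an amino acid composition vector (the fraction of each of the 20 standard residues) in plain Python. This is only an illustration: it is not the iFeature code, the function name `amino_acid_composition` and the example sequence are hypothetical, and the descriptors actually used by APBIO are the ones selected in the notebook.

```python
from collections import Counter

# The 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> dict:
    """Return the fraction of each standard amino acid in a protein sequence.

    Hypothetical helper for illustration only; not part of iFeature or APBIO.
    Non-standard characters (e.g. X, gaps) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Made-up toy sequence, not a real UniProtKB entry.
features = amino_acid_composition("MKTAYIAKQR")
print(len(features))  # one value per standard amino acid
```

The result is a fixed-length numeric vector regardless of sequence length, which is the property that makes this kind of descriptor usable as model input.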
To run notebooks and reproduce results, you can clone this repo and set up a conda environment using the code snippet below:
$ conda create --no-default-packages -n cti -y python=3.7.16
$ conda activate cti
$ pip install -r requirements.txt
The main methodology can be executed via the APBIO_pipeline.ipynb notebook.
The datasets and additional materials related to this work can be found here.
If you want to perform the sampling strategy evaluation, please download the sampled and random folders and place them in the following path: /cti_datasets/AP_CTIs/.
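Before running the sampling strategy evaluation, it can help to confirm that both folders are in place. The snippet below is a minimal sketch of such a check; it assumes the path above is relative to the root of the cloned repository, and the helper name `missing_sampling_folders` is hypothetical.

```python
from pathlib import Path


def missing_sampling_folders(repo_root, names=("sampled", "random")):
    """Return the expected folders that are absent under cti_datasets/AP_CTIs.

    Hypothetical helper; assumes the dataset path is relative to repo_root.
    """
    base = Path(repo_root) / "cti_datasets" / "AP_CTIs"
    return [name for name in names if not (base / name).is_dir()]


# Run from the repository root (assumption).
missing = missing_sampling_folders(".")
if missing:
    print("Missing folders:", ", ".join(missing))
else:
    print("Both sampling folders found.")
```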
The Streamlit web app is available at: https://ap-bio.streamlit.app/.