ML_QSAR_Model

This repository, ML_QSAR_model, contains Python scripts for developing machine learning-based QSAR (Quantitative Structure-Activity Relationship) classification models.

Scripts Overview

1a_generate_ecfp.py
This script generates ECFP4 fingerprints (Extended-Connectivity Fingerprints with a diameter of 4 bonds, which corresponds to a Morgan radius of 2) using RDKit, an open-source cheminformatics toolkit.
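
As an illustration of what ECFP4 generation with RDKit can look like, here is a minimal sketch. It is not taken from 1a_generate_ecfp.py: the function name `ecfp4_from_smiles`, the SMILES input, and the 2048-bit length are assumptions made for the example.

```python
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator

# radius=2 gives a diameter of 4 bonds (ECFP4); fpSize=2048 is an assumed bit length.
_GENERATOR = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)


def ecfp4_from_smiles(smiles):
    """Return the ECFP4 bit vector as a NumPy array, or None if parsing fails."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # RDKit returns None for structures it cannot parse
        return None
    fp = _GENERATOR.GetFingerprint(mol)
    arr = np.zeros((fp.GetNumBits(),), dtype=np.uint8)
    DataStructs.ConvertToNumpyArray(fp, arr)  # copy the bits into the array
    return arr


fingerprint = ecfp4_from_smiles("CCO")  # ethanol, used only as a toy input
print(fingerprint.shape, int(fingerprint.sum()))
```

Returning None for unparsable structures lets the caller decide whether to skip or log them.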
1b_generate_ecfp_memory_efficient.py
A memory-efficient version of 1a_generate_ecfp.py that can handle multiple SDF files.
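
One common way to keep memory use low across several SDF files is to stream molecules one at a time instead of loading whole files. The sketch below shows that idea under stated assumptions: the generator name `iter_ecfp4_from_sdf` and its parameters are hypothetical, and the actual script may be organised differently.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator


def iter_ecfp4_from_sdf(paths, radius=2, n_bits=2048):
    """Lazily yield (path, fingerprint) pairs, one molecule at a time."""
    gen = rdFingerprintGenerator.GetMorganGenerator(radius=radius, fpSize=n_bits)
    for path in paths:
        with open(path, "rb") as handle:
            # ForwardSDMolSupplier reads sequentially and does not index the file,
            # so only the current molecule is held in memory.
            for mol in Chem.ForwardSDMolSupplier(handle):
                if mol is None:  # skip records RDKit could not parse
                    continue
                yield path, gen.GetFingerprint(mol)
```

Because this is a generator, results can be written to disk as they are produced, for example in batches, rather than accumulated in one large list.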
2_splitting_with_structural_similarity.py
This script uses RDKit to split the dataset, stratifying the sampling on both structural features and activity labels.
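
The README does not say how structure and activity are combined, so the following is only one possible reading, sketched under assumptions: fingerprints are clustered (here with scikit-learn's KMeans as a stand-in for whatever structural grouping the script uses), and the split is stratified on the combination of cluster and activity label. The function name `structure_and_label_split` and all parameter values are hypothetical.

```python
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split


def structure_and_label_split(X, y, n_clusters=4, test_size=0.2, seed=0):
    """Split sample indices, stratifying on (structural cluster, activity label)."""
    # Assumption: KMeans on the fingerprint matrix approximates structural groups.
    clusters = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    strata = [f"{c}_{label}" for c, label in zip(clusters, y)]
    counts = Counter(strata)
    # Strata with a single member cannot be stratified, so fall back to the label.
    # This sketch still assumes each fallback group ends up with at least two members.
    strata = np.array(
        [s if counts[s] >= 2 else f"rare_{label}" for s, label in zip(strata, y)]
    )
    indices = np.arange(len(y))
    return train_test_split(
        indices, test_size=test_size, stratify=strata, random_state=seed
    )
```

The function returns train and test index arrays, so the same split can be applied to fingerprints, labels, and compound identifiers alike.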
3_rf_hyperparameter_tuning.py
This script uses GridSearchCV from Scikit-learn to systematically evaluate a predefined grid of hyperparameter values for Random Forest classifiers. The best-scoring hyperparameter combination is selected by cross-validation.
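
A grid search of this kind can be sketched as follows. The synthetic data, the grid values, the ROC AUC scoring, and the 5-fold setup are all assumptions for illustration; the grid actually used in 3_rf_hyperparameter_tuning.py is not documented here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic stand-in for a fingerprint matrix and binary activity labels.
X, y = make_classification(n_samples=200, n_features=32, random_state=0)

# Hypothetical grid; every combination is evaluated.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [None, 10],
    "max_features": ["sqrt", "log2"],
}

rf_search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
rf_search.fit(X, y)

print(rf_search.best_params_)
print(round(rf_search.best_score_, 3))
```

After fitting, `best_params_` holds the winning combination and `best_score_` its mean cross-validated score.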
4_svm_hyperparameter_tuning.py
The counterpart of 3_rf_hyperparameter_tuning.py for Support Vector Machine (SVM) classifiers. It also uses GridSearchCV to find the best hyperparameters.
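
For SVMs, the relevant hyperparameters depend on the kernel, which GridSearchCV can express as a list of grids. The sketch below shows that pattern; the kernels, value ranges, and synthetic data are assumptions, not the settings of 4_svm_hyperparameter_tuning.py.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for a fingerprint matrix and binary activity labels.
X, y = make_classification(n_samples=200, n_features=32, random_state=0)

# Hypothetical grids: gamma is only searched where it matters (the RBF kernel).
param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10], "gamma": ["scale", 0.01]},
]

svm_search = GridSearchCV(SVC(), param_grid, scoring="roc_auc", cv=5)
svm_search.fit(X, y)

print(svm_search.best_params_)
```

Splitting the grid by kernel avoids fitting redundant combinations, such as a linear kernel with several gamma values.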
5_model.py
This script runs the model using optimized and default parameters, plots the average ROC AUC over cross-validation iterations, computes various evaluation metrics, and organizes the results into directories.
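
Averaging ROC curves over cross-validation folds is usually done by interpolating each fold's curve onto a common false-positive-rate grid. The sketch below illustrates that approach with assumed names (`mean_roc_over_folds`) and placeholder "tuned" values; plotting and the other metrics computed by 5_model.py are left out.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import StratifiedKFold


def mean_roc_over_folds(model, X, y, n_splits=5, seed=0):
    """Return (fpr_grid, mean_tpr, mean_auc, std_auc) across stratified folds."""
    fpr_grid = np.linspace(0.0, 1.0, 101)
    tprs, aucs = [], []
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in cv.split(X, y):
        fitted = clone(model).fit(X[train_idx], y[train_idx])
        scores = fitted.predict_proba(X[test_idx])[:, 1]
        fpr, tpr, _ = roc_curve(y[test_idx], scores)
        interp_tpr = np.interp(fpr_grid, fpr, tpr)  # put folds on a common grid
        interp_tpr[0] = 0.0
        tprs.append(interp_tpr)
        aucs.append(auc(fpr, tpr))
    mean_tpr = np.mean(tprs, axis=0)
    mean_tpr[-1] = 1.0
    return fpr_grid, mean_tpr, float(np.mean(aucs)), float(np.std(aucs))


X, y = make_classification(n_samples=300, n_features=32, random_state=0)
models = {
    "default": RandomForestClassifier(random_state=0),
    # Placeholder values standing in for the output of a tuning step.
    "tuned": RandomForestClassifier(n_estimators=200, max_depth=10, random_state=0),
}
for name, model in models.items():
    _, _, mean_auc, std_auc = mean_roc_over_folds(model, X, y)
    print(f"{name}: ROC AUC {mean_auc:.3f} +/- {std_auc:.3f}")
```

The returned grid and mean TPR can be passed directly to a plotting library to draw the averaged curve.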

Getting Started

To run these scripts, ensure that:

The required dependencies, including RDKit and Scikit-learn, are installed.
The correct data is being read from the appropriate directories.

Properly organizing your data and verifying the input paths will ensure the scripts run smoothly.
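
Both checks can be automated before running the scripts. The following is a small sketch using only the standard library; the function name `preflight` and the example directory names are hypothetical and should be replaced with the paths your copies of the scripts actually read.

```python
import importlib.util
from pathlib import Path


def preflight(required_modules=("rdkit", "sklearn"), data_dirs=()):
    """Return (missing_modules, missing_dirs) without importing anything heavy."""
    missing_modules = [
        name for name in required_modules if importlib.util.find_spec(name) is None
    ]
    missing_dirs = [str(d) for d in data_dirs if not Path(d).is_dir()]
    return missing_modules, missing_dirs


# "data/sdf" and "data/fingerprints" are placeholder directory names.
missing_modules, missing_dirs = preflight(data_dirs=["data/sdf", "data/fingerprints"])
if missing_modules:
    print("Missing Python packages:", ", ".join(missing_modules))
if missing_dirs:
    print("Missing data directories:", ", ".join(missing_dirs))
```

Note that Scikit-learn is imported under the module name `sklearn`, which is why that name appears in the check.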

