The ATOM Modeling PipeLine (AMPL; https://github.com/ATOMconsortium/AMPL) is an open-source, modular, extensible software pipeline for building and sharing models to advance in silico drug discovery.
This repository contains a collection of experimental AMPL-COLAB tutorial notebooks.
- Tutorial-00: Basic COLAB tutorial. For all the COLAB tutorials, click on the tutorial link, and then click on the "Open in Colab" banner. You can open and run the notebook from the browser. To keep your edits, save a copy of the notebook to your Google Drive. By default, Google COLAB saves notebook files under the "My Drive > Colab Notebooks" folder.
The data that we collect for modeling is small-molecule/drug binding data. The following links will introduce some of the concepts and outcome measures related to this topic:
- https://en.wikipedia.org/wiki/IC50
- https://bpspubs.onlinelibrary.wiley.com/doi/pdfdirect/10.1111/j.1476-5381.2009.00604.x
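IC50 values are often reported on a negative logarithmic scale as pIC50, the measure that Tutorial-07 predicts. The snippet below is a minimal sketch of that conversion, assuming the IC50 is given in nanomolar; the function name `ic50_to_pic50` is hypothetical and is not part of AMPL.

```python
import math


def ic50_to_pic50(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    # 1 nM = 1e-9 M, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm)
    return 9.0 - math.log10(ic50_nm)


# An IC50 of 1000 nM (1 micromolar) corresponds to a pIC50 of about 6.
print(round(ic50_to_pic50(1000.0), 3))
```

Higher pIC50 values indicate more potent compounds, because a lower concentration is needed to reach 50% inhibition.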
- Tutorial-01: (Mode: AMPL_GPU; Time: ~ 4 minutes) This COLAB notebook uses AMPL for data curation, exploratory data analysis (EDA), and clustering of ExCAPE-DB (https://solr.ideaconsult.net/search/excape/) data for the HTR3A protein (modified from Dr. Jonathan Allen's notebook).
- Tutorial-02: (Mode: AMPL_GPU; Time: ~ 4 minutes) This COLAB notebook uses AMPL to curate HTR3A protein data from ExCAPE-DB (https://solr.ideaconsult.net/search/excape/) (modified from Dr. Jonathan Allen's notebook).
- Tutorial-03: (Mode: AMPL_GPU; Time: ~ 4 minutes) This COLAB notebook uses AMPL to curate HTR3A protein data from Drug Target Commons (DTC; https://drugtargetcommons.fimm.fi/) (modified from Dr. Jonathan Allen's notebook).
- Tutorial-04: (Mode: AMPL_GPU; Time: ~ 4 minutes) This COLAB notebook uses AMPL to upload datasets (small-molecule activity data from ChEMBL), clean and merge them, and perform basic exploratory data analysis.
- Tutorial-05: (Time: ~ 2 minutes) Simple supervised learning example. AMPL reads the public data (117 chemical compounds), curates it, fits a Random Forest model to predict solubility, and tests the model. For additional information on the dataset, see this publication: https://pubmed.ncbi.nlm.nih.gov/15154768/
- Tutorial-06: (Mode: AMPL_GPU; Time: ~ 2 minutes) This AMPL-COLAB notebook repeats the Tutorial-01 example, except that AMPL runs in GPU mode (AMPL_GPU).
- Tutorial-07: (Mode: AMPL_GPU; Time: ~ 18 minutes) This COLAB notebook uses AMPL to predict binding affinities (pIC50 values) of ligands that could bind to the human sodium channel protein type 5 subunit alpha (gene: SCN5A) using a graph convolutional network model. The ChEMBL database is the source of the binding affinity (pIC50) data.
6. Exploring AMPL functions for saving models and loading prebuilt models for prediction (coming soon)
This notebook loads a model from a published work (https://arxiv.org/abs/2002.12541) and makes an inference with an example dataset (https://github.com/ravichas/AMPL-Tutorial/blob/master/BSEP_modeling.ipynb).
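The "clean and merge" step mentioned for the data-curation tutorials can be illustrated with a small standard-library sketch. It does not use AMPL's API; the record layout (`compound_id`, `pIC50`), the function names, and the toy values are all hypothetical and chosen only to show the idea of dropping incomplete records and pooling measurements from two sources.

```python
from collections import defaultdict
from statistics import mean


def clean(records):
    """Keep only records with a compound ID and a numeric activity value."""
    kept = []
    for rec in records:
        cid = (rec.get("compound_id") or "").strip()
        value = rec.get("pIC50")
        if cid and isinstance(value, (int, float)):
            kept.append({"compound_id": cid, "pIC50": float(value)})
    return kept


def merge(*datasets):
    """Pool cleaned datasets and average duplicate measurements per compound."""
    grouped = defaultdict(list)
    for dataset in datasets:
        for rec in dataset:
            grouped[rec["compound_id"]].append(rec["pIC50"])
    return {cid: mean(values) for cid, values in grouped.items()}


# Hypothetical toy data standing in for two activity sources.
source_a = [
    {"compound_id": "CPD-1", "pIC50": 6.0},
    {"compound_id": "CPD-2", "pIC50": None},  # missing value, dropped by clean()
]
source_b = [
    {"compound_id": "CPD-1 ", "pIC50": 7.0},  # trailing space, normalized by clean()
    {"compound_id": "CPD-3", "pIC50": 5.5},
]

merged = merge(clean(source_a), clean(source_b))
print(merged)
```

Averaging duplicates is only one possible policy; a real curation workflow may instead flag compounds whose replicate values disagree.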
- DeepChem: https://deepchem.io/
- RDKit: https://www.rdkit.org/
- ChEMBL: https://www.ebi.ac.uk/chembl/
- PubChem: https://pubchem.ncbi.nlm.nih.gov/
- Drug Target Commons (DTC): https://drugtargetcommons.fimm.fi/
- ExCAPE-DB: https://solr.ideaconsult.net/search/excape/
- DrugBank: https://go.drugbank.com/
- Amanda Paulson
- Ben Madej
- Da Shi
- Hiran Ranganathan
- Jonathan Allen
- Kevin McLoughlin
- Ya Ju Fan