Skip to content

Latest commit

 

History

History
56 lines (43 loc) · 2.49 KB

README.md

File metadata and controls

56 lines (43 loc) · 2.49 KB

This repository contains the datasets and scripts pertaining to the publication

CatPred: A comprehensive framework for deep learning in vitro enzyme kinetic parameters kcat, Km and Ki

DOI

Citation CatPred biorxiv pre-print:
@article {Boorla2024.03.10.584340,
	author = {Veda Sheersh Boorla and Costas D. Maranas},
	title = {CatPred: A comprehensive framework for deep learning in vitro enzyme kinetic parameters kcat, Km and Ki},
	elocation-id = {2024.03.10.584340},
	year = {2024},
	doi = {10.1101/2024.03.10.584340},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2024/03/26/2024.03.10.584340},
	eprint = {https://www.biorxiv.org/content/early/2024/03/26/2024.03.10.584340.full.pdf},
	journal = {bioRxiv}
}
Table of contents

CatPred-DB datasets

Datasets referenced in the publication can be found in these directories

./datasets/
├── processed               		# processed datasets one each for kcat, Km and Ki
├── splits                  		# training/validation/test dataset splits for kcat, Km and Ki
├── all_natural_metabolite_names.json 		# compiled names of natural metabolites from BRENDA
├── metabolite_inchi_smiles_brenda_pubchem.tsv 	# compiled list of inchi and smiles strings for brenda molecules

CatPred-DB scripts

Custom scripts used for processing, splitting and analysis can be found in these directories

./scripts/
├── processing               	# for processing brenda/sabio datasets and merging them
├── pdb                  		# for creating files required for datasets including structural information (pdb)
├── add_pdb_paths.py 		# adds pdb paths to a dataset file
├── count_natural.py 		# counts natural vs. non-natural entries
├── create_pdbrecords.py 		# create pdb records files to go along with datasets required for training
├── make_splits.py 			# create train/valid/test splits from processed datasets

License

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree.