A Benchmarking Platform for Realistic And Practical Inverse Molecular Design

aspuru-guzik-group/Tartarus

Tartarus: The Next Generation of Benchmarks for Inverse Molecular Design

Installing XTB and CREST

The organic photovoltaics and organic emitter design tasks require XTB, a program package of semi-empirical quantum mechanical methods, and CREST, a utility built on xtb for sampling molecular conformers.

The binaries are provided here. Place them in your home directory, then make the software available by setting the following environment variables:

export XTBHOME=${HOME}/xtb
export PATH=${PATH}:${XTBHOME}/bin
export XTBPATH=${XTBHOME}/share/xtb:${XTBHOME}:${HOME}
export MANPATH=${MANPATH}:${XTBHOME}/share/man
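After setting these variables, it can help to confirm that the executables are actually visible before starting a long benchmark run. The following is a minimal sketch and not part of Tartarus: `missing_executables` is a hypothetical helper, and the binary names `xtb` and `crest` are assumed to match the provided binaries.

```python
import shutil


def missing_executables(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumption: the binaries are named "xtb" and "crest"; adjust if yours differ.
    missing = missing_executables(["xtb", "crest"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("xtb and crest are both available.")
```

If either name is reported as missing, re-check that `XTBHOME` points at the directory where the binaries were placed.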

Installing SMINA

The task of designing molecules that dock to proteins requires SMINA, a program for calculating docking scores of ligands against solved protein structures.

Datasets

All datasets are found in the datasets directory.

| Task | Dataset name | Number of SMILES | Columns in file |
|---|---|---|---|
| Designing OPV | hce.csv | 24,953 | HOMO-LUMO gap (↑), LUMO (↓), Dipole (↑), Combined objective (↑) |
| Designing OPV | unbiased_hce.csv | 1,000 | HOMO-LUMO gap (↑), LUMO (↓), Dipole (↑), Combined objective (↑) |
| Designing emitters | gdb13.csv | 403,947 | Singlet-triplet gap (↓), Oscillator strength (↑), Multi-objective (↑), Time (s) |
| Designing drugs | docking.csv | 152,296 | 1SYH (↓), 6Y2F (↑), 4LDE (↑), Time (s) |
| Designing drugs | reactivity.csv | 60,828 | |
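To check a downloaded file against the dataset listing, the header and row count can be read directly. This is a small sketch using only the standard library: `summarize_csv` is a hypothetical helper, and it assumes each file is a comma-separated file with a single header row.

```python
import csv


def summarize_csv(path):
    """Return (column_names, number_of_data_rows) for a CSV file with a header row."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])        # first row holds the column names
        n_rows = sum(1 for _ in reader)  # remaining rows are data
    return header, n_rows
```

For example, `summarize_csv('./datasets/hce.csv')` returns the column names and the number of molecules in that file, which can be compared with the entries above.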

Getting started

Designing organic photovoltaics

To use the evaluation function, either run the full xtb calculation from the pce module or use the surrogate model with pretrained weights.

import pandas as pd
data = pd.read_csv('./datasets/hce.csv')   # or ./datasets/unbiased_hce.csv
smiles = data['smiles'].tolist()
smi = smiles[0]

## use full xtb calculation in pce module
from tartarus import pce
dipole, hl_gap, lumo, combined, pce_1, pce_2, sas = pce.get_properties(smi)

## use pretrained surrogate model
dipole, hl_gap, lumo, combined = pce.get_surrogate_properties(smi)
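When scoring many molecules, a single failed calculation should usually not abort the whole run. The sketch below wraps any per-molecule evaluation function. `evaluate_batch` is a hypothetical helper, not part of Tartarus, and it assumes that a failed evaluation surfaces as a Python exception.

```python
def evaluate_batch(smiles_list, evaluator):
    """Apply `evaluator` to each SMILES string; store None where it raises."""
    results = {}
    for smi in smiles_list:
        try:
            results[smi] = evaluator(smi)
        except Exception as err:
            # Assumption: failures raise; record them and keep going.
            print(f"Evaluation failed for {smi!r}: {err}")
            results[smi] = None
    return results
```

With the snippet above, this could be called as `evaluate_batch(smiles[:10], pce.get_surrogate_properties)`, and the same wrapper applies to the evaluation functions of the other tasks.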

Designing Organic Emitters

import pandas as pd
data = pd.read_csv('./datasets/gdb13.csv')
smiles = data['smiles'].tolist()
smi = smiles[0]

## use full xtb calculation in tadf module
from tartarus import tadf
st, osc, combined = tadf.get_properties(smi)

Design of Drug Molecules

import pandas as pd
data = pd.read_csv('./datasets/docking.csv')
smiles = data['smiles'].tolist()
smi = smiles[0]

## calculate the docking score against each protein
from tartarus import docking
score_1syh = docking.get_1syh_score(smi)
score_6y2f = docking.get_6y2f_score(smi)
score_4lde = docking.get_4lde_score(smi)

Design of Chemical Reaction Substrates

import pandas as pd
data = pd.read_csv('./datasets/reactivity.csv')
smiles = data['smiles'].tolist()
smi = smiles[0]

## calculate reaction properties with the reactivity module
from tartarus import reactivity
activation_energy, reaction_energy, sa_score = reactivity.get_properties(smi)
