Commit a3d2f62

Merge pull request #51 from MobleyLab/nieldev_mobley
Incorporate minor edits of new cyclodextrin material into nieldev branch
2 parents 54e4ef0 + 8304ee2 commit a3d2f62

File tree

5 files changed

+95
-50
lines changed

LICENSE

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
Copyright 2017, David L. Mobley and Michael K. Gilson
1+
Copyright 2017, David L. Mobley, Germano Heinzelmann, Niel M. Henriksen and Michael K. Gilson
22

33
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
44

README.md

Lines changed: 13 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
# Benchmark sets for free energy calculations
22

3-
This repository relates to the *perpetual review* ([definition](https://arxiv.org/abs/1502.01329)) paper called "[Predicting binding free energies: Frontiers and benchmarks](https://github.com/MobleyLab/benchmarksets/blob/master/paper/benchmarkset.pdf)" by David L. Mobley and Michael K. Gilson.
3+
This repository relates to the *perpetual review* ([definition](https://arxiv.org/abs/1502.01329)) paper called "[Predicting binding free energies: Frontiers and benchmarks](https://github.com/MobleyLab/benchmarksets/blob/master/paper/benchmarkset.pdf)" by David L. Mobley, Germano Heinzelmann, Niel M. Henriksen, and Michael K. Gilson.
44
The repository's focus is benchmark sets for binding free energy calculations, including the perpetual review paper, but also supporting files and other materials relating to free energy benchmarks.
55
Thus, the repository includes not only the perpetual review paper but also further discussion, datasets, and (hopefully ultimately) standards for datasets and data deposition.
66

@@ -45,13 +45,14 @@ Currently proposed benchmark sets are detailed in [the paper](https://github.com
4545
* Host guest systems
4646
* CB7
4747
* Gibb deep cavity cavitands (GDCCs) OA and TEMOA
48+
* Cyclodextrins (alpha and beta)
4849
* Lysozyme model binding sites
4950
* apolar L99A
5051
* polar L99A/M102Q
52+
* Bromodomain BRD4-1
5153

5254
Other near-term candidates include:
5355
* Thrombin
54-
* Bromodomains
5556
* Suggest and vote on your favorites via a feature request below
5657

5758
Community involvement is needed to pick and advance the best benchmarks.
@@ -77,6 +78,7 @@ We also welcome contributions to the material which is already here to extend it
7778
## Authors
7879
- David L. Mobley (UCI)
7980
- Germano Heinzelmann (Universidade Federal de Santa Catarina)
81+
- Niel M. Henriksen (UCSD)
8082
- Michael K. Gilson (UCSD)
8183

8284
Your name, too, can go here if you help us substantially revise/extend the paper.
@@ -88,7 +90,7 @@ We want to thank the following people who contributed to this repository and the
8890

8991
- David Slochower (UCSD, Gilson lab): Grammar corrections and improved table formatting
9092
- Nascimento (in a comment on biorxiv): Highlighted PDB code error for n-phenylglycinonitrile
91-
- Jian Yin (UCSD, Gilson lab): Provided host-guest structures and input files for the host-guest sets described in the paper
93+
- Jian Yin (UCSD, Gilson lab): Provided host-guest structures and input files for the CB7 and GDCC host-guest sets described in the paper
9294

9395
Please note that GitHub's automatic "contributors" list does not provide a full accounting of everyone contributing to this work, as some contributions have been received by e-mail or other mechanisms.
9496

@@ -105,8 +107,14 @@ Please note that GitHub's automatic "contributors" list does not provide a full
105107
- v1.2 ([10.5281/zenodo.839047](http://doi.org/10.5281/zenodo.839047)): Addition of bromodomain BRD4(1) test cases as a new ``soft'' benchmark, with help from Germano Heinzelmann. Addition of Heinzelmann as an author. Addition of files for BRD4(1) benchmark. Removed bromodomain material from future benchmarks in view of its presence now as a benchmark system.
106108

107109
## Changes not yet in a release
110+
- Addition of cyclodextrin benchmarks to the data and to the paper; removal of most of the cyclodextrin material from future benchmarks. Addition of Niel Henriksen as an author based on his work on this.
108111

109112
## Manifest
110113

111-
* paper: Provides LaTeX source files and final PDF for the current version of the manuscript (reformatted from the version submitted to Ann. Rev. and with 2D structures added to the tables); images, etc. are also available in sub-directories, as is the supporting information.
112-
* input_files: Host-guest structures and simulation input files for the host-guest benchmark sets proposed in the paper (provided by Jian Yin from the Gilson lab)
114+
* paper: Provides LaTeX source files and final PDF for the current version of the manuscript (reformatted and expanded from the version submitted to Ann. Rev. and with 2D structures added to the tables); images, etc. are also available in sub-directories, as is the supporting information.
115+
* input_files: Ultimately to include structures and simulation input files for all of the benchmark systems present as well as (we hope) gold standard calculated values for these files. Currently this includes:
116+
* README.md: A more extensive document describing the files present
117+
* BRD4 structures and simulation input files from Germano Heinzelmann
118+
* CB7 structures and simulation input files from Jian Yin (Gilson lab)
119+
* GDCC structures and simulation input files from Jian Yin (Gilson lab)
120+
* Cyclodextrin structures and simulation input files from Niel Henriksen (Gilson lab)

input_files/README.md

Lines changed: 47 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,60 @@
1-
# Benchmark Set Input Files
1+
# Benchmark Set Input Files and Supporting Data
2+
3+
This directory (and its subdirectories) provides structure and simulation input files for the benchmark sets proposed in the associated perpetual review paper and, in some cases, data on additional supplementary compounds.
4+
5+
This file documents what files are present here and how they were generated.
6+
7+
## Manifest
8+
- `BRD4`: BRD4-1 benchmarks as proposed in two Tables in the paper; provides its own `README.md` detailing its organization/contents/provenance.
9+
- `cb7-set1` and `cb7-set2`: Proposed CB7 benchmark sets, with organization/contents/provenance information below.
10+
- `cd-set1` and `cd-set2`: Proposed cyclodextrin benchmark sets (the first of which is on alpha cyclodextrin and the second on beta cyclodextrin), with organization/contents/provenance information below. **Additional, supplementary guests are also provided.** These also contain a machine-parseable `README.md` file with experimental binding free energies and enthalpies, with references.
11+
- `gdcc-set1` and `gdcc-set2`: Proposed Gibb deep cavity cavitand benchmark sets, with organization/contents/provenance information below.
212

313
## File Descriptions
4-
This set of files comprises PDB, sdf, and mol2 files for the free hosts and guests, as well as AMBER prmtop/rst7 format input files for the solvated and equilibrated host-guest complexes. The guest compounds in each set are named according to the compound ID listed in Tables 1-6 in the associated paper [1]. For instance, compound p-toluidine is located in the cb7-set2 subdirectory and named guest-9 because it is included in CB7 Set 2, and its ID in the paper is 9. The prmtop/rst7 files are named in the same way, except that both the host identifier name and guest ID are used for the filename. For cd-set1 and cd-set2, there are two sets of prmtop/rst7 files, one for each possible orientation of the guest in the cyclodextrin cavity. In addition, the cyclodextrin datasets also include files with an 's' character prior to the guest ID number (e.g., bcd-s15.pdb), which indicates that these are supplemental guests not listed in the associated paper which could be of additional interest.
14+
This set of files comprises PDB, sdf, and mol2 files for the free hosts and guests, as well as AMBER prmtop/rst7 format input files for the solvated and equilibrated host-guest complexes.
15+
The guest compounds in each set are named according to the compound ID listed in the corresponding Tables in the associated paper [1].
16+
For instance, compound p-toluidine is located in the cb7-set2 subdirectory and named guest-9 because it is included in CB7 Set 2, and its ID in the paper is 9.
17+
The prmtop/rst7 files are named in the same way, except that both the host identifier name and guest ID are used for the filename.
18+
For cd-set1 and cd-set2, there are two sets of prmtop/rst7 files, one for each possible orientation of the guest in the cyclodextrin cavity.
19+
In addition, the cyclodextrin datasets also include files with an 's' character prior to the guest ID number (e.g., bcd-s15.pdb), which indicates that these are supplemental guests not listed in the associated paper which could be of additional interest.
520

621
## CB7 Methods
7-
The structures of the free CB7 host were initially obtained from the crystal structure [2] while all other unbound guest structures were built manually. The structues were then QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussin09. The CB7 molecule has zero net charge, while the protonation states of the guests were predicted with the pKa plugin in the Marvin suite of programs [3]. Guest guest-18 in cb7-set1 is a special case, because it was predicted to have the protonated and unprotonated forms coexisting at the experimental pH value (4.74) [4] with a nearly 1:1 ratio. Therefore, files of both forms are provided, with guest-18 as the protonated form of the guest and guest-18b as the unprotonated form. For the simulation files, bonded and Lennard-Jones parameters were obtained from the general AMBER force field (GAFF v1.7) [5]. Partial charges for each atom were generated using the RESP procedure [6], as implemented in the Antechamber program [7], by fitting to electrostatic potentials grids generated during the QM minimization. The starting bound configuration of each host-guest pair was generated by docking the guests into the hosts with MOE [8]. The binding pose was then solvated in a cubic box with 1500 TIP3P water molecules with sodium or chloride counterions added only for neutralization. Counterions were modeled with the TIP3P-specific sodium parameters of Joung and Cheatham [9]. After an equilibration phase, an NVT simulation of 2 ns was carried out, and the frame with the most populated configuration, determined via clustering, was selected as the simulation input file.
22+
The structures of the free CB7 host were initially obtained from the crystal structure [2] while all other unbound guest structures were built manually.
23+
The structures were then QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussian09.
24+
The CB7 molecule has zero net charge, while the protonation states of the guests were predicted with the pKa plugin in the Marvin suite of programs [3].
25+
Guest guest-18 in cb7-set1 is a special case, because it was predicted to have the protonated and unprotonated forms coexisting at the experimental pH value (4.74) [4] with a nearly 1:1 ratio.
26+
Therefore, files of both forms are provided, with guest-18 as the protonated form of the guest and guest-18b as the unprotonated form.
27+
For the simulation files, bonded and Lennard-Jones parameters were obtained from the general AMBER force field (GAFF v1.7) [5].
28+
Partial charges for each atom were generated using the RESP procedure [6], as implemented in the Antechamber program [7], by fitting to electrostatic potentials grids generated during the QM minimization.
29+
The starting bound configuration of each host-guest pair was generated by docking the guests into the hosts with MOE [8].
30+
The binding pose was then solvated in a cubic box with 1500 TIP3P water molecules with sodium or chloride counterions added only for neutralization.
31+
Counterions were modeled with the TIP3P-specific sodium parameters of Joung and Cheatham [9].
32+
After an equilibration phase, an NVT simulation of 2 ns was carried out, and the frame with the most populated configuration, determined via clustering, was selected as the simulation input file.
833

934

1035
## OA/TEMOA (GDCC) Methods
11-
The structures of the free hosts OA and TEMOA, as well as of all unbound guests, were built manually and then QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussin09. The OA and TEMOA hosts both were assigned net charges of -8au, based on the pH at which the experiments were conducted (9.2 and 11.5) [10, 11]. The protonation states of guests were predicted with the pKa plugin in the Marvin suite of programs [3]. For the simulation files, bonded and LJ force field parameters were taken from GAFF v1.7 and partial charges were determined using the RESP approach, in identical fashion to the CB7 method. The starting bound configuration of each host-guest pair was generated by docking the guests into the hosts with MOE [8]. The binding pose was then solvated in a cubic box with 2100 TIP3P water molecules with sodium or chloride counterions added only for neutralization. Counterions were modeled with the TIP3P-specific sodium parameters of Joung and Cheatham [9]. After an equilibration phase, an NVT simulation of 2 ns was carried out, and the frame with the most populated configuration, determined via clustering, was selected as the simulation input file.
36+
The structures of the free hosts OA and TEMOA, as well as of all unbound guests, were built manually and then QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussian09.
37+
The OA and TEMOA hosts both were assigned net charges of -8au, based on the pH at which the experiments were conducted (9.2 and 11.5) [10, 11].
38+
The protonation states of guests were predicted with the pKa plugin in the Marvin suite of programs [3].
39+
For the simulation files, bonded and LJ force field parameters were taken from GAFF v1.7 and partial charges were determined using the RESP approach, in identical fashion to the CB7 method.
40+
The starting bound configuration of each host-guest pair was generated by docking the guests into the hosts with MOE [8].
41+
The binding pose was then solvated in a cubic box with 2100 TIP3P water molecules with sodium or chloride counterions added only for neutralization.
42+
Counterions were modeled with the TIP3P-specific sodium parameters of Joung and Cheatham [9].
43+
After an equilibration phase, an NVT simulation of 2 ns was carried out, and the frame with the most populated configuration, determined via clustering, was selected as the simulation input file.
1244

1345

1446
## CD Methods
15-
The stuctures of unbound alpha-cyclodextrin (aCD) and beta-cyclodextrin (bCD), as well as all guests, were built manually. Protonation states followed what was reported by Rekharsky et al. [12] at pH 6.9. The guest molecules were QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussian09. Partial charges, LJ paramters, and bonded parameters for the CD molecules were taken from the q4md-CD force field by Cézard et al. [13]. Guest partial charges were derived using the RESP method implemented in the R.E.D. Server tool [14], while LJ and bonded parameters were taken from GAFF v1.7. The AMBER simulation files consist of the host-guest complex, 1500 TIP3P waters, and three Na+ and Cl- ions in addition to any counterions required for neutralization. This roughly corresponds to the ionic strength of the 50 mmol phosphate buffer used in experiment. The solvated systems were equilibrated in the NPT ensemble with light (0.1 kcal/mol) positional restraints on the host and guest atoms. The final conformation of this equilibration step is provided here. Unrestrained equilibration and clustering was not performed for the cyclodextrin sets, in contrast to the CB7 and GDCC sets, because in some cases the guest binds weakly enough that it could leave the binding cavity for significant periods of time. To account for the two possible orientations of the guest within the CD cavity, simulation files with the '-p' suffix indicate that the guest is bound with the polar functional group oriented out of the primary (narrow) face of the CD, whereas the '-s' suffix indicates the guest polar functional group is oriented out of the secondary (wider) face of the CD.
47+
The structures of unbound alpha-cyclodextrin (aCD) and beta-cyclodextrin (bCD), as well as all guests, were built manually.
48+
Protonation states followed what was reported by Rekharsky et al. [12] at pH 6.9.
49+
The guest molecules were QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussian09.
50+
Partial charges, LJ parameters, and bonded parameters for the CD molecules were taken from the q4md-CD force field by Cézard et al. [13].
51+
Guest partial charges were derived using the RESP method implemented in the R.E.D. Server tool [14], while LJ and bonded parameters were taken from GAFF v1.7.
52+
The AMBER simulation files consist of the host-guest complex, 1500 TIP3P waters, and three Na+ and Cl- ions in addition to any counterions required for neutralization.
53+
This roughly corresponds to the ionic strength of the 50 mM phosphate buffer used in experiment.
54+
The solvated systems were equilibrated in the NPT ensemble with light (0.1 kcal/mol) positional restraints on the host and guest atoms.
55+
The final conformation of this equilibration step is provided here.
56+
Unrestrained equilibration and clustering were not performed for the cyclodextrin sets, in contrast to the CB7 and GDCC sets, because in some cases the guest binds weakly enough that it could leave the binding cavity for significant periods of time.
57+
To account for the two possible orientations of the guest within the CD cavity, simulation files with the '-p' suffix indicate that the guest is bound with the polar functional group oriented out of the primary (narrow) face of the CD, whereas the '-s' suffix indicates the guest polar functional group is oriented out of the secondary (wider) face of the CD.
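
The salt content described in the CD Methods text (1500 TIP3P waters plus three Na+ and Cl- ions) can be converted into an approximate concentration. The sketch below is a rough check, not part of the repository: it assumes "three Na+ and Cl- ions" means three of each, ignores any neutralizing counterions and the host and guest themselves, and uses the molar mass of water (18.015 g/mol) to express the result as a molality. The function name `salt_molality` is hypothetical.

```python
# Rough estimate of the added-salt concentration in a CD simulation box.
# Assumptions (not stated in the source): three Na+ AND three Cl- ions,
# neutralizing counterions ignored, water molar mass of 18.015 g/mol.

WATER_MOLAR_MASS_G_PER_MOL = 18.015


def salt_molality(n_ion_pairs: int, n_waters: int) -> float:
    """Return mol of 1:1 salt per kg of water for the given particle counts."""
    if n_waters <= 0:
        raise ValueError("n_waters must be positive")
    # Counting particles per box is equivalent to counting moles per mole of
    # boxes, so Avogadro's number cancels out of the ratio.
    water_mass_kg = n_waters * WATER_MOLAR_MASS_G_PER_MOL / 1000.0
    return n_ion_pairs / water_mass_kg


if __name__ == "__main__":
    molality = salt_molality(n_ion_pairs=3, n_waters=1500)
    # For a 1:1 salt, the ionic strength equals the salt molality.
    print(f"Approximate added-salt ionic strength: {molality:.3f} mol/kg")
```

Under these assumptions the added salt comes to roughly 0.11 mol/kg, which is the quantity the text compares with the experimental phosphate buffer.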
1658

1759
## BRD4
1860
For information on the BRD4 benchmark, see the associated `README.md` file in the BRD4 subdirectory.
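
The file naming conventions described in this README (host or `guest` prefix, guest ID, an `s` before the ID for supplemental guests, a trailing letter as in `guest-18b`, and `-p`/`-s` orientation suffixes for the cyclodextrin sets) can be sketched as a small parser. This is a minimal illustration, not code from the repository: the exact pattern is an assumption inferred from the examples `guest-9`, `guest-18b`, and `bcd-s15.pdb`, and names such as `bcd-s15-p.prmtop` used below are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed pattern: <prefix>-[s]<id>[variant][-p|-s].<ext>
# Inferred from the README examples; real files may deviate from it.
_NAME_PATTERN = re.compile(
    r"(?P<prefix>[A-Za-z0-9]+)-"            # host identifier or "guest"
    r"(?P<supp>s?)(?P<gid>\d+)"             # optional 's' marks a supplemental guest
    r"(?P<variant>[a-z]?)"                  # e.g. 'b' in guest-18b (unprotonated form)
    r"(?:-(?P<orient>[ps]))?"               # CD orientation: primary or secondary face
    r"\.(?P<ext>\w+)"                       # file extension
)

_ORIENTATIONS = {"p": "primary", "s": "secondary"}


@dataclass(frozen=True)
class BenchmarkFile:
    prefix: str
    guest_id: int
    supplemental: bool
    variant: str
    orientation: Optional[str]
    extension: str


def parse_name(filename: str) -> BenchmarkFile:
    """Split a benchmark file name into its (assumed) components."""
    match = _NAME_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"unrecognized file name: {filename!r}")
    orient = match.group("orient")
    return BenchmarkFile(
        prefix=match.group("prefix"),
        guest_id=int(match.group("gid")),
        supplemental=match.group("supp") == "s",
        variant=match.group("variant"),
        orientation=_ORIENTATIONS[orient] if orient else None,
        extension=match.group("ext"),
    )


if __name__ == "__main__":
    # "bcd-s15.pdb" appears in the README; the other names are hypothetical.
    for name in ("bcd-s15.pdb", "guest-18b.mol2", "bcd-s15-p.prmtop"):
        print(name, "->", parse_name(name))
```

A parser like this could be used to group the two cyclodextrin orientations of the same guest, or to separate supplemental guests from those listed in the paper, provided the assumed pattern is first checked against the actual directory contents.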

paper/benchmarkset.pdf

-384 Bytes
Binary file not shown.
