|
1 |
| -# Benchmark Set Input Files |
| 1 | +# Benchmark Set Input Files and Supporting Data |
| 2 | + |
| 3 | +This directory (and its subdirectories) provides structure and simulation input files for the benchmark sets proposed in the associated perpetual review paper and, in some cases, data on additional supplementary compounds. |
| 4 | + |
| 5 | +This file documents what files are present here and how they were generated. |
| 6 | + |
| 7 | +## Manifest |
| 8 | +- `BRD4`: BRD4-1 benchmarks as proposed in two Tables in the paper; provides its own `README.md` detailing its organization/contents/provenance. |
| 9 | +- `cb7-set1` and `cb7-set2`: Proposed CB7 benchmark sets, with organization/contents/provenance information below. |
| 10 | +- `cd-set1` and `cd-set2`: Proposed cyclodextrin benchmark sets (the first of which is on alpha cyclodextrin and the second on beta cyclodextrin), with organization/contents/provenance information below. **Additional, supplementary guests are also provided.** These also contain a machine-parseable `README.md` file with experimental binding free energies and enthalpies, with references. |
| 11 | +- `gdcc-set1` and `gdcc-set2`: Proposed Gibb deep cavity cavitand benchmark sets, with organization/contents/provenance information below. |
2 | 12 |
|
3 | 13 | ## File Descriptions
|
4 |
| -This set of files comprises PDB, sdf, and mol2 files for the free hosts and guests, as well as AMBER prmtop/rst7 format input files for the solvated and equilibrated host-guest complexes. The guest compounds in each set are named according to the compound ID listed in Tables 1-6 in the associated paper [1]. For instance, compound p-toluidine is located in the cb7-set2 subdirectory and named guest-9 because it is included in CB7 Set 2, and its ID in the paper is 9. The prmtop/rst7 files are named in the same way, except that both the host identifier name and guest ID are used for the filename. For cd-set1 and cd-set2, there are two sets of prmtop/rst7 files, one for each possible orientation of the guest in the cyclodextrin cavity. In addition, the cyclodextrin datasets also include files with an 's' character prior to the guest ID number (e.g., bcd-s15.pdb), which indicates that these are supplemental guests not listed in the associated paper which could be of additional interest. |
| 14 | +This set of files comprises PDB, sdf, and mol2 files for the free hosts and guests, as well as AMBER prmtop/rst7 format input files for the solvated and equilibrated host-guest complexes. |
| 15 | +The guest compounds in each set are named according to the compound ID listed in the corresponding Tables in the associated paper [1]. |
| 16 | +For instance, compound p-toluidine is located in the cb7-set2 subdirectory and named guest-9 because it is included in CB7 Set 2, and its ID in the paper is 9. |
| 17 | +The prmtop/rst7 files are named in the same way, except that both the host identifier name and guest ID are used for the filename. |
| 18 | +For cd-set1 and cd-set2, there are two sets of prmtop/rst7 files, one for each possible orientation of the guest in the cyclodextrin cavity. |
| 19 | +In addition, the cyclodextrin datasets include files with an 's' character before the guest ID number (e.g., bcd-s15.pdb); these are supplemental guests that are not listed in the associated paper but could be of additional interest. |
5 | 20 |
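The naming convention described above can be sketched as a small parser. This is a minimal illustration, not a tool shipped with the repository: the exact filename patterns (for example `bcd-s15-p.prmtop` for a complex with an orientation suffix) are assumptions inferred from the description, and `BenchmarkFile` and `parse_name` are hypothetical names introduced here.

```python
import re
from typing import NamedTuple, Optional


class BenchmarkFile(NamedTuple):
    host: Optional[str]         # e.g. "bcd"; None for free-guest files
    guest_id: str               # ID from the paper's tables, e.g. "9" or "18b"
    supplemental: bool          # True when the ID carries the 's' prefix
    orientation: Optional[str]  # "p" or "s" for cyclodextrin complexes, else None
    extension: str              # "pdb", "sdf", "mol2", "prmtop", "rst7", ...


# Assumed pattern for free guests, e.g. "guest-9.mol2" or "guest-18b.pdb".
_GUEST_RE = re.compile(r"^guest-(?P<id>\d+[a-z]?)\.(?P<ext>\w+)$")

# Assumed pattern for host-prefixed files, e.g. "bcd-s15.pdb" or "bcd-s15-p.prmtop".
_HOST_RE = re.compile(
    r"^(?P<host>[a-z0-9]+)-(?P<supp>s?)(?P<id>\d+[a-z]?)"
    r"(?:-(?P<orient>[ps]))?\.(?P<ext>\w+)$"
)


def parse_name(filename: str) -> BenchmarkFile:
    """Split a benchmark filename into its parts, under the assumed patterns."""
    match = _GUEST_RE.match(filename)
    if match:
        return BenchmarkFile(None, match["id"], False, None, match["ext"])
    match = _HOST_RE.match(filename)
    if match:
        return BenchmarkFile(
            match["host"],
            match["id"],
            match["supp"] == "s",
            match["orient"],
            match["ext"],
        )
    raise ValueError(f"unrecognized benchmark filename: {filename!r}")
```

The free-guest pattern is tried first so that `guest-9.mol2` is not misread as a host named `guest`. If the actual filenames differ from the assumed patterns, only the two regular expressions need to change.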
|
6 | 21 | ## CB7 Methods
|
7 |
| -The structures of the free CB7 host were initially obtained from the crystal structure [2] while all other unbound guest structures were built manually. The structues were then QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussin09. The CB7 molecule has zero net charge, while the protonation states of the guests were predicted with the pKa plugin in the Marvin suite of programs [3]. Guest guest-18 in cb7-set1 is a special case, because it was predicted to have the protonated and unprotonated forms coexisting at the experimental pH value (4.74) [4] with a nearly 1:1 ratio. Therefore, files of both forms are provided, with guest-18 as the protonated form of the guest and guest-18b as the unprotonated form. For the simulation files, bonded and Lennard-Jones parameters were obtained from the general AMBER force field (GAFF v1.7) [5]. Partial charges for each atom were generated using the RESP procedure [6], as implemented in the Antechamber program [7], by fitting to electrostatic potentials grids generated during the QM minimization. The starting bound configuration of each host-guest pair was generated by docking the guests into the hosts with MOE [8]. The binding pose was then solvated in a cubic box with 1500 TIP3P water molecules with sodium or chloride counterions added only for neutralization. Counterions were modeled with the TIP3P-specific sodium parameters of Joung and Cheatham [9]. After an equilibration phase, an NVT simulation of 2 ns was carried out, and the frame with the most populated configuration, determined via clustering, was selected as the simulation input file. |
| 22 | +The structures of the free CB7 host were initially obtained from the crystal structure [2] while all other unbound guest structures were built manually. |
| 23 | +The structures were then QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussian09. |
| 24 | +The CB7 molecule has zero net charge, while the protonation states of the guests were predicted with the pKa plugin in the Marvin suite of programs [3]. |
| 25 | +Guest guest-18 in cb7-set1 is a special case, because it was predicted to have the protonated and unprotonated forms coexisting at the experimental pH value (4.74) [4] with a nearly 1:1 ratio. |
| 26 | +Therefore, files of both forms are provided, with guest-18 as the protonated form of the guest and guest-18b as the unprotonated form. |
| 27 | +For the simulation files, bonded and Lennard-Jones parameters were obtained from the general AMBER force field (GAFF v1.7) [5]. |
| 28 | +Partial charges for each atom were generated using the RESP procedure [6], as implemented in the Antechamber program [7], by fitting to electrostatic potential grids generated during the QM minimization. |
| 29 | +The starting bound configuration of each host-guest pair was generated by docking the guests into the hosts with MOE [8]. |
| 30 | +The binding pose was then solvated in a cubic box with 1500 TIP3P water molecules with sodium or chloride counterions added only for neutralization. |
| 31 | +Counterions were modeled with the TIP3P-specific sodium parameters of Joung and Cheatham [9]. |
| 32 | +After an equilibration phase, an NVT simulation of 2 ns was carried out, and the frame with the most populated configuration, determined via clustering, was selected as the simulation input file. |
8 | 33 |
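The guest-18 case, where both protonation states coexist at roughly 1:1, follows from the Henderson-Hasselbalch relation: when the pKa is close to the experimental pH, about half of the population is protonated. The sketch below illustrates this; the pKa values passed in are hypothetical, since the predicted value from Marvin is not stated here, and `protonated_fraction` is a name introduced for illustration.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a monoprotic site in its protonated form at a given pH.

    Henderson-Hasselbalch: [deprotonated]/[protonated] = 10 ** (pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


EXPERIMENTAL_PH = 4.74  # pH reported for the guest-18 measurement

# Hypothetical pKa equal to the pH: the two forms are equally populated.
half = protonated_fraction(EXPERIMENTAL_PH, pka=4.74)

# Hypothetical pKa one unit above the pH: the protonated form dominates.
mostly_protonated = protonated_fraction(EXPERIMENTAL_PH, pka=5.74)
```

A pKa more than about one unit away from the pH leaves one form at roughly 90% or more, which is why a single protonation state is provided for the other guests.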
|
9 | 34 |
|
10 | 35 | ## OA/TEMOA (GDCC) Methods
|
11 |
| -The structures of the free hosts OA and TEMOA, as well as of all unbound guests, were built manually and then QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussin09. The OA and TEMOA hosts both were assigned net charges of -8au, based on the pH at which the experiments were conducted (9.2 and 11.5) [10, 11]. The protonation states of guests were predicted with the pKa plugin in the Marvin suite of programs [3]. For the simulation files, bonded and LJ force field parameters were taken from GAFF v1.7 and partial charges were determined using the RESP approach, in identical fashion to the CB7 method. The starting bound configuration of each host-guest pair was generated by docking the guests into the hosts with MOE [8]. The binding pose was then solvated in a cubic box with 2100 TIP3P water molecules with sodium or chloride counterions added only for neutralization. Counterions were modeled with the TIP3P-specific sodium parameters of Joung and Cheatham [9]. After an equilibration phase, an NVT simulation of 2 ns was carried out, and the frame with the most populated configuration, determined via clustering, was selected as the simulation input file. |
| 36 | +The structures of the free hosts OA and TEMOA, as well as of all unbound guests, were built manually and then QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussian09. |
| 37 | +The OA and TEMOA hosts were both assigned a net charge of -8 au, based on the pH at which the experiments were conducted (9.2 and 11.5) [10, 11]. |
| 38 | +The protonation states of guests were predicted with the pKa plugin in the Marvin suite of programs [3]. |
| 39 | +For the simulation files, bonded and LJ force field parameters were taken from GAFF v1.7 and partial charges were determined using the RESP approach, in identical fashion to the CB7 method. |
| 40 | +The starting bound configuration of each host-guest pair was generated by docking the guests into the hosts with MOE [8]. |
| 41 | +The binding pose was then solvated in a cubic box with 2100 TIP3P water molecules with sodium or chloride counterions added only for neutralization. |
| 42 | +Counterions were modeled with the TIP3P-specific sodium parameters of Joung and Cheatham [9]. |
| 43 | +After an equilibration phase, an NVT simulation of 2 ns was carried out, and the frame with the most populated configuration, determined via clustering, was selected as the simulation input file. |
12 | 44 |
|
13 | 45 |
|
14 | 46 | ## CD Methods
|
15 |
| -The stuctures of unbound alpha-cyclodextrin (aCD) and beta-cyclodextrin (bCD), as well as all guests, were built manually. Protonation states followed what was reported by Rekharsky et al. [12] at pH 6.9. The guest molecules were QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussian09. Partial charges, LJ paramters, and bonded parameters for the CD molecules were taken from the q4md-CD force field by Cézard et al. [13]. Guest partial charges were derived using the RESP method implemented in the R.E.D. Server tool [14], while LJ and bonded parameters were taken from GAFF v1.7. The AMBER simulation files consist of the host-guest complex, 1500 TIP3P waters, and three Na+ and Cl- ions in addition to any counterions required for neutralization. This roughly corresponds to the ionic strength of the 50 mmol phosphate buffer used in experiment. The solvated systems were equilibrated in the NPT ensemble with light (0.1 kcal/mol) positional restraints on the host and guest atoms. The final conformation of this equilibration step is provided here. Unrestrained equilibration and clustering was not performed for the cyclodextrin sets, in contrast to the CB7 and GDCC sets, because in some cases the guest binds weakly enough that it could leave the binding cavity for significant periods of time. To account for the two possible orientations of the guest within the CD cavity, simulation files with the '-p' suffix indicate that the guest is bound with the polar functional group oriented out of the primary (narrow) face of the CD, whereas the '-s' suffix indicates the guest polar functional group is oriented out of the secondary (wider) face of the CD. |
| 47 | +The structures of unbound alpha-cyclodextrin (aCD) and beta-cyclodextrin (bCD), as well as all guests, were built manually. |
| 48 | +Protonation states followed what was reported by Rekharsky et al. [12] at pH 6.9. |
| 49 | +The guest molecules were QM energy minimized in vacuo using the HF/6-31G(d) method in Gaussian09. |
| 50 | +Partial charges, LJ parameters, and bonded parameters for the CD molecules were taken from the q4md-CD force field by Cézard et al. [13]. |
| 51 | +Guest partial charges were derived using the RESP method implemented in the R.E.D. Server tool [14], while LJ and bonded parameters were taken from GAFF v1.7. |
| 52 | +The AMBER simulation files consist of the host-guest complex, 1500 TIP3P waters, and three Na+ and Cl- ions in addition to any counterions required for neutralization. |
| 53 | +This roughly corresponds to the ionic strength of the 50 mM phosphate buffer used in experiment. |
| 54 | +The solvated systems were equilibrated in the NPT ensemble with light (0.1 kcal/mol/Å²) positional restraints on the host and guest atoms. |
| 55 | +The final conformation of this equilibration step is provided here. |
| 56 | +Unrestrained equilibration and clustering were not performed for the cyclodextrin sets, in contrast to the CB7 and GDCC sets, because in some cases the guest binds weakly enough that it could leave the binding cavity for significant periods of time. |
| 57 | +To account for the two possible orientations of the guest within the CD cavity, simulation files with the '-p' suffix indicate that the guest is bound with the polar functional group oriented out of the primary (narrow) face of the CD, whereas the '-s' suffix indicates the guest polar functional group is oriented out of the secondary (wider) face of the CD. |
16 | 58 |
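As a rough check on the added salt, the concentration implied by a few ion pairs in a small water box can be estimated from the ion-to-water ratio. The sketch below assumes that "three Na+ and Cl- ions" means three of each, and uses an approximate molarity for liquid water near room temperature; the density of TIP3P water differs slightly, so the result is only an estimate. `salt_molarity` and `WATER_MOLARITY` are names introduced for illustration.

```python
# Approximate molarity of liquid water near 25 C, in mol/L (assumption).
WATER_MOLARITY = 55.3


def salt_molarity(
    n_ion_pairs: int,
    n_waters: int,
    water_molarity: float = WATER_MOLARITY,
) -> float:
    """Estimate salt concentration (mol/L) from the ion-pair to water ratio."""
    if n_waters <= 0:
        raise ValueError("n_waters must be positive")
    return n_ion_pairs / n_waters * water_molarity


# Three NaCl pairs in 1500 waters, as in the cyclodextrin systems.
cd_salt = salt_molarity(n_ion_pairs=3, n_waters=1500)
```

For a 1:1 salt such as NaCl the ionic strength equals the molar concentration, so this estimate of roughly 0.11 M is the quantity to compare against the buffer. Any neutralizing counterions are not included in this count.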
|
17 | 59 | ## BRD4
|
18 | 60 | For information on the BRD4 benchmark, see the associated `README.md` file in the BRD4 subdirectory.
|
|