CADDSuiteDocking
cschaerfe edited this page Jul 8, 2015 · 3 revisions
Preparation steps for docking
Before the actual docking can be run, several preparation steps are necessary. Input molecules must be preprocessed and checked for errors, sensible constraints for docking should be created, and score grids for docking have to be precalculated. A useful preparation pipeline might thus look like the following example.

In this example, the following steps are performed:
- PDBCutter can be used to extract the pure protein (and co-crystallized ligand) from a PDB file. This can be used prior to the docking preparation workflow in order to remove extra chains from the model, separate protein from reference ligand to save both in separate files, and remove residues such as metal ions from the structure file.
- ProteinProtonator is used to protonate the given receptor structure. This is important because the scoring function (as used by docking, for example) needs hydrogen atoms to work correctly, but hydrogens are in most cases not observed in crystal structures (unless they are obtained at very high resolution). Hydrogens that appear in the input file are ignored. The pH at which the protonation is done can also be specified (with parameter '-ph' on the command line). The output of ProteinProtonator is a pdb-file containing the protonated receptor structure.
- WaterFinder is used to analyze the given co-crystal structure in order to find water molecules that bind very strongly to the receptor or that interact strongly with both receptor and reference ligand, thus functioning as a water bridge. Water molecules in the input pdb-structure (i.e. single oxygens) are automatically protonated and rotationally optimized before the search is done. Networks formed by several water molecules are also taken into consideration, i.e. water molecules belonging to a network that has at least one member fulfilling the aforementioned criteria are also retained.
- The pdb-file written as output of WaterFinder then contains only the selected water molecules. In principle, this step could be skipped (if all water molecules are simply removed from the receptor structure), but since water molecules are often important for the binding of ligands, it is advisable to use WaterFinder.
- ProteinCheck performs a simple chemical sanity check on the receptor structure. It tests whether all atoms have a valid assigned element, whether any bonds have an implausible length, whether the protein has been protonated, and whether there are heavy-atom clashes inside the receptor. ProteinCheck generates a pdf-file containing the results of those tests as well as a secondary structure prediction, a Ramachandran plot and a temperature factor plot. All of these can help to evaluate the quality of the protein structure.
- In order to automatically generate interaction constraints, ConstraintsFinder can be used. This tool evaluates the interaction between a reference ligand and each residue of the receptor. The residues with which the reference ligand interacts most strongly are (if the interaction is significantly strong) used as constraints during docking. Thus, ligand poses will be penalized during docking if they interact significantly more weakly with those residues than the reference ligand does.
- The reference ligand should be taken from the structural file of your protein-ligand complex so its atom coordinates place it correctly in the binding pocket.
- The output of ConstraintsFinder is a configuration file containing the constraint definitions. This file should be passed to GridBuilder and IMGDock. However, ConstraintsFinder may be skipped, especially if the goal is to find ligands that bind to the receptor in a mode very different from that of the reference ligand.
- PocketDetector is used to generate a spatial description of the binding pocket that serves as a constraint during docking. To this end, probe atoms are placed above the protein surface at relatively deeply buried positions. The cluster of probe atoms around the geometric center of the reference ligand is used for the description of the binding pocket.
- The output of this tool is a docking configuration file that contains the description of the detected binding pocket. In subsequent pipeline steps this file should be passed to docking and rescoring tools (e.g. IMGDock). Again, the reference ligand used for the pocket detection should be taken from the structural file of your protein-ligand complex so its atom coordinates place it correctly in the binding pocket.
On the command line, this pipeline could look like this:
BALL/build/bin/TOOLS/ProteinProtonator -i receptor.pdb -o rec-prot.pdb
BALL/build/bin/TOOLS/WaterFinder -i rec-prot.pdb -rl xlig.mol2 -o rec-prot-wat.pdb
BALL/build/bin/TOOLS/ProteinCheck -i rec-prot-wat.pdb -o rec-prot-wat.pdf
BALL/build/bin/TOOLS/ConstraintsFinder -rec rec-prot-wat.pdb -rl xlig.mol2 -o constraints.ini
BALL/build/bin/TOOLS/PocketDetector -rec rec-prot-wat.pdb -rl xlig.mol2 -option constraints.ini -o const+pocket.ini
BALL/build/bin/TOOLS/GridBuilder -rec rec-prot-wat.pdb -rl xlig.mol2 -pocket const+pocket.ini -grd scoregrids.grd.gz
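The commands above can also be collected into a small wrapper script, so that a failing step stops the pipeline before later steps consume incomplete output. The following is only a sketch: the function names `run` and `prep_pipeline`, the variables `TOOLS`, `REC` and `LIG`, and the `DRY_RUN` switch are assumptions introduced here for illustration, while the tool names, flags and intermediate file names are copied from the commands above. PDBCutter is left out because its command line is not shown on this page.

```shell
#!/usr/bin/env bash
# Sketch of the preparation pipeline as one script (hypothetical wrapper).
# Tool names and flags are copied from the commands above; TOOLS, REC, LIG
# and DRY_RUN are illustrative names and not part of CADDSuite itself.
set -euo pipefail

TOOLS="${TOOLS:-BALL/build/bin/TOOLS}"   # directory with the CADDSuite binaries
REC="${REC:-receptor.pdb}"               # receptor structure
LIG="${LIG:-xlig.mol2}"                  # co-crystallized reference ligand
DRY_RUN="${DRY_RUN:-1}"                  # 1 = only print commands, 0 = execute

# Print a command; execute it only when DRY_RUN is 0.
run() {
  echo "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

prep_pipeline() {
  run "$TOOLS/ProteinProtonator" -i "$REC" -o rec-prot.pdb
  run "$TOOLS/WaterFinder" -i rec-prot.pdb -rl "$LIG" -o rec-prot-wat.pdb
  run "$TOOLS/ProteinCheck" -i rec-prot-wat.pdb -o rec-prot-wat.pdf
  run "$TOOLS/ConstraintsFinder" -rec rec-prot-wat.pdb -rl "$LIG" -o constraints.ini
  run "$TOOLS/PocketDetector" -rec rec-prot-wat.pdb -rl "$LIG" -option constraints.ini -o const+pocket.ini
  run "$TOOLS/GridBuilder" -rec rec-prot-wat.pdb -rl "$LIG" -pocket const+pocket.ini -grd scoregrids.grd.gz
}

prep_pipeline
```

With the default `DRY_RUN=1` the script only prints the six commands, which makes it easy to review them first; setting `DRY_RUN=0` executes them, and `set -e` aborts at the first tool that returns a non-zero exit status.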
Docking
After the aforementioned steps have been performed, compounds can be docked into the binding pocket of the receptor using the tool IMGDock and the score-grid file created as described above.

- Ligand3DGenerator should be used first on the set of compounds that are to be docked. This tool protonates all molecules and generates 3D conformations for each of them. This is important since molecules obtained from many sources often contain only 2D conformations and lack hydrogens. If the input file contains a mix of ligands in 2D and 3D, Ligand3DGenerator can be forced to check the 3D conformations and re-calculate them only if they don't meet certain requirements (such as reasonable bond lengths etc.) using the command line flag -k.
- LigCheck then performs chemical sanity checks on the compounds. It checks for sensible bond lengths and valid assigned elements, and tests whether each 'molecule' in the input file contains only one actual molecule, i.e. it checks that there are no unconnected atoms or fragments. Furthermore, each conformation (or, optionally, each topology) may appear only once within the given file. All molecules that pass these checks are written to the output file.
- IMGDock can then be run with the set of checked molecules and the score-grid file generated by GridBuilder. Furthermore, receptor and reference ligand have to be specified. These have to be the same molecules that were used during the preparation steps (see above).
- Optionally, a configuration file containing constraint definitions (as generated by the aforementioned preparation tools) can be specified. IMGDock then docks each compound of the input set into the binding pocket. For each average-sized ligand this will take approximately 30-60 seconds.
- The output of IMGDock is a file containing all compounds docked into the binding pocket, with a property-tag named 'score' indicating the score (an estimate for the binding free energy) obtained for each compound.
- DockResultMerger can be used afterwards to merge multiple docking results (if the process was parallelized) and/or to sort the output of IMGDock in ascending order of score. Optionally, it can also be told to output only a given number of top-scored compounds.
- Finally, MolDepict can generate a pdf-file with the structural diagrams of the (top-scored) output molecules. In addition to the structural diagrams, the pdf will also contain information about the score and the molecular weight of each compound.
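The figure of roughly 30-60 seconds per average-sized ligand makes it possible to budget a docking run before starting it. The sketch below counts the molecules in an SD file and converts that count into a rough time range. The helper names `count_mols` and `estimate_minutes` are made up for this illustration, and the even speed-up across parallel jobs is an assumption, not a measured property of IMGDock.

```shell
#!/usr/bin/env bash
# Rough runtime estimate for a docking run (illustrative helper names).
set -euo pipefail

# Count molecules in an SD file: each record is terminated by a "$$$$" line.
count_mols() {
  grep -c '^\$\$\$\$' "$1" || true
}

# Print "MIN-MAX" minutes for n ligands at 30-60 s each, spread evenly over
# `jobs` parallel runs (assumed ideal scaling). Integer arithmetic, so the
# results are rounded down.
estimate_minutes() {
  local n="$1" jobs="${2:-1}"
  echo "$(( n * 30 / 60 / jobs ))-$(( n * 60 / 60 / jobs ))"
}

# Example: 1000 ligands on one core, then split over 4 parallel runs.
estimate_minutes 1000      # prints 500-1000
estimate_minutes 1000 4    # prints 125-250
```

For a real input file the two helpers can be combined, e.g. `estimate_minutes "$(count_mols valid_compounds.sdf)" 4`.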
On the command line, this pipeline can be run with something like:
BALL/build/bin/TOOLS/Ligand3DGenerator -i compounds_to_be_docked.sdf -o compounds_3D.sdf
BALL/build/bin/TOOLS/LigCheck -i compounds_3D.sdf -ri -o valid_compounds.sdf
BALL/build/bin/TOOLS/IMGDock -i valid_compounds.sdf -rec rec-prot-wat.pdb -rl xlig.mol2 -pocket const+pocket.ini -grd scoregrids.grd.gz -o docked_compounds.sdf
BALL/build/bin/TOOLS/DockResultMerger -i docked_compounds.sdf -s score -k 50 -o top50_sorted.sdf
BALL/build/bin/TOOLS/MolDepict -i top50_sorted.sdf -o top50_sorted.pdf
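If the docking step is to be parallelized, the checked compounds first have to be divided into several input files. One possible way to do this with standard tools is sketched below; it relies only on the SD file convention that every record ends with a `$$$$` line. The function name `split_sdf` and the chunk file naming are assumptions made for this example and are not part of CADDSuite.

```shell
#!/usr/bin/env bash
# Split an SD file into chunks for parallel docking (illustrative helper).
set -euo pipefail

# split_sdf INPUT PER_CHUNK PREFIX
# Writes PREFIX_1.sdf, PREFIX_2.sdf, ... with at most PER_CHUNK molecules each.
# Relies on the SD file convention that every record ends with a "$$$$" line.
split_sdf() {
  local input="$1" per_chunk="$2" prefix="$3"
  awk -v n="$per_chunk" -v p="$prefix" '
    BEGIN { chunk = 1; seen = 0 }
    # Every line goes to the file of the current chunk.
    { out = p "_" chunk ".sdf"; print > out }
    # A record terminator completes one molecule; start a new chunk
    # after every n-th molecule.
    /^\$\$\$\$/ {
      seen++
      if (seen % n == 0) { close(out); chunk++ }
    }
  ' "$input"
}

# Usage (not executed here):
#   split_sdf valid_compounds.sdf 100 chunk
```

Each chunk can then be given to IMGDock as `-i`, with the remaining arguments unchanged and a separate `-o` file per chunk, and the per-chunk results combined with DockResultMerger. How DockResultMerger expects multiple input files to be specified is not shown on this page, so check the tool's help output before scripting that step.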