Before starting
Before starting the docking, you need to specify a number of parameters, such as:
- Number of structures to generate and refine
- Histidine protonation state
- Flexible segments
- Which kind of restraints to use and associated parameters
- Solvated docking options
- Scoring scheme
- Electrostatic treatment
- ...
Many of those have default values which you do not need to change.
Using your web browser, go to the HADDOCK home-page, choose project setup and enter the path of your run.cns file.
Click on "edit file". A new window will be created.
For a description of the run.cns file, refer to the "run.cns file" section.
After setting all the parameters, save the file as "run.cns" in your run
directory using the "save updated file" button.
Note: If you have turned on the use of DNA/RNA restraints in run.cns,
HADDOCK expects to find a file called dna-rna_restraints.def in
the data/sequence directory. This file allows you to define
standard A-, B- or custom restraints for DNA such as base-pairing,
puckering and backbone dihedral angles. You can edit a template file
that can be found in the protocols directory and save it
as dna-rna_restraints.def in the data/sequence directory.
To edit it, use your web browser to go to the HADDOCK home-page, choose project setup and enter the path of the dna-rna_restraints.def file.
Click on "edit file". A new window will be created.
Once all necessary files and parameters have been edited and saved,
start HADDOCK in the run directory by typing:
haddock2.1
The entire protocol consists of four stages:
- Topologies and structures generation
- Randomization of the starting orientation and rigid body energy minimization
- Semi-flexible simulated annealing
- Flexible refinement in explicit solvent (water or DMSO)
1. Topologies and structures generation
The first step in HADDOCK is the generation of the CNS topology and coordinate
files for the various molecules and for the complex from the input PDB files (see section
PDB files). HADDOCK should automatically
recognize chain breaks, disulphide bonds, cis-prolines and even ions, provided they are named
as defined in the ion.top topology file located in the toppar directory.
Job files will be generated in
the run directory and the topologies, structures and output files will be
generated in the begin directory. HADDOCK will use the fileroot
names specified in the run.cns file.
The following scripts will be run:
- fileroot_generate_A/B/C/D/E/F.job: Generates the CNS topology and coordinates
file(s) (if starting from an ensemble) for the various molecules.
Output files:
- fileroot_A/B/C/D/E/F.psf (topology)
- fileroot_A/B/C/D/E/F.pdb (coordinates)
- fileroot_A/B/C/D/E/F_1.pdb, fileroot_A/B/C/D/E/F_2.pdb, ...
(if starting from an ensemble)
- file_A/B/C/D/E/F.list (list of PDB coordinates files)
CNS scripts called (depending on the options defined):
- generate_A/B/C/D/E/F.inp
- dna_break.cns
- dna-rna_restraints.def
- flex_segment_back.cns
- iterations.cns
- prot_break.cns
- run.cns
Note: If solvated docking is turned on, generate_A/B/C/D/E/F-water.inp will be used instead.
It additionally calls generate_water.cns and rotate_pdb.cns
and generates additional output PDB files containing the water
(fileroot_A/B/C/D/E/F_1_water.pdbw, ...)
- fileroot_generate_complex.job: Generates the CNS topology and coordinates
file(s) for the complex by merging the various topologies and coordinates files. When
starting from ensembles, all combinations will be generated.
Output files:
- fileroot.psf (topology)
- fileroot.pdb (coordinates)
- fileroot_1.pdb, fileroot_2.pdb, ...
(if starting from an ensemble)
- file.cns, file.list, file.nam (list of PDB coordinates files)
CNS scripts called:
- generate_complex.inp
- iterations.cns
- run.cns
- separate.cns
Note: If solvated docking is turned on, generate_complex-water.inp will
be used instead, which generates additional output PDB files containing the
water (fileroot_1_water.pdbw, ...)
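To illustrate what "all combinations" means when starting from ensembles, here is a minimal Python sketch that enumerates one model index per molecule. The function name and the ensemble sizes are hypothetical, and the order in which HADDOCK numbers the resulting fileroot_1.pdb, fileroot_2.pdb, ... files is not specified here, so the ordering below is only an assumption.

```python
from itertools import product

def enumerate_combinations(ensemble_sizes):
    """Return every combination of 1-based model indices, one index per molecule."""
    ranges = [range(1, size + 1) for size in ensemble_sizes]
    return list(product(*ranges))

# Hypothetical example: molecule A has 3 models, molecule B has 2 models.
combos = enumerate_combinations([3, 2])
for number, combo in enumerate(combos, start=1):
    print(f"starting complex {number}: models {combo}")
```

The number of starting complexes is the product of the ensemble sizes, so it grows quickly as more models or molecules are added.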
In case of problems (and in general to make sure that everything is OK) look
into the output files generated (.out) for error messages (search for ERR).
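The check described above can be automated. The following is a minimal Python sketch, not part of HADDOCK, that lists every line containing ERR in the .out files of a directory. The function name is hypothetical and the directory to scan is whatever location holds your .out files.

```python
from pathlib import Path

def find_error_lines(directory, pattern="ERR"):
    """Collect (file name, line number, text) for each line containing `pattern`."""
    hits = []
    for out_file in sorted(Path(directory).glob("*.out")):
        # errors="replace" avoids a crash on unexpected bytes in the log.
        with out_file.open(errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if pattern in line:
                    hits.append((out_file.name, line_number, line.rstrip()))
    return hits

# Usage (assumes the current directory contains the .out files):
# for name, line_number, text in find_error_lines("."):
#     print(f"{name}:{line_number}: {text}")
```

An empty result only means that no line matched the search string; it is still worth checking that the expected output files were actually written.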
2. Randomization and rigid body energy minimization
The first docking step in HADDOCK is a rigid body energy minimization.
First, the molecules are separated by a minimum of 25 Å and rotated randomly around their centers of mass. This randomization step can be turned off in the run.cns parameter file.
If you wish to decrease (or increase) the separation distance between the
molecules, edit the random_rotations.cns CNS script in the protocols directory
and change the value of the $minispacing parameter.
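As an illustration of the random rotation only, here is a minimal Python sketch that rotates a set of coordinates about their center with a uniformly random rotation. It is not the code in random_rotations.cns: the function names are hypothetical, the unweighted centroid stands in for the center of mass (equal masses are assumed), and the separation step is not modelled.

```python
import math
import random

def random_rotation_matrix(rng):
    """Return a uniformly distributed random 3x3 rotation matrix."""
    # Draw a uniform random unit quaternion (x, y, z, w) with Shoemake's method.
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    a, b = math.sqrt(1.0 - u1), math.sqrt(u1)
    x = a * math.sin(2.0 * math.pi * u2)
    y = a * math.cos(2.0 * math.pi * u2)
    z = b * math.sin(2.0 * math.pi * u3)
    w = b * math.cos(2.0 * math.pi * u3)
    # Standard conversion from a unit quaternion to a rotation matrix.
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]

def centroid(coords):
    """Unweighted mean position (equals the center of mass for equal masses)."""
    n = len(coords)
    return tuple(sum(c[k] for c in coords) / n for k in range(3))

def rotate_about_centroid(coords, matrix):
    """Rotate all points about their centroid, leaving the centroid in place."""
    center = centroid(coords)
    rotated = []
    for point in coords:
        shifted = [point[k] - center[k] for k in range(3)]
        rotated.append(tuple(
            center[r] + sum(matrix[r][k] * shifted[k] for k in range(3))
            for r in range(3)
        ))
    return rotated

# Hypothetical toy molecule of three pseudo-atoms (coordinates in Å).
molecule = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0)]
rotated = rotate_about_centroid(molecule, random_rotation_matrix(random.Random(42)))
```

Because the rotation is applied about the centroid, the position of the molecule and all of its internal distances are unchanged; only its orientation differs.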
The rigid body minimization is performed stepwise:
For each starting structure combination, this step can be repeated a number of
times (given by the Ntrials parameter in the run.cns
parameter file), and only the best solution is written to disk.
A new option in HADDOCK 2.x allows systematic sampling of 180 degree rotated solutions. Since in our
experience symmetrical solutions occur quite often, sampling 180 degree rotated solutions can increase
the success rate significantly. This option can be turned on and off with the rotate180_0 parameter
in the run.cns parameter file.
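The logic of repeating the minimization and keeping only the best solution can be sketched in Python as follows. This is an illustration under stated assumptions, not the refine.inp protocol: the `minimize` callable, its signature and the function name are hypothetical, lower energy is assumed to be better, and the way the 180 degree rotated solution is evaluated alongside each trial is a simplification.

```python
import random

def best_of_trials(minimize, n_trials, sample_rotated=False, seed=0):
    """Run `minimize` n_trials times and return the lowest-energy (energy, label) pair.

    `minimize(rng, rotated)` is a hypothetical callable that performs one
    rigid body minimization and returns a tuple (energy, label).
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        candidates = [minimize(rng, False)]
        if sample_rotated:
            # Also evaluate the 180 degree rotated solution for this trial.
            candidates.append(minimize(rng, True))
        for candidate in candidates:
            if best is None or candidate[0] < best[0]:
                best = candidate
    return best

def toy_minimize(rng, rotated):
    """Stand-in for a real minimization: returns a random energy."""
    energy = rng.uniform(-100.0, 0.0)
    return (energy, "rotated" if rotated else "unrotated")

best = best_of_trials(toy_minimize, n_trials=5, sample_rotated=True)
```

Only the single best candidate survives, which is why increasing the number of trials costs CPU time but not disk space.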
Note: The translational minimization can be turned off in
run.cns. This option can be useful, for example, for small
flexible molecules, so that the docking is performed during the simulated annealing stage,
allowing conformational changes to take place during the docking process. The number of steps in
the first two stages of the simulated annealing should then be increased by at least
a factor of four to allow the molecules to approach each other.
The refine.inp CNS script is used for this step.
When all structures have been generated (typically in the order of 1000 to 2000,
depending on the number of starting conformations and your CPU resources), HADDOCK
will sort them according to the criterion defined in the
run.cns parameter file and write the sorted PDB
filenames into file.cns, file.list and file.nam in the
structures/it0 directory (see also the scoring section
of the online manual). These will be used for the next step (semi-flexible simulated annealing).
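The sorting and selection described above amounts to ranking structures by a score and keeping the top entries for the next stage. A minimal Python sketch follows; the filenames and scores are made up, the function name is hypothetical, and it assumes that a lower score is better. The actual sorting criterion is the one set in run.cns, and the format of file.list is not reproduced here.

```python
def rank_structures(scores, keep):
    """Sort (filename, score) pairs by ascending score and keep the first `keep`."""
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return ranked[:keep]

# Hypothetical scores for three rigid body solutions.
example_scores = {
    "complex_1.pdb": -10.0,
    "complex_2.pdb": -50.0,
    "complex_3.pdb": 5.0,
}
selected = rank_structures(example_scores, keep=2)
for filename, score in selected:
    print(f"{filename}  {score:.1f}")
```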
3. Semi-flexible simulated annealing
The best structures after rigid body docking (typically 200; the number is set by the
user, see the run.cns file section) are subjected to a
semi-flexible simulated annealing (SA) in torsion angle space. This semi-flexible annealing
consists of several stages:
- High temperature rigid body search
- Rigid body SA
- Semi-flexible SA with flexible side-chains at the interface
- Semi-flexible SA with fully flexible interface (both backbone and side-chains)
The temperatures and number of steps for the various stages are defined in the
run.cns parameter file.
Note1: HADDOCK also allows the definition of fully flexible regions (defined by
the nfle_X parameter in run.cns)
that remain fully flexible throughout all four stages. This
is useful for cases where parts of a structure are
disordered or unstructured, or when docking small flexible
ligands or peptides onto a protein. This option also allows the use of
HADDOCK for structure calculations of complexes when classical
NMR restraints are available to drive the folding.
Note2: A new option in HADDOCK 2.x automatically defines the semi-flexible
regions by considering all residues within 5 Å of another molecule. To use this
option, set nseg_X to -1 in run.cns (or to another
negative number if you still want to define segments manually for
random AIR definition from a limited
region of the surface). This can be set for each molecule separately.
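The idea behind the automatic definition, selecting every residue that has an atom within 5 Å of another molecule, can be sketched in Python as below. This is an illustration only, not the CNS implementation: the function name, the (chain, residue number, coordinates) atom representation and the toy coordinates are assumptions, and whether the cutoff is inclusive is not stated in the text.

```python
import math

def interface_residues(atoms, cutoff=5.0):
    """Return the set of (chain, residue number) pairs that have at least one
    atom within `cutoff` of an atom belonging to a different chain.

    `atoms` is a list of (chain, residue_number, (x, y, z)) tuples.
    """
    selected = set()
    for i, (chain_i, res_i, xyz_i) in enumerate(atoms):
        for chain_j, res_j, xyz_j in atoms[i + 1:]:
            if chain_i == chain_j:
                continue  # only contacts between different molecules count
            if math.dist(xyz_i, xyz_j) <= cutoff:
                selected.add((chain_i, res_i))
                selected.add((chain_j, res_j))
    return selected

# Hypothetical toy system: two atoms per chain, coordinates in Å.
toy_atoms = [
    ("A", 1, (0.0, 0.0, 0.0)),
    ("A", 2, (10.0, 0.0, 0.0)),
    ("B", 1, (3.0, 0.0, 0.0)),
    ("B", 2, (30.0, 0.0, 0.0)),
]
flexible = interface_residues(toy_atoms)
```

The double loop is quadratic in the number of atoms, which is acceptable for a sketch; a real implementation would typically use a spatial grid or neighbor list.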
The refine.inp CNS script is used for this step.
At the end of the calculation, HADDOCK generates the file.cns, file.list
and file.nam files containing the filenames of the generated structures sorted
according to the criterion defined in the run.cns
parameter file (see also the scoring section
of the online manual).
At the end of this stage, the structures are analyzed and the results
are placed in the structures/it1/analysis directory (see the
analysis section).
4. Flexible explicit solvent refinement
In this final step, the structures obtained after the semi-flexible simulated annealing are
refined in an explicit solvent layer (8 Å for water, 12.5 Å for DMSO). No spectacular
changes are expected in this step; however, the scoring of the various structures is improved.
The re_h2o.inp or re_dmso.inp CNS script is used for this step.
At the end of the explicit solvent refinement, HADDOCK generates the file.cns, file.list
and file.nam files containing the filenames of the generated structures sorted
according to the criterion defined in the run.cns
parameter file (see also the scoring section
of the online manual).
Finally, the structures are analyzed and the results are placed in the
structures/it1/water/analysis directory (see the
analysis section).
Please send any suggestions or enquiries to Alexandre Bonvin