Docking benchmark 5 (BM5) - cleaned and ready to use for HADDOCK
This is the docking and binding affinity benchmark described in:
T. Vreven, I.H. Moal, A. Vangone, B.G. Pierce, P.L. Kastritis, M. Torchala, R. Chaleil,
B. Jimenez-Garcia, P.A. Bates, J. Fernandez-Recio, A.M.J.J. Bonvin and Z. Weng.
Updates to the integrated protein-protein interaction benchmarks: Docking benchmark version 5 and affinity benchmark version 2.
J. Mol. Biol. 427(19), 3031-3041 (2015).
https://doi.org/10.1016/j.jmb.2015.07.016
The repository contains the following information:
The HADDOCK-ready directory contains files prepared for HADDOCK for each entry of BM5. Each sub-directory (one per complex) contains the following files:
XXX_r_u.pdb
: Unbound receptor PDB
XXX_l_u.pdb
: Unbound ligand PDB
XXX_r_b-matched.pdb
: Matched bound receptor PDB
XXX_l_b-matched.pdb
: Matched bound ligand PDB
XXX_r_u_cg.pdb
: Martini v2 coarse-grain model of the receptor protein
XXX_l_u_cg.pdb
: Martini v2 coarse-grain model of the ligand protein
XXX_reference.pdb
: The reference, matched complex
For each PDB there is also an associated .info file providing statistics of the PDB content.
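As a quick sanity check on the naming convention above, the following minimal sketch lists which structure files are missing from one entry sub-directory. It assumes the sub-directory name is the entry code that replaces XXX; that assumption and the helper names `expected_files` and `missing_files` are illustrative and not part of the repository.

```python
from pathlib import Path

# Suffixes taken from the file list above; the entry code replaces "XXX".
STRUCTURE_SUFFIXES = [
    "_r_u.pdb",
    "_l_u.pdb",
    "_r_b-matched.pdb",
    "_l_b-matched.pdb",
    "_r_u_cg.pdb",
    "_l_u_cg.pdb",
    "_reference.pdb",
]


def expected_files(entry_dir):
    """Return the structure file paths expected for one entry.

    Assumption: the sub-directory name is the entry code (XXX).
    """
    entry_dir = Path(entry_dir)
    code = entry_dir.name
    return [entry_dir / f"{code}{suffix}" for suffix in STRUCTURE_SUFFIXES]


def missing_files(entry_dir):
    """List the names of expected structure files that are absent on disk."""
    return [p.name for p in expected_files(entry_dir) if not p.is_file()]
```

Calling `missing_files` on an entry sub-directory returns an empty list when all seven structure files are present.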
Various distance restraint files are present:
XXX_ambig.tbl
: Ambiguous interaction restraints based on the true interface measured with a 3.9A cutoff
XXX_ambig5.tbl
: Ambiguous interaction restraints based on the true interface measured with a 5.0A cutoff
XXX_restraint-bodies.tbl
: If present, contains a list of distance restraints to keep unconnected bodies together
XXX_XXX_*.tbl
: If present, contains a list of distance restraints to keep the ligand in place in the structure
XXX_hbonds.tbl
: The combination of the bodies and ligand distance restraints (used in HADDOCK)
And if there is a co-factor or ligand in the structure:
XXX_ligand.param
: The ligand parameter file as generated by PRODRG
XXX_ligand.top
: The ligand topology file as generated by PRODRG
Further, each sub-directory contains an ana_scripts directory with analysis files and scripts:
target.pdb
: The reference, matched complex
target-unbound.pdb
: The unbound complex built by superimposing the unbound structures onto the reference complex
target.contacts5
: Intermolecular contacts at 5A cutoff used for calculating the fraction of native contacts
target.izone
: The interface definition for i-RMSD calculations with ProFit (derived from all residue contacts at 10A)
target.izoneA
: Same as target.izone but for chain A only
target.izoneB
: Same as target.izone but for chain B only
target.lzone
: The zone definition for l-RMSD calculations with ProFit
i-rmsd_to_xray.csh
: csh script to calculate i-RMSDs with ProFit from HADDOCK file.nam files
l-rmsd_to_xray.csh
: csh script to calculate l-RMSDs with ProFit from HADDOCK file.nam files
fraction-native.csh
: csh script to calculate the fraction of native contacts from HADDOCK file.nam files
cluster-fnat.csh
: A script that generates cluster statistics including RMSD and Fnat values
run_all.csh
: A csh script that runs the complete analysis of all three stages of HADDOCK
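The fraction of native contacts (Fnat) mentioned above is the share of reference contacts that a model reproduces. The sketch below illustrates that ratio only; it is not the logic of fraction-native.csh. The on-disk format of target.contacts5 is not described here, so contacts are represented, as an assumption, by pairs of (receptor residue number, ligand residue number), and the function name `fraction_native` is hypothetical.

```python
def fraction_native(native_contacts, model_contacts):
    """Return the fraction of native contacts recovered by a model.

    Both arguments are iterables of hashable contact identifiers, here
    assumed to be (receptor_residue, ligand_residue) pairs.
    """
    native = set(native_contacts)
    if not native:
        raise ValueError("the reference complex has no contacts")
    # Contacts present in both the reference and the model count as recovered.
    recovered = native & set(model_contacts)
    return len(recovered) / len(native)


# Illustrative values only, not taken from the benchmark.
reference = {(10, 201), (11, 201), (12, 205), (15, 210)}
model = {(10, 201), (12, 205), (30, 220)}
fnat = fraction_native(reference, model)  # 2 of 4 reference contacts recovered
```

Contacts found only in the model (such as the third pair in the example) do not change the value, because Fnat is normalised by the number of reference contacts.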
Note that the paths in the various analysis scripts must be adapted to your directory structure.
This can be done by running the scripts/setup-ana_scripts.csh script with the directory name of all entries as argument.
Finally, the HADDOCK-ready directory also contains pre-calculated i-RMSD values for the unbound structures superimposed onto the reference complex and for each separate interface:
i-RMSD.dat
: Interface RMSD unbound superimposed versus reference, sorted in the order of the directory listing
i-RMSD-sorted.dat
: Interface RMSD unbound superimposed versus reference, sorted from small to large
i-RMSD_r.dat
: Interface RMSD of the unbound receptor interface versus reference, sorted in the order of the directory listing
i-RMSD_r-sorted.dat
: Interface RMSD of the unbound receptor interface superimposed versus reference, sorted from small to large
i-RMSD_l.dat
: Interface RMSD of the unbound ligand interface versus reference, sorted in the order of the directory listing
i-RMSD_l-sorted.dat
: Interface RMSD of the unbound ligand interface superimposed versus reference, sorted from small to large
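The relation between a listing-order file and its "-sorted" counterpart can be sketched as below. The column layout of the .dat files is not documented here, so this example assumes, hypothetically, one entry per line with the entry name followed by the i-RMSD value, separated by whitespace. The function names and sample values are illustrative.

```python
def parse_irmsd(text):
    """Parse lines of the assumed form '<entry> <i-RMSD>' into (entry, value) pairs."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, comments and lines without a value column.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        rows.append((fields[0], float(fields[1])))
    return rows


def sort_small_to_large(rows):
    """Order (entry, value) pairs by increasing i-RMSD."""
    return sorted(rows, key=lambda row: row[1])


# Illustrative content only; entry names and values are placeholders.
sample = "entryA 2.31\nentryB 0.87\nentryC 1.45\n"
ranked = sort_small_to_large(parse_irmsd(sample))
easiest = ranked[0][0]  # entry with the smallest i-RMSD
```

Sorting on the numeric value rather than on the text avoids mis-ordering values such as 10.2 and 9.8.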
Other sub-directories:
scripts
: Directory containing various scripts used for generating restraint files, initial analysis and automation of running HADDOCK. Refer to the README file in that directory for details
data
: Reference run.cns and patch files to set up HADDOCK runs for various scenarios
This directory contains the matched PDB files for all entries of the benchmark. All structures (bound or unbound) consist of a single chain (A for the receptor, B for the ligand) with non-overlapping numbering.
The bound forms have been matched to the unbound, meaning that they have the same residue numbering and only contain residues matching residues in the unbound forms.
For each entry XXX the following files are present:
XXX_r_u.pdb
: Unbound receptor PDB
XXX_l_u.pdb
: Unbound ligand PDB
XXX_r_b-matched.pdb
: Matched bound receptor PDB
XXX_l_b-matched.pdb
: Matched bound ligand PDB
For each PDB there is also an associated .info file providing statistics of the PDB content.
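The matching rules described for this directory (bound residues are a subset of the unbound ones, and receptor and ligand numbering do not overlap) can be verified with a short check. This is a minimal sketch that reads only ATOM records, using the fixed columns of the PDB format (chain ID in column 22, residue number in columns 23-26, insertion code in column 27). The function names are hypothetical, and the sketch is not the procedure used to prepare the files.

```python
def residue_ids(pdb_text):
    """Collect (chain, residue number, insertion code) from ATOM records."""
    ids = set()
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns, 0-based: chain 21, resSeq 22-25, iCode 26.
        if line.startswith("ATOM") and len(line) >= 27:
            ids.add((line[21], int(line[22:26]), line[26].strip()))
    return ids


def is_matched(bound_text, unbound_text):
    """True if every bound residue also exists in the unbound structure."""
    return residue_ids(bound_text) <= residue_ids(unbound_text)


def overlapping_numbers(receptor_text, ligand_text):
    """Return residue numbers used by both receptor and ligand."""
    receptor = {number for _, number, _ in residue_ids(receptor_text)}
    ligand = {number for _, number, _ in residue_ids(ligand_text)}
    return receptor & ligand
```

For a pair of files on disk, read each one into a string first (for example with `pathlib.Path.read_text`) and pass the strings to these functions; an empty set from `overlapping_numbers` indicates non-overlapping numbering.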
This directory contains the original PDB files for all entries of the benchmark as downloaded from https://zlab.umassmed.edu/benchmark/
For each PDB there is also an associated .info file providing statistics of the PDB content.
A few basic csh scripts used to prepare the matched PDBs.
Manual intervention and checking were, however, required in several instances.