contact-filtering

DisVis-based filtering of contacts from co-evolution data (or other sources)

Here you can find the dataset described in the manuscript "Improving the Quality of Co-evolution Intermolecular Contact Prediction with DisVis" by Siri Camee van Keulen and Alexandre M.J.J. Bonvin.

The content of this GitHub repository can be downloaded in its entirety from Zenodo:

1. Content

This repository contains 26 directories, one for each complex in the dataset. Each directory name is composed of the PDB ID, the Green ID (see manuscript) and the number of true contacts within the top 10 according to Green et al. [1].
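
The naming scheme above can be unpacked programmatically. The following is a minimal sketch, assuming the three parts are always separated by single underscores as in the example 1FM0_allpdb0609_6; the names `ComplexDir` and `parse_complex_dir` are hypothetical helpers and not part of this repository.

```python
from typing import NamedTuple


class ComplexDir(NamedTuple):
    """The three parts of a complex directory name (hypothetical helper type)."""
    pdb_id: str
    green_id: str
    true_contacts: int  # true contacts within the top 10 (Green et al.)


def parse_complex_dir(name: str) -> ComplexDir:
    """Split a name such as '1FM0_allpdb0609_6' on its underscores."""
    pdb_id, green_id, contacts = name.split("_")
    return ComplexDir(pdb_id, green_id, int(contacts))


print(parse_complex_dir("1FM0_allpdb0609_6"))
```

A name with more or fewer than three underscore-separated parts raises a `ValueError`, which makes unexpected directory names easy to spot.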

1.1. Directory Structure

PDBID_GreenID_contacts (e.g. 1FM0_allpdb0609_6)
│   GreenID_contacts_pdb_disvis_top20_10A.txt
│
├─── ana_scripts
│
├─── p1_coevol_20restraints_0%
│
├─── p2_coevol_20restraints_50%
│
├─── p3_coevol_10restraints_0%
│
├─── p4_disvis_10restraints_0%
│
├─── p5_coevol_10restraints_50%
│
├─── p6_disvis_10restraints_50%
│
├─── p7_coevol_5restraints_0%
│
├─── p8_disvis_5restraints_0%
│
├─── p9_disvis_20restraints_zscore_lt0_5_50%
│
└─── p10_disvis_20restraints_zscore_lt1_50%

1.1.2. DisVis Calculation for a Complex

Each complex folder contains eleven directories (ana_scripts plus the ten protocol directories) and one file. This file (GreenID_contacts_pdb_disvis_top20_10A.txt) lists the top 20 contacts (excluding contacts that involve unresolved residues) in DisVis format.

DisVis Format Example

A 53 CA B 11 CA 0 10

Here a contact is described between the CA atom of residue 53 of chain A and the CA atom of residue 11 of chain B. The lower bound is 0 Angstrom and the upper bound is 10 Angstrom.
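
To make the eight-field layout concrete, here is a minimal sketch of a parser for one restraint line. It assumes whitespace-separated fields in exactly the order shown in the example above; `Restraint` and `parse_restraint` are hypothetical names used for illustration and are not part of DisVis.

```python
from typing import NamedTuple


class Restraint(NamedTuple):
    """One DisVis-style distance restraint (hypothetical helper type)."""
    chain1: str
    resid1: int
    atom1: str
    chain2: str
    resid2: int
    atom2: str
    lower: float  # lower distance bound in Angstrom
    upper: float  # upper distance bound in Angstrom


def parse_restraint(line: str) -> Restraint:
    """Parse a line such as 'A 53 CA B 11 CA 0 10'."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    c1, r1, a1, c2, r2, a2, lo, hi = fields
    return Restraint(c1, int(r1), a1, c2, int(r2), a2, float(lo), float(hi))


r = parse_restraint("A 53 CA B 11 CA 0 10")
print(r.chain1, r.resid1, r.atom1, r.chain2, r.resid2, r.atom2, r.lower, r.upper)
```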

This file can be used together with the pdb files of the complex on the DisVis webserver [2] to calculate the z-score for each contact. The pdb files of each complex can be found in every protocol folder in the complex directory and are named by combining the GreenID, the number of true contacts within the top 10 according to Green et al. [1] and the chain ID (e.g. allpdb0609_6_A.pdb). Both pdb files for chain A and chain B are required to run the DisVis calculation.

1.1.3. Docking Protocols

Ten directories in each complex directory contain the files needed to perform the protocols described in the manuscript. The protocol numbering follows Table 2 in the manuscript.

The name of each protocol directory combines the protocol number from the manuscript, the method used to order the contacts in the distance restraint file (disvis, or coevol for co-evolution), the number of contacts included in the distance restraint file, and the percentage of contacts randomly removed from the contact list during docking (e.g. p6_disvis_10restraints_50%).
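
The protocol naming convention can be decoded with a regular expression. This is a sketch under stated assumptions: the pattern is inferred only from the ten directory names listed in Section 1.1, the optional `zscore_lt...` part is kept as a raw tag because its exact meaning is defined in the manuscript rather than here, and `Protocol` and `parse_protocol_dir` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from names such as 'p6_disvis_10restraints_50%' and
# 'p9_disvis_20restraints_zscore_lt0_5_50%' (an assumption, not a specification).
PROTOCOL_RE = re.compile(
    r"^p(?P<number>\d+)"
    r"_(?P<method>coevol|disvis)"
    r"_(?P<n_restraints>\d+)restraints"
    r"(?:_(?P<zscore_tag>zscore_lt[\d_]+?))?"
    r"_(?P<removal>\d+)%$"
)


class Protocol(NamedTuple):
    number: int
    method: str
    n_restraints: int
    zscore_tag: Optional[str]  # raw tag, e.g. 'zscore_lt0_5'; None if absent
    removal_percent: int


def parse_protocol_dir(name: str) -> Protocol:
    """Decode a protocol directory name into its components."""
    m = PROTOCOL_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised protocol directory name: {name!r}")
    return Protocol(
        int(m["number"]),
        m["method"],
        int(m["n_restraints"]),
        m["zscore_tag"],
        int(m["removal"]),
    )


print(parse_protocol_dir("p6_disvis_10restraints_50%"))
print(parse_protocol_dir("p9_disvis_20restraints_zscore_lt0_5_50%"))
```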

Each protocol directory is laid out as follows:

ProtocolNumber_ContactMethod_NumberOfContacts_Removal (e.g. p2_coevol_20restraints_50%)
│   GreenID_contacts_chainA.pdb
│   GreenID_contacts_chainB.pdb
│   ambig.tbl
│   hbonds.tbl
│   run.cns
│
└─── output

Protein structures:
- GreenID_contacts_chainA.pdb: coordinates for chain A
- GreenID_contacts_chainB.pdb: coordinates for chain B
Distance restraints:
- ambig.tbl: ambiguous interaction restraints
- hbonds.tbl: unambiguous restraints defined to keep the chains together in case of a chain break
Docking input:
- run.cns: the HADDOCK parameter file defining the docking protocol and settings
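
Before starting a docking run, it can help to confirm that a protocol directory holds the inputs listed above. The sketch below is a hypothetical helper (`missing_inputs` is not part of this repository or of HADDOCK); it assumes the three fixed file names shown in the layout and only counts `*.pdb` files rather than checking their exact names.

```python
from pathlib import Path

# Fixed input file names taken from the protocol directory layout.
REQUIRED_FILES = ("ambig.tbl", "hbonds.tbl", "run.cns")


def missing_inputs(protocol_dir):
    """Return a list describing the docking inputs absent from protocol_dir."""
    root = Path(protocol_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # Two structure files are expected, one per chain.
    n_pdb = len(list(root.glob("*.pdb")))
    if n_pdb < 2:
        missing.append(f"PDB files (found {n_pdb}, expected 2)")
    return missing
```

Calling `missing_inputs` on a complete protocol directory returns an empty list; any other result names what still has to be added.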

The output directory includes selected docking output and output from the analysis scripts included in the ana_scripts directory.

- ambig.tbl: ambiguous interaction restraints used during docking
- DockQ.dat: DockQ output for all 200 models of the itw (water refinement) stage, including the i-RMSD values
- cluster.out: cluster output list for the models generated in itw
- file.list: models ranked by HADDOCK itw score
- clusters_haddock-sorted.stat_best4: clusters ranked by HADDOCK itw score
- file_nam_clust{Cluster_Number}_best4: top 4 models of each cluster

1.1.4. Analysis Scripts for Each Protocol

Each complex directory includes an ana_scripts directory (see Section 1.1.). It contains all scripts needed to produce the DockQ.dat and cluster files found in the output directory.

- target.pdb: reference complex structure with renumbered atoms and renamed chain IDs that match the HADDOCK output
- Scripts required for the analysis: cluster-fnat.csh, fraction-native.csh, i-rmsd_to_xray.csh, l-rmsd_to_xray.csh, run_all-no-it0.csh, make-target-files.csh, run_all.csh, run_all-dockQ.csh, run_dockQ.csh
- Files required for the analysis: target.contacts10, target.izoneA, target.contacts5, target.izoneB, target.izone, target.lzone

2. Docking Results

The 200 predicted models for each protocol can be found on Zenodo:
