DOCKGROUND


39956358	Dockground: The Resource Expands to Protein-RNA Interactome. [PMID: 39956358] Collins KW, Copeland MM, Kundrotas PJ, Vakser IA. Abstract RNA is a master regulator of cellular processes and will bind to many different proteins throughout its life cycle. Dysregulation of RNA and RNA-binding proteins can lead to various diseases, including cancer. To better understand molecular mechanisms of the cellular processes, it is important to characterize protein-RNA interactions at the structural level. There is a lack of experimental structures available for protein-RNA complexes due to the RNA inherent flexibility, which complicates the experimental structure determination. The scarcity of structures can be made up for with computational modeling. Dockground is a resource for development and benchmarking of structure-based modeling of protein interactions. It contains datasets focusing on different aspects of protein recognition. The foundation of all the datasets is the database of experimentally determined protein complexes, which previously contained only protein-protein assemblies. To further expand the utility of the Dockground resource, we extended the database to protein-RNA interactions. The new functionalities are available on the Dockground website at https://dockground.compbio.ku.edu/. The database can be searched using a number of criteria, including removal of redundancies at various sequence and structure similarity thresholds. The database updates with new structures from the Protein Data Bank on a weekly basis. J Mol Biol. 2025:437(15) \| 2 Citations (from Europe PMC, 2025-12-20)
36281025	Dockground resource for protein recognition studies. [PMID: 36281025] Keeley W Collins, Matthew M Copeland, Ian Kotthoff, Amar Singh, Petras J Kundrotas, Ilya A Vakser Abstract Structural information of protein-protein interactions is essential for characterization of life processes at the molecular level. While a small fraction of known protein interactions has experimentally determined structures, computational modeling of protein complexes (protein docking) has to fill the gap. The Dockground resource (http://dockground.compbio.ku.edu) provides a collection of datasets for the development and testing of protein docking techniques. Currently, Dockground contains datasets for the bound and the unbound (experimentally determined and simulated) protein structures, model-model complexes, docking decoys of experimentally determined and modeled proteins, and templates for comparative docking. The Dockground bound proteins dataset is a core set, from which other Dockground datasets are generated. It is devised as a relational PostgreSQL database containing information on experimentally determined protein-protein complexes. This report on the Dockground resource describes current status of the datasets, new automated update procedures and further development of the core datasets. We also present a new Dockground interactive web interface, which allows search by various parameters, such as release date, multimeric state, complex type, structure resolution, and so on, visualization of the search results with a number of customizable parameters, as well as downloadable datasets with predefined levels of sequence and structure redundancy. Protein Sci. 2022:31(12) \| 11 Citations (from Europe PMC, 2025-12-20)
35580077	DOCKGROUND membrane protein-protein set. [PMID: 35580077] Ian Kotthoff, Petras J Kundrotas, Ilya A Vakser Abstract Membrane proteins are significantly underrepresented in Protein Data Bank despite their essential role in cellular mechanisms and the major progress in experimental protein structure determination. Thus, computational approaches are especially valuable in the case of membrane proteins and their assemblies. The main focus in developing structure prediction techniques has been on soluble proteins, in part due to much greater availability of the structural data. Currently, structure prediction of protein complexes (protein docking) is a well-developed field of study. However, the generic protein docking approaches are not optimal for the membrane proteins because of the differences in physicochemical environment and the spatial constraints imposed by the membranes. Thus, docking of the membrane proteins requires specialized computational methods. Development and benchmarking of the membrane protein docking approaches has to be based on high-quality sets of membrane protein complexes. In this study we present a new dataset of 456 non-redundant alpha helical binary interfaces. The set is significantly larger and more representative than the previously developed sets. In the future, it will become the basis for the development of docking and scoring benchmarks, similar to the ones for soluble proteins in the Dockground resource http://dockground.compbio.ku.edu. PLoS One. 2022:17(5) \| 3 Citations (from Europe PMC, 2025-12-20)
32621232	Dockground Tool for Development and Benchmarking of Protein Docking Procedures. [PMID: 32621232] Petras J Kundrotas, Ian Kotthoff, Sherman W Choi, Matthew M Copeland, Ilya A Vakser Abstract Databases of protein-protein complexes are essential for the development of protein modeling/docking techniques. Such databases provide a knowledge base for docking algorithms, intermolecular potentials, search procedures, scoring functions, and refinement protocols. Development of docking techniques requires systematic validation of the modeling protocols on carefully curated benchmark sets of complexes. We present a description and a guide to the DOCKGROUND resource ( http://dockground.compbio.ku.edu ) for structural modeling of protein interactions. The resource integrates various datasets of protein complexes and other data for the development and testing of protein docking techniques. The sets include bound complexes, experimentally determined unbound, simulated unbound, model-model complexes, and docking decoys. The datasets are available to the user community through a Web interface. Methods Mol. Biol. 2020:2165() \| 7 Citations (from Europe PMC, 2025-12-20)
28905425	Modeling CAPRI targets 110-120 by template-based and free docking using contact potential and combined scoring function. [PMID: 28905425] Kundrotas PJ, Anishchenko I, Badal VD, Das M, Dauzhenka T, Vakser IA. Abstract The paper presents analysis of our template-based and free docking predictions in the joint CASP12/CAPRI37 round. A new scoring function for template-based docking was developed, benchmarked on the Dockground resource, and applied to the targets. The results showed that the function successfully discriminates the incorrect docking predictions. In correctly predicted targets, the scoring function was complemented by other considerations, such as consistency of the oligomeric states among templates, similarity of the biological functions, biological interface relevance, etc. The scoring function still does not distinguish well biological from crystal packing interfaces, and needs further development for the docking of bundles of α-helices. In the case of the trimeric targets, sequence-based methods did not find common templates, despite similarity of the structures, suggesting complementary use of structure- and sequence-based alignments in comparative docking. The results showed that if a good docking template is found, an accurate model of the interface can be built even from largely inaccurate models of individual subunits. Free docking however is very sensitive to the quality of the individual models. However, our newly developed contact potential detected approximate locations of the binding sites. Proteins. 2018:86 Suppl 1() \| 18 Citations (from Europe PMC, 2025-12-20)
28891124	Dockground: A comprehensive data resource for modeling of protein complexes. [PMID: 28891124] Kundrotas PJ, Anishchenko I, Dauzhenka T, Kotthoff I, Mnevets D, Copeland MM, Vakser IA. Abstract Characterization of life processes at the molecular level requires structural details of protein interactions. The number of experimentally determined structures of protein-protein complexes accounts only for a fraction of known protein interactions. This gap in structural description of the interactome has to be bridged by modeling. An essential part of the development of structural modeling/docking techniques for protein interactions is databases of protein-protein complexes. They are necessary for studying protein interfaces, providing a knowledge base for docking algorithms, and developing intermolecular potentials, search procedures, and scoring functions. Development of protein-protein docking techniques requires thorough benchmarking of different parts of the docking protocols on carefully curated sets of protein-protein complexes. We present a comprehensive description of the Dockground resource (http://dockground.compbio.ku.edu) for structural modeling of protein interactions, including previously unpublished unbound docking benchmark set 4, and the X-ray docking decoy set 2. The resource offers a variety of interconnected datasets of protein-protein complexes and other data for the development and testing of different aspects of protein docking methodologies. Based on protein-protein complexes extracted from the PDB biounit files, Dockground offers sets of X-ray unbound, simulated unbound, model, and docking decoy structures. All datasets are freely available for download, as a whole or selecting specific structures, through a user-friendly interface on one integrated website. Protein Sci. 2018:27(1) \| 65 Citations (from Europe PMC, 2025-12-20)
25712716	Protein models docking benchmark 2. [PMID: 25712716] Anishchenko I, Kundrotas PJ, Tuzikov AV, Vakser IA. Abstract Structural characterization of protein-protein interactions is essential for our ability to understand life processes. However, only a fraction of known proteins have experimentally determined structures. Such structures provide templates for modeling of a large part of the proteome, where individual proteins can be docked by template-free or template-based techniques. Still, the sensitivity of the docking methods to the inherent inaccuracies of protein models, as opposed to the experimentally determined high-resolution structures, remains largely untested, primarily due to the absence of appropriate benchmark set(s). Structures in such a set should have predefined inaccuracy levels and, at the same time, resemble actual protein models in terms of structural motifs/packing. The set should also be large enough to ensure statistical reliability of the benchmarking results. We present a major update of the previously developed benchmark set of protein models. For each interactor, six models were generated with the model-to-native Cα RMSD in the 1 to 6 Å range. The models in the set were generated by a new approach, which corresponds to the actual modeling of new protein structures in the "real case scenario," as opposed to the previous set, where a significant number of structures were model-like only. In addition, the larger number of complexes (165 vs. 63 in the previous set) increases the statistical reliability of the benchmarking. We estimated the highest accuracy of the predicted complexes (according to CAPRI criteria), which can be attained using the benchmark structures. The set is available at http://dockground.bioinformatics.ku.edu. Proteins. 2015:83(5) \| 16 Citations (from Europe PMC, 2025-12-20)
26227548	Simulated unbound structures for benchmarking of protein docking in the DOCKGROUND resource. [PMID: 26227548] Kirys T, Ruvinsky AM, Singla D, Tuzikov AV, Kundrotas PJ, Vakser IA. Abstract BACKGROUND: Proteins play an important role in biological processes in living organisms. Many protein functions are based on interaction with other proteins. The structural information is important for adequate description of these interactions. Sets of protein structures determined in both bound and unbound states are essential for benchmarking of the docking procedures. However, the number of such proteins in PDB is relatively small. A radical expansion of such sets is possible if the unbound structures are computationally simulated. RESULTS: The DOCKGROUND public resource provides data to improve our understanding of protein-protein interactions and to assist in the development of better tools for structural modeling of protein complexes, such as docking algorithms and scoring functions. A large set of simulated unbound protein structures was generated from the bound structures. The modeling protocol was based on 1 ns Langevin dynamics simulation. The simulated structures were validated on the ensemble of experimentally determined unbound and bound structures. The set is intended for large scale benchmarking of docking algorithms and scoring functions. CONCLUSIONS: A radical expansion of the unbound protein docking benchmark set was achieved by simulating the unbound structures. The simulated unbound structures were selected according to criteria from systematic comparison of experimentally determined bound and unbound structures. The set is publicly available at http://dockground.compbio.ku.edu. BMC Bioinformatics. 2015:16() \| 9 Citations (from Europe PMC, 2025-12-20)
21745398	DECK: Distance and environment-dependent, coarse-grained, knowledge-based potentials for protein-protein docking. [PMID: 21745398] Liu S, Vakser IA. Abstract BACKGROUND: Computational approaches to protein-protein docking typically include scoring aimed at improving the rank of the near-native structure relative to the false-positive matches. Knowledge-based potentials improve modeling of protein complexes by taking advantage of the rapidly increasing amount of experimentally derived information on protein-protein association. An essential element of knowledge-based potentials is defining the reference state for an optimal description of the residue-residue (or atom-atom) pairs in the non-interaction state. RESULTS: The study presents a new Distance- and Environment-dependent, Coarse-grained, Knowledge-based (DECK) potential for scoring of protein-protein docking predictions. Training sets of protein-protein matches were generated based on bound and unbound forms of proteins taken from the DOCKGROUND resource. Each residue was represented by a pseudo-atom in the geometric center of the side chain. To capture the long-range and the multi-body interactions, residues in different secondary structure elements at protein-protein interfaces were considered as different residue types. Five reference states for the potentials were defined and tested. The optimal reference state was selected and the cutoff effect on the distance-dependent potentials investigated. The potentials were validated on the docking decoys sets, showing better performance than the existing potentials used in scoring of protein-protein docking results. CONCLUSIONS: A novel residue-based statistical potential for protein-protein docking was developed and validated on docking decoy sets. The results show that the scoring function DECK can successfully identify near-native protein-protein matches and thus is useful in protein docking. In addition to the practical application of the potentials, the study provides insights into the relative utility of the reference states, the scope of the distance dependence, and the coarse-graining of the potentials. BMC Bioinformatics. 2011:12() \| 42 Citations (from Europe PMC, 2025-12-20)
20715056	Docking by structural similarity at protein-protein interfaces. [PMID: 20715056] Sinha R, Kundrotas PJ, Vakser IA. Abstract Rapid accumulation of experimental data on protein-protein complexes drives the paradigm shift in protein docking from "traditional," template free approaches to template based techniques. Homology docking algorithms based on sequence similarity between target and template complexes can account for up to 20% of known protein-protein interactions. When highly homologous templates for the target complex are not available, but the structure of the target monomers is known, docking by local structural alignment may provide an adequate solution. Such an algorithm was developed based on the structural comparison of monomers to cocrystallized interfaces. A library of the interfaces was generated from cocrystallized protein-protein complexes in PDB. The partial structure alignment algorithm was validated on the DOCKGROUND benchmark sets. The optimal performance of the partial (interface) structure alignment was achieved with the interface residues defined by 12 Å distance across the interface. Overall, the partial structure alignment yielded more accurate models than the full structure alignment. Most templates identified by the partial structure alignment had low sequence identity to the target, which makes them hard to detect by sequence-based methods. The results indicate that the structure alignment techniques provide a much needed addition to the docking arsenal, with the combined structure alignment and template free docking success rate significantly surpassing that of the free docking alone. Proteins. 2010:78(15) \| 73 Citations (from Europe PMC, 2025-12-20)
18812365	DOCKGROUND protein-protein docking decoy set. [PMID: 18812365] Liu S, Gao Y, Vakser IA. Abstract A protein-protein docking decoy set is built for the Dockground unbound benchmark set. The GRAMM-X docking scan was used to generate 100 non-native and at least one near-native match per complex for 61 complexes. The set is a publicly available resource for the development of scoring functions and knowledge-based potentials for protein docking methodologies. AVAILABILITY: The decoys are freely available for download at http://dockground.bioinformatics.ku.edu/UNBOUND/decoy/decoy.php Bioinformatics. 2008:24(22) \| 61 Citations (from Europe PMC, 2025-12-20)
18763743	Large-scale structural modeling of protein complexes at low resolution. [PMID: 18763743] Zhu Z, Tovchigrechko A, Baronova T, Gao Y, Douguet D, O'Toole N, Vakser IA. Abstract Structural aspects of protein-protein interactions provided by large-scale, genome-wide studies are essential for the description of life processes at the molecular level. A methodology is developed that applies the protein docking approach (GRAMM), based on the knowledge of experimentally determined protein-protein structures (DOCKGROUND resource) and properties of intermolecular energy landscapes, to genome-wide systems of protein interactions. The full sequence-to-structure-of-complex modeling pipeline is implemented in the Genome Wide Docking Database (GWIDD) resource. Protein interaction data are imported to GWIDD from external datasets of experimentally determined interaction networks. Essential information is extracted and unified to form the GWIDD database. Structures of individual interacting proteins in the database are retrieved (if available) or modeled, and protein complex structures are predicted by the docking program. All protein sequence, structure, and docking information is conveniently accessible through a Web interface. J Bioinform Comput Biol. 2008:6(4) \| 4 Citations (from Europe PMC, 2025-12-20)
17803215	DOCKGROUND system of databases for protein recognition studies: unbound structures for docking. [PMID: 17803215] Gao Y, Douguet D, Tovchigrechko A, Vakser IA. Abstract Computational docking approaches are important as a source of protein-protein complexes structures and as a means to understand the principles of protein association. A key element in designing better docking approaches, including search procedures, potentials, and scoring functions is their validation on experimentally determined structures. Thus, the databases of such structures (benchmark sets) are important. The previous, first release of the DOCKGROUND resource (Douguet et al., Bioinformatics 2006; 22:2612-2618) implemented a comprehensive database of cocrystallized (bound) protein-protein complexes in a relational database of annotated structures. The current release adds important features to the set of bound structures, such as regularly updated downloadable datasets: automatically generated nonredundant set, built according to most common criteria, and a manually curated set that includes only biological nonobligate complexes along with a number of additional useful characteristics. The main focus of the current release is unbound (experimental and simulated) protein-protein complexes. Complexes from the bound dataset are used to identify crystallized unbound analogs. If such analogs do not exist, the unbound structures are simulated by rotamer library optimization. Thus, the database contains comprehensive sets of complexes suitable for large scale benchmarking of docking algorithms. Advanced methodologies for simulating unbound conformations are being explored for the next release. The future releases will include datasets of modeled protein-protein complexes, and systematic sets of docking decoys obtained by different docking algorithms. The growing DOCKGROUND resource is designed to become a comprehensive public environment for developing and validating new docking methodologies. Proteins. 2007:69(4) \| 62 Citations (from Europe PMC, 2025-12-20)
16928732	DOCKGROUND resource for studying protein-protein interfaces. [PMID: 16928732] Douguet D, Chen HC, Tovchigrechko A, Vakser IA. Abstract MOTIVATION: Public resources for studying protein interfaces are necessary for better understanding of molecular recognition and developing intermolecular potentials, search procedures and scoring functions for the prediction of protein complexes. RESULTS: The first release of the DOCKGROUND resource implements a comprehensive database of co-crystallized (bound-bound) protein-protein complexes, providing foundation for the upcoming expansion to unbound (experimental and simulated) protein-protein complexes, modeled protein-protein complexes and systematic sets of docking decoys. The bound-bound part of DOCKGROUND is a relational database of annotated structures based on the Biological Unit file (Biounit) provided by the RCSB as a separated file containing probable biological molecule. DOCKGROUND is automatically updated to reflect the growth of PDB. It contains 67,220 pairwise complexes that rely on 14,913 Biounit entries from 34,778 PDB entries (January 30, 2006). The database includes a dynamic generation of non-redundant datasets of pairwise complexes based either on the structural similarity (SCOP classification) or on user-defined sequence identity. The growing DOCKGROUND resource is designed to become a comprehensive public environment for developing and validating new methodologies for modeling of protein interactions. AVAILABILITY: DOCKGROUND is available at http://dockground.bioinformatics.ku.edu. The current first release implements the bound-bound part. Bioinformatics. 2006:22(21) \| 68 Citations (from Europe PMC, 2025-12-20)

Dockground: The Resource Expands to Protein-RNA Interactome. [PMID: 39956358]

Collins KW, Copeland MM, Kundrotas PJ, Vakser IA.

Abstract

RNA is a master regulator of cellular processes and will bind to many different proteins throughout its life cycle. Dysregulation of RNA and RNA-binding proteins can lead to various diseases, including cancer. To better understand molecular mechanisms of the cellular processes, it is important to characterize protein-RNA interactions at the structural level. There is a lack of experimental structures available for protein-RNA complexes due to the RNA inherent flexibility, which complicates the experimental structure determination. The scarcity of structures can be made up for with computational modeling. Dockground is a resource for development and benchmarking of structure-based modeling of protein interactions. It contains datasets focusing on different aspects of protein recognition. The foundation of all the datasets is the database of experimentally determined protein complexes, which previously contained only protein-protein assemblies. To further expand the utility of the Dockground resource, we extended the database to protein-RNA interactions. The new functionalities are available on the Dockground website at https://dockground.compbio.ku.edu/. The database can be searched using a number of criteria, including removal of redundancies at various sequence and structure similarity thresholds. The database updates with new structures from the Protein Data Bank on a weekly basis.

J Mol Biol. 2025:437(15) | 2 Citations (from Europe PMC, 2025-12-20)

Dockground resource for protein recognition studies. [PMID: 36281025]

Keeley W Collins, Matthew M Copeland, Ian Kotthoff, Amar Singh, Petras J Kundrotas, Ilya A Vakser

Abstract

Structural information of protein-protein interactions is essential for characterization of life processes at the molecular level. While a small fraction of known protein interactions has experimentally determined structures, computational modeling of protein complexes (protein docking) has to fill the gap. The Dockground resource (http://dockground.compbio.ku.edu) provides a collection of datasets for the development and testing of protein docking techniques. Currently, Dockground contains datasets for the bound and the unbound (experimentally determined and simulated) protein structures, model-model complexes, docking decoys of experimentally determined and modeled proteins, and templates for comparative docking. The Dockground bound proteins dataset is a core set, from which other Dockground datasets are generated. It is devised as a relational PostgreSQL database containing information on experimentally determined protein-protein complexes. This report on the Dockground resource describes current status of the datasets, new automated update procedures and further development of the core datasets. We also present a new Dockground interactive web interface, which allows search by various parameters, such as release date, multimeric state, complex type, structure resolution, and so on, visualization of the search results with a number of customizable parameters, as well as downloadable datasets with predefined levels of sequence and structure redundancy.

Protein Sci. 2022:31(12) | 11 Citations (from Europe PMC, 2025-12-20)

DOCKGROUND membrane protein-protein set. [PMID: 35580077]

Ian Kotthoff, Petras J Kundrotas, Ilya A Vakser

Abstract

Membrane proteins are significantly underrepresented in Protein Data Bank despite their essential role in cellular mechanisms and the major progress in experimental protein structure determination. Thus, computational approaches are especially valuable in the case of membrane proteins and their assemblies. The main focus in developing structure prediction techniques has been on soluble proteins, in part due to much greater availability of the structural data. Currently, structure prediction of protein complexes (protein docking) is a well-developed field of study. However, the generic protein docking approaches are not optimal for the membrane proteins because of the differences in physicochemical environment and the spatial constraints imposed by the membranes. Thus, docking of the membrane proteins requires specialized computational methods. Development and benchmarking of the membrane protein docking approaches has to be based on high-quality sets of membrane protein complexes. In this study we present a new dataset of 456 non-redundant alpha helical binary interfaces. The set is significantly larger and more representative than the previously developed sets. In the future, it will become the basis for the development of docking and scoring benchmarks, similar to the ones for soluble proteins in the Dockground resource http://dockground.compbio.ku.edu.

PLoS One. 2022:17(5) | 3 Citations (from Europe PMC, 2025-12-20)

Dockground Tool for Development and Benchmarking of Protein Docking Procedures. [PMID: 32621232]

Petras J Kundrotas, Ian Kotthoff, Sherman W Choi, Matthew M Copeland, Ilya A Vakser

Abstract

Databases of protein-protein complexes are essential for the development of protein modeling/docking techniques. Such databases provide a knowledge base for docking algorithms, intermolecular potentials, search procedures, scoring functions, and refinement protocols. Development of docking techniques requires systematic validation of the modeling protocols on carefully curated benchmark sets of complexes. We present a description and a guide to the DOCKGROUND resource ( http://dockground.compbio.ku.edu ) for structural modeling of protein interactions. The resource integrates various datasets of protein complexes and other data for the development and testing of protein docking techniques. The sets include bound complexes, experimentally determined unbound, simulated unbound, model-model complexes, and docking decoys. The datasets are available to the user community through a Web interface.

Methods Mol. Biol. 2020:2165() | 7 Citations (from Europe PMC, 2025-12-20)

Modeling CAPRI targets 110-120 by template-based and free docking using contact potential and combined scoring function. [PMID: 28905425]

Kundrotas PJ, Anishchenko I, Badal VD, Das M, Dauzhenka T, Vakser IA.

Abstract

The paper presents analysis of our template-based and free docking predictions in the joint CASP12/CAPRI37 round. A new scoring function for template-based docking was developed, benchmarked on the Dockground resource, and applied to the targets. The results showed that the function successfully discriminates the incorrect docking predictions. In correctly predicted targets, the scoring function was complemented by other considerations, such as consistency of the oligomeric states among templates, similarity of the biological functions, biological interface relevance, etc. The scoring function still does not distinguish well biological from crystal packing interfaces, and needs further development for the docking of bundles of α-helices. In the case of the trimeric targets, sequence-based methods did not find common templates, despite similarity of the structures, suggesting complementary use of structure- and sequence-based alignments in comparative docking. The results showed that if a good docking template is found, an accurate model of the interface can be built even from largely inaccurate models of individual subunits. Free docking however is very sensitive to the quality of the individual models. However, our newly developed contact potential detected approximate locations of the binding sites.

Proteins. 2018:86 Suppl 1() | 18 Citations (from Europe PMC, 2025-12-20)

Dockground: A comprehensive data resource for modeling of protein complexes. [PMID: 28891124]

Kundrotas PJ, Anishchenko I, Dauzhenka T, Kotthoff I, Mnevets D, Copeland MM, Vakser IA.

Abstract

Characterization of life processes at the molecular level requires structural details of protein interactions. The number of experimentally determined structures of protein-protein complexes accounts only for a fraction of known protein interactions. This gap in structural description of the interactome has to be bridged by modeling. An essential part of the development of structural modeling/docking techniques for protein interactions is databases of protein-protein complexes. They are necessary for studying protein interfaces, providing a knowledge base for docking algorithms, and developing intermolecular potentials, search procedures, and scoring functions. Development of protein-protein docking techniques requires thorough benchmarking of different parts of the docking protocols on carefully curated sets of protein-protein complexes. We present a comprehensive description of the Dockground resource (http://dockground.compbio.ku.edu) for structural modeling of protein interactions, including previously unpublished unbound docking benchmark set 4, and the X-ray docking decoy set 2. The resource offers a variety of interconnected datasets of protein-protein complexes and other data for the development and testing of different aspects of protein docking methodologies. Based on protein-protein complexes extracted from the PDB biounit files, Dockground offers sets of X-ray unbound, simulated unbound, model, and docking decoy structures. All datasets are freely available for download, as a whole or selecting specific structures, through a user-friendly interface on one integrated website.

Protein Sci. 2018:27(1) | 65 Citations (from Europe PMC, 2025-12-20)

Protein models docking benchmark 2. [PMID: 25712716]

Anishchenko I, Kundrotas PJ, Tuzikov AV, Vakser IA.

Abstract

Structural characterization of protein-protein interactions is essential for our ability to understand life processes. However, only a fraction of known proteins have experimentally determined structures. Such structures provide templates for modeling of a large part of the proteome, where individual proteins can be docked by template-free or template-based techniques. Still, the sensitivity of the docking methods to the inherent inaccuracies of protein models, as opposed to the experimentally determined high-resolution structures, remains largely untested, primarily due to the absence of appropriate benchmark set(s). Structures in such a set should have predefined inaccuracy levels and, at the same time, resemble actual protein models in terms of structural motifs/packing. The set should also be large enough to ensure statistical reliability of the benchmarking results. We present a major update of the previously developed benchmark set of protein models. For each interactor, six models were generated with the model-to-native Cα RMSD in the 1 to 6 Å range. The models in the set were generated by a new approach, which corresponds to the actual modeling of new protein structures in the "real case scenario," as opposed to the previous set, where a significant number of structures were model-like only. In addition, the larger number of complexes (165 vs. 63 in the previous set) increases the statistical reliability of the benchmarking. We estimated the highest accuracy of the predicted complexes (according to CAPRI criteria), which can be attained using the benchmark structures. The set is available at http://dockground.bioinformatics.ku.edu.

Proteins. 2015:83(5) | 16 Citations (from Europe PMC, 2025-12-20)
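
The abstract above grades each model by its model-to-native Cα RMSD. As a rough illustration of that quantity only, the sketch below computes the RMSD between two equal-length lists of Cα coordinates. It assumes the model has already been superimposed on the native structure and that the residues are listed in matching order; the function name `ca_rmsd` is hypothetical and is not part of any Dockground tool.

```python
import math


def ca_rmsd(model_ca, native_ca):
    """Root-mean-square deviation (in the input units, e.g. Angstroms)
    between two lists of C-alpha coordinates given as (x, y, z) tuples.

    Assumption: the two structures are already superimposed and the
    coordinates are paired residue by residue.
    """
    if len(model_ca) != len(native_ca):
        raise ValueError("coordinate lists must have the same length")
    if not model_ca:
        raise ValueError("coordinate lists must not be empty")
    # Sum of squared distances over all paired atoms.
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model_ca, native_ca)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model_ca))
```

In practice the superposition step (for example a least-squares fit) would come first; it is left out here to keep the sketch short.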

Simulated unbound structures for benchmarking of protein docking in the DOCKGROUND resource. [PMID: 26227548]

Kirys T, Ruvinsky AM, Singla D, Tuzikov AV, Kundrotas PJ, Vakser IA.

Abstract

BACKGROUND: Proteins play an important role in biological processes in living organisms. Many protein functions are based on interaction with other proteins. The structural information is important for adequate description of these interactions. Sets of protein structures determined in both bound and unbound states are essential for benchmarking of the docking procedures. However, the number of such proteins in PDB is relatively small. A radical expansion of such sets is possible if the unbound structures are computationally simulated.
RESULTS: The DOCKGROUND public resource provides data to improve our understanding of protein-protein interactions and to assist in the development of better tools for structural modeling of protein complexes, such as docking algorithms and scoring functions. A large set of simulated unbound protein structures was generated from the bound structures. The modeling protocol was based on 1 ns Langevin dynamics simulation. The simulated structures were validated on the ensemble of experimentally determined unbound and bound structures. The set is intended for large scale benchmarking of docking algorithms and scoring functions.
CONCLUSIONS: A radical expansion of the unbound protein docking benchmark set was achieved by simulating the unbound structures. The simulated unbound structures were selected according to criteria from systematic comparison of experimentally determined bound and unbound structures. The set is publicly available at http://dockground.compbio.ku.edu.

BMC Bioinformatics. 2015:16 | 9 Citations (from Europe PMC, 2025-12-20)

DECK: Distance and environment-dependent, coarse-grained, knowledge-based potentials for protein-protein docking. [PMID: 21745398]

Liu S, Vakser IA.

Abstract

BACKGROUND: Computational approaches to protein-protein docking typically include scoring aimed at improving the rank of the near-native structure relative to the false-positive matches. Knowledge-based potentials improve modeling of protein complexes by taking advantage of the rapidly increasing amount of experimentally derived information on protein-protein association. An essential element of knowledge-based potentials is defining the reference state for an optimal description of the residue-residue (or atom-atom) pairs in the non-interaction state.
RESULTS: The study presents a new Distance- and Environment-dependent, Coarse-grained, Knowledge-based (DECK) potential for scoring of protein-protein docking predictions. Training sets of protein-protein matches were generated based on bound and unbound forms of proteins taken from the DOCKGROUND resource. Each residue was represented by a pseudo-atom in the geometric center of the side chain. To capture the long-range and the multi-body interactions, residues in different secondary structure elements at protein-protein interfaces were considered as different residue types. Five reference states for the potentials were defined and tested. The optimal reference state was selected and the cutoff effect on the distance-dependent potentials investigated. The potentials were validated on the docking decoys sets, showing better performance than the existing potentials used in scoring of protein-protein docking results.
CONCLUSIONS: A novel residue-based statistical potential for protein-protein docking was developed and validated on docking decoy sets. The results show that the scoring function DECK can successfully identify near-native protein-protein matches and thus is useful in protein docking. In addition to the practical application of the potentials, the study provides insights into the relative utility of the reference states, the scope of the distance dependence, and the coarse-graining of the potentials.

BMC Bioinformatics. 2011:12 | 42 Citations (from Europe PMC, 2025-12-20)
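
The DECK abstract above represents each residue by a pseudo-atom at the geometric center of its side chain. The sketch below shows one way such a center could be computed from a residue's atoms. The function name `side_chain_center`, the input layout, and the fallback to the Cα atom for residues without side-chain atoms (such as glycine) are assumptions made for illustration; the abstract does not state how those cases are handled.

```python
# Backbone atom names in PDB convention; everything else is treated
# as side chain. Hydrogens are assumed to be absent from the input.
BACKBONE = {"N", "CA", "C", "O"}


def side_chain_center(atoms):
    """Return the geometric center (x, y, z) of a residue's side chain.

    atoms: list of (atom_name, (x, y, z)) tuples for one residue.
    Assumption: if there are no side-chain atoms (e.g. glycine),
    the C-alpha position is used instead.
    """
    side = [xyz for name, xyz in atoms if name not in BACKBONE]
    if not side:
        side = [xyz for name, xyz in atoms if name == "CA"]
    if not side:
        raise ValueError("residue has neither side-chain atoms nor CA")
    n = len(side)
    # Unweighted average of the coordinates along each axis.
    return tuple(sum(xyz[i] for xyz in side) / n for i in range(3))
```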

Docking by structural similarity at protein-protein interfaces. [PMID: 20715056]

Sinha R, Kundrotas PJ, Vakser IA.

Abstract

Rapid accumulation of experimental data on protein-protein complexes drives the paradigm shift in protein docking from "traditional," template free approaches to template based techniques. Homology docking algorithms based on sequence similarity between target and template complexes can account for up to 20% of known protein-protein interactions. When highly homologous templates for the target complex are not available, but the structure of the target monomers is known, docking by local structural alignment may provide an adequate solution. Such an algorithm was developed based on the structural comparison of monomers to cocrystallized interfaces. A library of the interfaces was generated from cocrystallized protein-protein complexes in PDB. The partial structure alignment algorithm was validated on the DOCKGROUND benchmark sets. The optimal performance of the partial (interface) structure alignment was achieved with the interface residues defined by 12 Å distance across the interface. Overall, the partial structure alignment yielded more accurate models than the full structure alignment. Most templates identified by the partial structure alignment had low sequence identity to the target, which makes them hard to detect by sequence-based methods. The results indicate that the structure alignment techniques provide a much needed addition to the docking arsenal, with the combined structure alignment and template free docking success rate significantly surpassing that of the free docking alone.

Proteins. 2010:78(15) | 73 Citations (from Europe PMC, 2025-12-20)
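
The abstract above reports the best performance when interface residues are defined by a 12 Å distance across the interface. A minimal sketch of such a selection follows. It assumes a residue counts as interface if any of its atoms lies within the cutoff of any atom of the partner chain; the abstract does not say which atoms the distance refers to, and the function name `interface_residues` and the dictionary input layout are hypothetical.

```python
import math


def interface_residues(chain_a, chain_b, cutoff=12.0):
    """Return (residues_of_a, residues_of_b) lying within `cutoff`
    Angstroms of the partner chain.

    chain_a, chain_b: dict mapping a residue id to a list of
    (x, y, z) atom coordinates.
    Assumption: any-atom to any-atom distance defines the contact.
    """
    def near_partner(atoms, partner):
        # True if at least one atom pair is within the cutoff.
        return any(
            math.dist(p, q) <= cutoff
            for p in atoms
            for partner_atoms in partner.values()
            for q in partner_atoms
        )

    side_a = sorted(r for r, atoms in chain_a.items() if near_partner(atoms, chain_b))
    side_b = sorted(r for r, atoms in chain_b.items() if near_partner(atoms, chain_a))
    return side_a, side_b
```

This brute-force comparison scales with the product of the atom counts; for large complexes a spatial grid or k-d tree would be the usual replacement.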

DOCKGROUND protein-protein docking decoy set. [PMID: 18812365]

Liu S, Gao Y, Vakser IA.

Abstract

A protein-protein docking decoy set is built for the Dockground unbound benchmark set. The GRAMM-X docking scan was used to generate 100 non-native and at least one near-native match per complex for 61 complexes. The set is a publicly available resource for the development of scoring functions and knowledge-based potentials for protein docking methodologies.
AVAILABILITY: The decoys are freely available for download at http://dockground.bioinformatics.ku.edu/UNBOUND/decoy/decoy.php

Bioinformatics. 2008:24(22) | 61 Citations (from Europe PMC, 2025-12-20)

Large-scale structural modeling of protein complexes at low resolution. [PMID: 18763743]

Zhu Z, Tovchigrechko A, Baronova T, Gao Y, Douguet D, O'Toole N, Vakser IA.

Abstract

Structural aspects of protein-protein interactions provided by large-scale, genome-wide studies are essential for the description of life processes at the molecular level. A methodology is developed that applies the protein docking approach (GRAMM), based on the knowledge of experimentally determined protein-protein structures (DOCKGROUND resource) and properties of intermolecular energy landscapes, to genome-wide systems of protein interactions. The full sequence-to-structure-of-complex modeling pipeline is implemented in the Genome Wide Docking Database (GWIDD) resource. Protein interaction data are imported to GWIDD from external datasets of experimentally determined interaction networks. Essential information is extracted and unified to form the GWIDD database. Structures of individual interacting proteins in the database are retrieved (if available) or modeled, and protein complex structures are predicted by the docking program. All protein sequence, structure, and docking information is conveniently accessible through a Web interface.

J Bioinform Comput Biol. 2008:6(4) | 4 Citations (from Europe PMC, 2025-12-20)

DOCKGROUND system of databases for protein recognition studies: unbound structures for docking. [PMID: 17803215]

Gao Y, Douguet D, Tovchigrechko A, Vakser IA.

Abstract

Computational docking approaches are important as a source of protein-protein complexes structures and as a means to understand the principles of protein association. A key element in designing better docking approaches, including search procedures, potentials, and scoring functions is their validation on experimentally determined structures. Thus, the databases of such structures (benchmark sets) are important. The previous, first release of the DOCKGROUND resource (Douguet et al., Bioinformatics 2006; 22:2612-2618) implemented a comprehensive database of cocrystallized (bound) protein-protein complexes in a relational database of annotated structures. The current release adds important features to the set of bound structures, such as regularly updated downloadable datasets: automatically generated nonredundant set, built according to most common criteria, and a manually curated set that includes only biological nonobligate complexes along with a number of additional useful characteristics. The main focus of the current release is unbound (experimental and simulated) protein-protein complexes. Complexes from the bound dataset are used to identify crystallized unbound analogs. If such analogs do not exist, the unbound structures are simulated by rotamer library optimization. Thus, the database contains comprehensive sets of complexes suitable for large scale benchmarking of docking algorithms. Advanced methodologies for simulating unbound conformations are being explored for the next release. The future releases will include datasets of modeled protein-protein complexes, and systematic sets of docking decoys obtained by different docking algorithms. The growing DOCKGROUND resource is designed to become a comprehensive public environment for developing and validating new docking methodologies.

Proteins. 2007:69(4) | 62 Citations (from Europe PMC, 2025-12-20)

DOCKGROUND resource for studying protein-protein interfaces. [PMID: 16928732]

Douguet D, Chen HC, Tovchigrechko A, Vakser IA.

Abstract

MOTIVATION: Public resources for studying protein interfaces are necessary for better understanding of molecular recognition and developing intermolecular potentials, search procedures and scoring functions for the prediction of protein complexes.
RESULTS: The first release of the DOCKGROUND resource implements a comprehensive database of co-crystallized (bound-bound) protein-protein complexes, providing foundation for the upcoming expansion to unbound (experimental and simulated) protein-protein complexes, modeled protein-protein complexes and systematic sets of docking decoys. The bound-bound part of DOCKGROUND is a relational database of annotated structures based on the Biological Unit file (Biounit) provided by the RCSB as a separated file containing probable biological molecule. DOCKGROUND is automatically updated to reflect the growth of PDB. It contains 67,220 pairwise complexes that rely on 14,913 Biounit entries from 34,778 PDB entries (January 30, 2006). The database includes a dynamic generation of non-redundant datasets of pairwise complexes based either on the structural similarity (SCOP classification) or on user-defined sequence identity. The growing DOCKGROUND resource is designed to become a comprehensive public environment for developing and validating new methodologies for modeling of protein interactions.
AVAILABILITY: DOCKGROUND is available at http://dockground.bioinformatics.ku.edu. The current first release implements the bound-bound part.

Bioinformatics. 2006:22(21) | 68 Citations (from Europe PMC, 2025-12-20)

URL:	http://dockground.compbio.ku.edu
Full name:	DOCKGROUND
Description:	Adequate computational techniques for modeling of protein interactions are important because of the growing number of known protein 3D structures, particularly in the context of structural genomics. The Dockground project is designed to provide resources for the development of such techniques and to increase our knowledge of protein interfaces. Dockground datasets are regularly updated and annotated.
Year founded:	2006
Last update:
Version:
Accessibility:	Accessible
Country/Region:	United States

Data type:	Protein
Data object:	NA
Database category:	Interaction
Major species:	NA
Keywords:	DOCKING BENCHMARKS, DOCKING DECOYS, DOCKING TEMPLATES, protein-protein interaction

University/Institution:	University of Kansas
Address:	Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
City:	Lawrence
Province/State:	Kansas
Country/Region:	United States
Contact name (PI/Team):	Petras Kundrotas
Contact email (PI/Helpdesk):	dockground@ku.edu

Database Commons
a catalog of worldwide biological databases
