Introduction

MOTIVATION: Protein structure alignment is key for transferring information from well-studied proteins to less studied ones. Structural alignment identifies the most precise mapping of equivalent residues, as structures are more conserved during evolution than sequences. Among the methods for aligning protein structures, maximum Contact Map Overlap (CMO) has received sustained attention during the past decade. Yet, known algorithms exhibit modest performance and are not applicable for large-scale comparison. RESULTS: Graphlets are small induced subgraphs that are used to design sensitive topological similarity measures between nodes and networks. By generalizing graphlets to ordered graphs, we introduce GR-Align, a CMO heuristic that is suited for database searches. On the Proteus_300 set (44 850 protein domain pairs), GR-Align is several orders of magnitude faster than the state-of-the-art CMO solvers Apurva, MSVNS and AlEigen7, and its similarity score is in better agreement with the structural classification of proteins. On a large-scale experiment on the Gold-standard benchmark dataset (3 207 270 protein domain pairs), GR-Align is several orders of magnitude faster than the state-of-the-art protein structure comparison tools TM-Align, DaliLite, MATT and Yakusa, while achieving similar classification performances. Finally, we illustrate the difference between GR-Align's flexible alignments and the traditional ones by querying a flexible protein in the Astral-40 database (11 154 protein domains). In this experiment, GR-Align's top scoring alignments are not only in better agreement with structural classification of proteins, but also that they allow transferring more information across proteins.

Publications

  1. GR-Align: fast and flexible alignment of protein 3D structures using graphlet degree similarity.
    Cite this
    Malod-Dognin N, Pržulj N, 2014-05-01 - Bioinformatics (Oxford, England)

Credits

  1. Noël Malod-Dognin
    Developer

    Department of Computing, Imperial College London

  2. Nataša Pržulj
    Investigator

Community Ratings

UsabilityEfficiencyReliabilityRated By
0 user
Sign in to rate
Summary
AccessionBT003855
Tool TypeApplication
Category
PlatformsLinux/Unix
TechnologiesC++
User InterfaceTerminal Command Line
Download Count0
Submitted ByNataša Pržulj