Introduction

Graphical models are often employed to interpret patterns of correlations observed in data through a network of interactions between the variables. Recently, Ising/Potts models, also known as Markov random fields, have been productively applied to diverse problems in biology, including the prediction of structural contacts from protein sequence data and the description of neural activity patterns. However, inference of such models is a challenging computational problem that cannot be solved exactly. Here, we describe the adaptive cluster expansion (ACE) method to quickly and accurately infer Ising or Potts models based on correlation data. ACE avoids overfitting by constructing a sparse network of interactions sufficient to reproduce the observed correlation data within the statistical error expected due to finite sampling. When convergence of the ACE algorithm is slow, we combine it with a Boltzmann Machine Learning algorithm (BML). We illustrate this method on a variety of biological and artificial datasets and compare it to state-of-the-art approximate methods such as Gaussian and pseudo-likelihood inference.

We show that ACE accurately reproduces the true parameters of the underlying model when they are known, and yields accurate statistical descriptions of both biological and artificial data. Models inferred by ACE more accurately describe the statistics of the data, including both the constrained low-order correlations and unconstrained higher-order correlations, compared to those obtained by faster Gaussian and pseudo-likelihood methods. These alternative approaches can recover the structure of the interaction network but typically not the correct strength of interactions, resulting in less accurate generative models.

The ACE source code, user manual and tutorials with the example data and filtered correlations described herein are freely available on GitHub at https://github.com/johnbarton/ACE

Contacts: jpbarton@mit.edu, cocco@lps.ens.fr

Supplementary information: Supplementary data are available at Bioinformatics online.
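
The abstract describes fitting a model only until it reproduces the observed correlations "within the statistical error expected due to finite sampling". The sketch below illustrates that idea for binary (Ising-type) data. It is not taken from the ACE code base. The function names `correlations_with_error` and `within_sampling_error`, the binomial standard-error formula, and the `tol` parameter are assumptions made for illustration, and ACE's actual stopping criterion may differ.

```python
import numpy as np

def correlations_with_error(samples):
    """Estimate one- and two-site frequencies from binary samples.

    samples: array of shape (B, N) with entries 0 or 1
             (B configurations, N variables).
    Returns (p1, p2, err1, err2), where err1 and err2 are binomial
    standard errors, sqrt(p * (1 - p) / B). This is an illustrative
    estimate of finite-sampling noise, not necessarily the one ACE uses.
    """
    samples = np.asarray(samples, dtype=float)
    B, N = samples.shape
    p1 = samples.mean(axis=0)        # p_i  = <s_i>
    p2 = samples.T @ samples / B     # p_ij = <s_i s_j>; diagonal equals p_i
    err1 = np.sqrt(p1 * (1.0 - p1) / B)
    err2 = np.sqrt(p2 * (1.0 - p2) / B)
    return p1, p2, err1, err2

def within_sampling_error(p_model, p_data, err, tol=1.0):
    """True if every model statistic lies within tol * err of the data.

    Hypothetical acceptance test: a model passing it reproduces the data
    as well as finite sampling allows, so fitting more closely would
    mainly fit noise.
    """
    return bool(np.all(np.abs(p_model - p_data) <= tol * err))

# Usage on a tiny hand-made data set (4 configurations, 2 variables).
samples = np.array([[1, 0],
                    [1, 1],
                    [0, 0],
                    [1, 0]])
p1, p2, err1, err2 = correlations_with_error(samples)
print("one-site frequencies:", p1)
print("pair frequency p_01:", p2[0, 1])
print("one-site standard errors:", err1)
```

With only four samples the standard errors are large, so many different models would pass the test. As the number of samples grows the error shrinks as 1/sqrt(B) and the test becomes more demanding.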

Publications

  1. ACE: adaptive cluster expansion for maximum entropy graphical model inference.
    Barton JP, De Leonardis E, Coucke A, Cocco S, 2016-10-01 - Bioinformatics (Oxford, England)

Credits

  1. J P Barton
    Developer

    Departments of Chemical Engineering and Physics, Massachusetts Institute of Technology, United States of America

  2. E De Leonardis
    Developer

    Laboratoire de Physique Statistique de L'Ecole Normale Supérieure, CNRS, France

  3. A Coucke
    Developer

    Computational and Quantitative Biology, UPMC, France

  4. S Cocco
    Investigator

    Laboratoire de Physique Statistique de L'Ecole Normale Supérieure, CNRS, France

Summary

Accession: BT000029
Tool Type: Application
Category:
Platforms: Linux/Unix
Technologies:
User Interface: Terminal Command Line
Download Count: 0
Country/Region: France
Submitted By: S Cocco