Introduction

The computational reconstruction of ancestral proteins provides information on past biological events and has practical implications for biomedicine and biotechnology. Currently available tools for ancestral sequence reconstruction (ASR) are often based on empirical amino acid substitution models that assume that all sites evolve at the same rate and under the same process. However, this assumption is frequently violated because protein evolution is highly heterogeneous, with selective constraints differing among sites. Here, we present ProtASR, a new evolutionary framework to infer ancestral protein sequences while accounting for selection on protein folding stability. In the first step, ProtASR generates site-specific substitution matrices through the structurally constrained mean-field (MF) substitution model, which considers stability against both unfolding and misfolding. We previously showed that MF models outperform empirical amino acid substitution models, as well as other structurally constrained substitution models, both in terms of likelihood and in correctly inferring amino acid distributions across sites. In the second step, ProtASR adapts a well-established maximum-likelihood (ML) ASR procedure to infer ancestral proteins under MF models. A known bias of ML ASR methods is that they tend to overestimate the stability of ancestral proteins by underestimating the frequency of deleterious mutations. We compared ProtASR under MF with two empirical substitution models (JTT and CAT) by reconstructing the ancestral sequences of simulated proteins. ProtASR yields reconstructed proteins with less biased stabilities, which are significantly closer to those of the simulated proteins. Analysis of extant protein families suggests that folding stability evolves through time across protein families, potentially reflecting neutral fluctuation: some families exhibit a more constant folding stability, while others are more variable.
ProtASR is freely available from https://github.com/miguelarenas/protasr and includes detailed documentation and ready-to-use examples. Run times range from seconds to minutes, depending on protein length and alignment size.
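
To make the contrast between one shared substitution process and site-specific processes concrete, here is a minimal Python sketch. It is not ProtASR's code and does not implement the MF model: it swaps in a simple F81-style matrix built from a per-site frequency vector, uses a toy three-letter alphabet, and computes the marginal posterior of the ancestral state at the root of a star tree. All function names, frequencies, and branch lengths are illustrative assumptions.

```python
import math


def transition_matrix(freqs, t):
    """F81-style transition probabilities for one site over branch length t.

    P[i][j] = exp(-t) * [i == j] + (1 - exp(-t)) * freqs[j]
    This is a simplification assumed for illustration, not the MF model.
    """
    decay = math.exp(-t)
    n = len(freqs)
    return [
        [(decay if i == j else 0.0) + (1.0 - decay) * freqs[j] for j in range(n)]
        for i in range(n)
    ]


def root_posterior(freqs, leaf_states, branch_lengths):
    """Marginal posterior of the root state for a star tree with observed leaves.

    The prior on the root is the site's stationary frequency vector.
    """
    weights = list(freqs)
    for state, t in zip(leaf_states, branch_lengths):
        p = transition_matrix(freqs, t)
        for a in range(len(freqs)):
            # Probability of ending in the observed leaf state, starting from a.
            weights[a] *= p[a][state]
    total = sum(weights)
    return [w / total for w in weights]


def ml_state(posterior):
    """Index of the most probable state (first index wins on ties)."""
    return max(range(len(posterior)), key=posterior.__getitem__)


# Toy data (hypothetical): two leaves observed in states 0 and 1.
ALPHABET = ["A", "V", "D"]
leaves = [0, 1]
branches = [0.5, 0.5]

shared = [1 / 3, 1 / 3, 1 / 3]   # same process assumed at every site
site_specific = [0.7, 0.2, 0.1]  # hypothetical frequencies for one constrained site

post_shared = root_posterior(shared, leaves, branches)
post_site = root_posterior(site_specific, leaves, branches)

print("shared model posterior:       ", [round(x, 3) for x in post_shared])
print("site-specific model posterior:", [round(x, 3) for x in post_site])
print("site-specific ML state:", ALPHABET[ml_state(post_site)])
```

With identical observations, the uniform frequencies leave the two observed states equally probable at the root, whereas the site-specific frequencies favor the state that the site's constraints make more likely. This is the kind of difference that site-specific matrices are meant to capture, here shown in a deliberately reduced setting.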

Publications

  1. ProtASR: An Evolutionary Framework for Ancestral Protein Reconstruction with Selection on Folding Stability.
    Arenas M, Weber CC, Liberles DA, Bastolla U. Systematic Biology, 2017-01-01

Credits

  1. Miguel Arenas
    Developer

    Department of Biochemistry, Genetics and Immunology, Spain

  2. Claudia C Weber
    Developer

    Department of Biology and Center for Computational Genetics and Genomics, Temple University, United States of America

  3. David A Liberles
    Developer

    Department of Biology and Center for Computational Genetics and Genomics, Temple University, United States of America

  4. Ugo Bastolla
    Investigator

    Centre for Molecular Biology Severo Ochoa (CBMSO), Consejo Superior de Investigaciones Científicas (CSIC), Spain

Summary

  Accession: BT002321
  Tool Type: Application
  Category: not specified
  Platforms: Linux/Unix
  Technologies: C, Perl
  User Interface: Terminal Command Line
  Download Count: 0
  Country/Region: Spain
  Submitted By: Ugo Bastolla