Rfam



Rfam 15: RNA families database in 2025. [PMID: 39526405]

Ontiveros-Palacios N, Cooke E, Nawrocki EP, Triebel S, Marz M, Rivas E, Griffiths-Jones S, Petrov AI, Bateman A, Sweeney B.

Abstract

The Rfam database, a widely used repository of non-coding RNA families, has undergone significant updates in release 15.0. This paper introduces major improvements, including the expansion of Rfamseq to 26 106 genomes, a 76% increase, incorporating the latest UniProt reference proteomes and additional viral genomes. Sixty-five RNA families were enhanced using experimentally determined 3D structures, improving the accuracy of consensus secondary structures and annotations. R-scape covariation analysis was used to refine structural predictions in 26 families. Gene Ontology (GO) and Sequence Ontology annotations were comprehensively updated, increasing GO term coverage to 75% of families. The release adds 14 new Hepatitis C Virus RNA families and completes microRNA family synchronization with miRBase, resulting in 1603 microRNA families. New data types, including FULL alignments, have been implemented. Integration with APICURON for improved curator attribution and multiple website enhancements further improve user experience. These updates significantly expand Rfam's coverage and improve annotation quality, reinforcing its critical role in RNA research, genome annotation and the development of machine learning models. Rfam is freely available at https://rfam.org.

Nucleic Acids Res. 2025:53(D1) | 56 Citations (from Europe PMC, 2025-12-13)

Rfam 14: expanded coverage of metagenomic, viral and microRNA families. [PMID: 33211869]

Kalvari I, Nawrocki EP, Ontiveros-Palacios N, Argasinska J, Lamkiewicz K, Marz M, Griffiths-Jones S, Toffano-Nioche C, Gautheret D, Weinberg Z, Rivas E, Eddy SR, Finn RD, Bateman A, Petrov AI.

Abstract

Rfam is a database of RNA families where each of the 3444 families is represented by a multiple sequence alignment of known RNA sequences and a covariance model that can be used to search for additional members of the family. Recent developments have involved expert collaborations to improve the quality and coverage of Rfam data, focusing on microRNAs, viral and bacterial RNAs. We have completed the first phase of synchronising microRNA families in Rfam and miRBase, creating 356 new Rfam families and updating 40. We established a procedure for comprehensive annotation of viral RNA families starting with Flavivirus and Coronaviridae RNAs. We have also increased the coverage of bacterial and metagenome-based RNA families from the ZWD database. These developments have enabled a significant growth of the database, with the addition of 759 new families in Rfam 14. To facilitate further community contribution to Rfam, expert users are now able to build and submit new families using the newly developed Rfam Cloud family curation system. New Rfam website features include a new sequence similarity search powered by RNAcentral, as well as search and visualisation of families with pseudoknots. Rfam is freely available at https://rfam.org.

Nucleic Acids Res. 2021:49(D1) | 729 Citations (from Europe PMC, 2025-12-13)

Rfam 13.0: shifting to a genome-centric resource for non-coding RNA families. [PMID: 29112718]

Kalvari I, Argasinska J, Quinones-Olvera N, Nawrocki EP, Rivas E, Eddy SR, Bateman A, Finn RD, Petrov AI.

Abstract

The Rfam database is a collection of RNA families in which each family is represented by a multiple sequence alignment, a consensus secondary structure, and a covariance model. In this paper we introduce Rfam release 13.0, which switches to a new genome-centric approach that annotates a non-redundant set of reference genomes with RNA families. We describe new web interface features including faceted text search and R-scape secondary structure visualizations. We discuss a new literature curation workflow and a pipeline for building families based on RNAcentral. There are 236 new families in release 13.0, bringing the total number of families to 2687. The Rfam website is http://rfam.org.

Nucleic Acids Res. 2018:46(D1) | 681 Citations (from Europe PMC, 2025-12-13)
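
The three components named in the abstract above (multiple sequence alignment, consensus secondary structure, covariance model) can be made concrete with a small example. The following is a minimal sketch, not Rfam's own tooling: it assumes the alignment is stored in Stockholm format with the consensus structure on a `#=GC SS_cons` line, and it only handles nested bracket pairs, ignoring pseudoknot letters. The toy family, its accession `RF99999`, and its sequences are invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy alignment in Stockholm format. The accession, names and
# sequences are invented for illustration and are not real Rfam data.
TOY_STOCKHOLM = """\
# STOCKHOLM 1.0
#=GF AC   RF99999
#=GF ID   toy_hairpin
seq1         GGGAAAUCCC
seq2         GCGAAAUCGC
#=GC SS_cons <<<....>>>
//
"""

# Bracket characters treated as nested base pairs in this sketch.
OPEN_TO_CLOSE = {"<": ">", "(": ")", "[": "]", "{": "}"}
CLOSE_TO_OPEN = {close: open_ for open_, close in OPEN_TO_CLOSE.items()}


def parse_stockholm(text: str) -> Tuple[Dict[str, str], Dict[str, str], str]:
    """Return (#=GF annotations, aligned sequences, SS_cons string)."""
    meta: Dict[str, str] = {}
    seqs: Dict[str, str] = {}
    ss_cons = ""
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line or line == "//" or line.startswith("# STOCKHOLM"):
            continue
        fields = line.split(None, 2)
        if fields[0] == "#=GF" and len(fields) == 3:
            meta[fields[1]] = fields[2]
        elif fields[0] == "#=GC" and len(fields) == 3 and fields[1] == "SS_cons":
            ss_cons += fields[2]  # concatenate in case the file is interleaved
        elif line.startswith("#"):
            continue  # other annotation lines are skipped in this sketch
        else:
            name, chunk = line.split(None, 1)
            seqs[name] = seqs.get(name, "") + chunk.strip()
    return meta, seqs, ss_cons


def base_pairs(ss_cons: str) -> List[Tuple[int, int]]:
    """Return 0-based (i, j) column pairs implied by matching brackets."""
    stacks: Dict[str, List[int]] = {open_: [] for open_ in OPEN_TO_CLOSE}
    pairs: List[Tuple[int, int]] = []
    for col, ch in enumerate(ss_cons):
        if ch in stacks:
            stacks[ch].append(col)
        elif ch in CLOSE_TO_OPEN:
            pairs.append((stacks[CLOSE_TO_OPEN[ch]].pop(), col))
    return sorted(pairs)


meta, seqs, ss_cons = parse_stockholm(TOY_STOCKHOLM)
pairs = base_pairs(ss_cons)
for name, seq in seqs.items():
    # Show which nucleotides occupy each paired column in each sequence.
    print(name, [seq[i] + seq[j] for i, j in pairs])
```

In the toy data, columns 1 and 8 hold G-C in one sequence and C-G in the other. Compensatory changes of this kind are the signal that covariation analysis looks for when supporting a proposed consensus structure.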

Non-Coding RNA Analysis Using the Rfam Database. [PMID: 29927072]

Kalvari I, Nawrocki EP, Argasinska J, Quinones-Olvera N, Finn RD, Bateman A, Petrov AI.

Abstract

Rfam is a database of non-coding RNA families in which each family is represented by a multiple sequence alignment, a consensus secondary structure, and a covariance model. Using a combination of manual and literature-based curation and a custom software pipeline, Rfam converts descriptions of RNA families found in the scientific literature into computational models that can be used to annotate RNAs belonging to those families in any DNA or RNA sequence. Valuable research outputs that are often locked up in figures and supplementary information files are encapsulated in Rfam entries and made accessible through the Rfam Web site. The data produced by Rfam have a broad application, from genome annotation to providing training sets for algorithm development. This article gives an overview of how to search and navigate the Rfam Web site, and how to annotate sequences with RNA families. The Rfam database is freely available at http://rfam.org. © 2018 by John Wiley & Sons, Inc.

Curr Protoc Bioinformatics. 2018:62(1) | 315 Citations (from Europe PMC, 2025-12-13)

Rfam: annotating families of non-coding RNA sequences. [PMID: 25577390]

Daub J, Eberhardt RY, Tate JG, Burge SW.

Abstract

The primary task of the Rfam database is to collate experimentally validated noncoding RNA (ncRNA) sequences from the published literature and facilitate the prediction and annotation of new homologues in novel nucleotide sequences. We group homologous ncRNA sequences into "families" and related families are further grouped into "clans." We collate and manually curate data cross-references for these families from other databases and external resources. Our Web site offers researchers a simple interface to Rfam and provides tools with which to annotate their own sequences using our covariance models (CMs), through our tools for searching, browsing, and downloading information on Rfam families. In this chapter, we will work through examples of annotating a query sequence, collating family information, and searching for data.

Methods Mol Biol. 2015:1269() | 27 Citations (from Europe PMC, 2025-12-20)

Rfam 12.0: updates to the RNA families database. [PMID: 25392425]

Nawrocki EP, Burge SW, Bateman A, Daub J, Eberhardt RY, Eddy SR, Floden EW, Gardner PP, Jones TA, Tate J, Finn RD.

Abstract

The Rfam database (available at http://rfam.xfam.org) is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources. In this article, we detail updates and improvements to the Rfam data and website for the Rfam 12.0 release. We describe the upgrade of our search pipeline to use Infernal 1.1 and demonstrate its improved homology detection ability by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and we illustrate its ability to annotate RNAs in genomic and metagenomic data sets of various sizes. Rfam has been expanded to include 260 new families, including the well-studied large subunit ribosomal RNA family, and for the first time includes information on short sequence- and structure-based RNA motifs present within families. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

Nucleic Acids Res. 2015:43(Database issue) | 807 Citations (from Europe PMC, 2025-12-20)

Rfam 11.0: 10 years of RNA families. [PMID: 23125362]

Burge SW, Daub J, Eberhardt R, Tate J, Barquist L, Nawrocki EP, Eddy SR, Gardner PP, Bateman A.

Abstract

The Rfam database (available via the website at http://rfam.sanger.ac.uk and through our mirror at http://rfam.janelia.org) is a collection of non-coding RNA families, primarily RNAs with a conserved RNA secondary structure, including both RNA genes and mRNA cis-regulatory elements. Each family is represented by a multiple sequence alignment, predicted secondary structure and covariance model. Here we discuss updates to the database in the latest release, Rfam 11.0, including the introduction of genome-based alignments for large families, the introduction of the Rfam Biomart as well as other user interface improvements. Rfam is available under the Creative Commons Zero license.

Nucleic Acids Res. 2013:41(Database issue) | 591 Citations (from Europe PMC, 2025-12-20)

Rfam: Wikipedia, clans and the "decimal" release. [PMID: 21062808]

Gardner PP, Daub J, Tate J, Moore BL, Osuch IH, Griffiths-Jones S, Finn RD, Nawrocki EP, Kolbe DL, Eddy SR, Bateman A.

Abstract

The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at http://rfam.sanger.ac.uk.

Nucleic Acids Res. 2011:39(Database issue) | 303 Citations (from Europe PMC, 2025-12-20)

Rfam: updates to the RNA families database. [PMID: 18953034]

Gardner PP, Daub J, Tate JG, Nawrocki EP, Kolbe DL, Lindgreen S, Wilkinson AC, Finn RD, Griffiths-Jones S, Eddy SR, Bateman A.

Abstract

Rfam is a collection of RNA sequence families, represented by multiple sequence alignments and covariance models (CMs). The primary aim of Rfam is to annotate new members of known RNA families on nucleotide sequences, particularly complete genomes, using sensitive BLAST filters in combination with CMs. A minority of families with a very broad taxonomic range (e.g. tRNA and rRNA) provide the majority of the sequence annotations, whilst the majority of Rfam families (e.g. snoRNAs and miRNAs) have a limited taxonomic range and provide a limited number of annotations. Recent improvements to the website, methodologies and data used by Rfam are discussed. Rfam is freely available on the Web at http://rfam.sanger.ac.uk/ and http://rfam.janelia.org/.

Nucleic Acids Res. 2009:37(Database issue) | 676 Citations (from Europe PMC, 2025-12-20)

Rfam: annotating non-coding RNAs in complete genomes. [PMID: 15608160]

Griffiths-Jones S, Moxon S, Marshall M, Khanna A, Eddy SR, Bateman A.

Abstract

Rfam is a comprehensive collection of non-coding RNA (ncRNA) families, represented by multiple sequence alignments and profile stochastic context-free grammars. Rfam aims to facilitate the identification and classification of new members of known sequence families, and distributes annotation of ncRNAs in over 200 complete genome sequences. The data provide the first glimpses of conservation of multiple ncRNA families across a wide taxonomic range. A small number of large families are essential in all three kingdoms of life, with large numbers of smaller families specific to certain taxa. Recent improvements in the database are discussed, together with challenges for the future. Rfam is available on the Web at http://www.sanger.ac.uk/Software/Rfam/ and http://rfam.wustl.edu/.

Nucleic Acids Res. 2005:33(Database issue) | 1203 Citations (from Europe PMC, 2025-12-20)

Rfam: an RNA family database. [PMID: 12520045]

Griffiths-Jones S, Bateman A, Marshall M, Khanna A, Eddy SR.

Abstract

Rfam is a collection of multiple sequence alignments and covariance models representing non-coding RNA families. Rfam is available on the web in the UK at http://www.sanger.ac.uk/Software/Rfam/ and in the US at http://rfam.wustl.edu/. These websites allow the user to search a query sequence against a library of covariance models, and view multiple sequence alignments and family annotation. The database can also be downloaded in flatfile form and searched locally using the INFERNAL package (http://infernal.wustl.edu/). The first release of Rfam (1.0) contains 25 families, which annotate over 50 000 non-coding RNA genes in the taxonomic divisions of the EMBL nucleotide database.

Nucleic Acids Res. 2003:31(1) | 1089 Citations (from Europe PMC, 2025-12-20)
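
The local search workflow mentioned in the abstract above produces hit tables that usually need post-processing. As a rough sketch, the code below reads the tabular output that current Infernal releases (1.1 and later) write when `cmscan` is run with `--tblout`. It assumes the default layout of 18 whitespace-separated columns (target name, accession, query name, accession, mdl, mdl from, mdl to, seq from, seq to, strand, trunc, pass, gc, bias, score, E-value, inc, description). Check that column order against the Infernal user guide for the version in use. The sample rows, including the contig name and all scores, are invented for illustration.

```python
from dataclasses import dataclass
from typing import List

# Invented sample in the assumed default cmscan --tblout layout.
# The hits, coordinates and scores are illustrative, not real results.
TOY_TBLOUT = """\
#target name  accession query name   accession mdl mdl from mdl to seq from seq to strand trunc pass gc bias score E-value inc description of target
#------------ --------- ------------ --------- --- -------- ------ -------- ------ ------ ----- ---- -- ---- ----- ------- --- ---------------------
tRNA          RF00005   toy_contig_1 -         cm         1     71      120    191      +    no    1 0.55  0.0  62.3 1.2e-12 !   tRNA
5S_rRNA       RF00001   toy_contig_1 -         cm         1    119      900    785      -    no    1 0.52  0.1  21.4    0.03 ?   5S ribosomal RNA
"""


@dataclass
class Hit:
    family: str        # target name (the matching model when using cmscan)
    accession: str     # model accession, e.g. RF00005
    query: str         # name of the sequence that was scanned
    seq_from: int      # start coordinate on the query sequence
    seq_to: int        # end coordinate (smaller than seq_from on '-' strand)
    strand: str
    score: float       # bit score
    evalue: float
    significant: bool  # True when the 'inc' column is '!'


def parse_tblout(text: str) -> List[Hit]:
    """Parse the assumed 18-column tblout layout, skipping '#' comment lines."""
    hits: List[Hit] = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        # Split at most 17 times so a multi-word description stays in one field.
        f = line.split(None, 17)
        if len(f) < 17:
            raise ValueError(f"unexpected tblout line: {line!r}")
        hits.append(Hit(
            family=f[0], accession=f[1], query=f[2],
            seq_from=int(f[7]), seq_to=int(f[8]), strand=f[9],
            score=float(f[14]), evalue=float(f[15]), significant=(f[16] == "!"),
        ))
    return hits


hits = parse_tblout(TOY_TBLOUT)
for hit in hits:
    if hit.significant:
        print(f"{hit.query}:{hit.seq_from}-{hit.seq_to} ({hit.strand}) "
              f"{hit.accession} {hit.family} score={hit.score}")
```

Splitting on whitespace rather than fixed column offsets keeps the parser tolerant of the variable column widths Infernal uses for long names.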

URL:	https://rfam.org
Full name:	RNA Families
Description:	The Rfam database, a widely used repository of non-coding RNA families, has undergone significant updates in release 15.0. This paper introduces major improvements, including the expansion of Rfamseq to 26 106 genomes, a 76% increase, incorporating the latest UniProt reference proteomes and additional viral genomes. Sixty-five RNA families were enhanced using experimentally determined 3D structures, improving the accuracy of consensus secondary structures and annotations. R-scape covariation analysis was used to refine structural predictions in 26 families. Gene Ontology (GO) and Sequence Ontology annotations were comprehensively updated, increasing GO term coverage to 75% of families.
Year founded:	2003
Last update:	2024-09
Version:	15.0
Accessibility:	Accessible
Country/Region:	United Kingdom

Data type:	RNA
Data object:	Animal Archaea Bacteria Fungi Plant Virus
Database category:	Gene genome and annotation
Major species:	Homo sapiens Escherichia coli Drosophila melanogaster Mus musculus Caenorhabditis elegans Bacillus subtilis
Keywords:	covariance model ncRNA RNA family RNA motif secondary structure sequence alignment

University/Institution:	European Bioinformatics Institute
Address:	Wellcome Genome Campus, Hinxton, Cambridge, UK
City:	Cambridge
Province/State:	Cambridgeshire
Country/Region:	United Kingdom
Contact name (PI/Team):	Blake Sweeney
Contact email (PI/Helpdesk):	bsweeney@ebi.ac.uk

Database Commons
a catalog of worldwide biological databases
