RNA editing is a crucial co-/post-transcriptional modification of RNA that plays a significant role in various human diseases, including autoimmune, cardiovascular, and neurological diseases and cancers. Understanding the regulatory mechanisms underlying RNA editing can provide valuable insights into disease diagnosis and therapy. To offer a comprehensive collection based on literature curation and integrative analysis, we established and updated the Editome Disease Knowledgebase (EDK).
EDK incorporates a total of 1,097 editome-disease associations based on literature curation, including 709 associations with 628 RNA editing sites, 181 associations with 46 viruses, and 97 associations with 11 enzymes, across 115 human diseases from 321 publications. EDK also newly provides 48 original RNA editing datasets processed through a unified analysis pipeline. Based on integrative analysis, EDK comprises 7,190 differentially edited genes harboring 65,205 A-to-I and 21,037 C-to-U differential editing sites identified in human diseases. Moreover, 28,160 cis-RNA editing QTL associations, 458,187 RNA editing-RNA binding protein associations, and 21 RNA editing-RNA secondary structure associations are newly added. Additionally, EDK is equipped with a series of user-friendly tools that enhance its usability and reliability as a resource for studying RNA editing: editing site identification, disease prediction, gene-disease network construction, cross-disease analysis, and editing site annotation. These tools facilitate the identification and annotation of RNA editing sites, as well as the discovery of RNA editing-disease and disease-disease associations.
To support comprehensive and fine-grained queries, EDK offers an advanced search for RNA editing sites by chromosome, start position, end position, disease, editing type, gene name, editing region, molecular effect, and tissue. Finally, EDK provides a collection of RNA editome-disease associations based on literature curation and integrative analysis.
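
The advanced search described above amounts to combining optional filters over site records. The following is a minimal sketch of that idea, not EDK's implementation: the `EditingSite` class, the `search_sites` function, and the three records are hypothetical and made up for illustration, and only a subset of the listed fields is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditingSite:
    # Hypothetical record layout; field names are assumptions, not EDK's schema.
    chromosome: str
    position: int
    disease: str
    editing_type: str
    gene: str
    region: str
    tissue: str


def search_sites(sites, chromosome=None, start=None, end=None, **exact):
    """Return sites matching every supplied criterion; None means no filter."""
    hits = []
    for s in sites:
        if chromosome is not None and s.chromosome != chromosome:
            continue
        if start is not None and s.position < start:
            continue
        if end is not None and s.position > end:
            continue
        # Remaining keyword arguments are exact-match filters on record fields.
        if any(getattr(s, key) != value for key, value in exact.items()):
            continue
        hits.append(s)
    return hits


# Made-up records for illustration only.
SITES = [
    EditingSite("chr1", 1000, "Atherosclerosis", "A-to-I", "GENE_A", "3'UTR", "Artery"),
    EditingSite("chr1", 5000, "Hepatocellular Carcinoma", "A-to-I", "GENE_B", "CDS", "Liver"),
    EditingSite("chr2", 1200, "Atherosclerosis", "C-to-U", "GENE_C", "Intron", "Artery"),
]

result = search_sites(SITES, chromosome="chr1", start=500, end=2000, editing_type="A-to-I")
print([s.gene for s in result])
```

Leaving a criterion unset simply skips that filter, so the same function covers both a broad query (disease only) and a narrow one (chromosome, range, and editing type together).
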
All 147 diseases associated with RNA editing are incorporated into 18 disease classification systems based on the Disease Ontology (DO), using standardized DO IDs and descriptive terms. We divide these diseases by evidence source: 115 diseases reported in the literature, 45 diseases analyzed in data, and 13 diseases found in both (115 + 45 - 13 = 147). Each disease in EDK is assigned a unique page and associated with a wealth of relevant information, such as edited genes, editing sites, enzymes, viral associations, differentially edited genes, and differential editing sites.
Data type | Description | Value |
---|---|---|
Enzyme | Controlled vocabulary | ADAR1, ADAR2, ADAR3, AID, APOBEC1, APOBEC3 |
Editing type | Controlled vocabulary | Substitution(A-to-I, G-to-A, C-to-U, U-to-C, G-to-C), Insertion, Deletion |
Editing region | Controlled vocabulary | CDS, Exon, Intron, 5'UTR, 3'UTR, Downstream |
Molecular consequence | Controlled vocabulary | Nonsynonymous substitution; mRNA expression change; mRNA stability change; Alternative splicing… |
Editing level | Controlled vocabulary | Increased; Decreased; Present; Absent; Similar |
Relationship | Controlled vocabulary | Causative; Correlated |
Correlation | Controlled vocabulary | Positive; Negative |
Editing effect | The regulatory mechanism by which RNA editing affects disease | e.g., RNA editing in the CTSS 3'UTR can knock down CTSS mRNA expression, and deficiency of cathepsin S reduces atherosclerosis and angiogenesis in vivo. |
Phenotype | The influence of editing on disease | Promote the disease risk; Promote tumor cell proliferation… |
Description | This item describes in detail how the RNA editing event acts on genes and influences disease | e.g., RNA editing in the CTSS 3'UTR knocks down CTSS mRNA expression; RNA editing of CTSS mRNA also regulates the RNA-binding protein HuR; deficiency of cathepsin S reduces atherosclerosis and angiogenesis in vivo. It also reduces the binding efficiency of HuR, which decreases the stability of CTSS mRNA. |
Disease | Disease associated with RNA editing | Atherosclerosis; Alzheimer's Disease; Hepatocellular Carcinoma… |
Target change (miRNA) | The edited miRNA may change its target gene | PCDH9->ADAM12; Rab15->Rab15; CPEB1->CPEB1… |
Regulation target (miRNA) | The changes of target gene or target gene expression regulated by edited miRNA | Retarget ADAM12 replaces original target PCDH9; Increase Rab15 expression; Inhibit ZEB1 expression… |
Target (lncRNA) | lncRNA target | PRUNE2 |
Edited gene (virus) | The edited virus gene | Phosphoprotein gene; Glycoprotein gene… |
Virus protein (virus) | Protein composition of the virus | M,R,D,V,P,V,I,S,R,Y,Y; N,P/C,M,F,H,N,L… |
Regulator (enzyme) | Protein or interferon that regulates the expression or mutation of the enzyme | JAK2; IFN… |
Expression/Aberrance (enzyme) | Mutation or expression change of the enzyme | Mutation; Expression Increase… |
Target (enzyme) | Target molecule whose RNA editing is affected by the aberrant enzyme | AZIN1; STAT1; miR-222; virus HBV… |
Validation strategy | Validation strategy of editome-disease associations | RT-PCR; siADAR enzymes; Immunohistochemistry… |
Cell type | Aberrant RNA editing occurring in various cell types | Podocyte; Endothelial cell; T cell… |
Drug response | Clinical treatment information for diseases | Influenza vaccination; Methotrexate; Doxorubicin… |
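
Fields marked "Controlled vocabulary" in the table accept only a fixed set of terms, which makes curated records straightforward to check. The following is a minimal sketch of such a check, assuming records are plain dictionaries keyed by the data type names above. The `invalid_fields` function and the sample record are hypothetical, and only vocabularies that the table lists in full are included.

```python
# Allowed terms copied from the table above (only the fully listed vocabularies).
CONTROLLED_VOCAB = {
    "Editing region": {"CDS", "Exon", "Intron", "5'UTR", "3'UTR", "Downstream"},
    "Editing level": {"Increased", "Decreased", "Present", "Absent", "Similar"},
    "Relationship": {"Causative", "Correlated"},
    "Correlation": {"Positive", "Negative"},
}


def invalid_fields(record):
    """Return the controlled-vocabulary fields whose value is not an allowed term."""
    return sorted(
        field
        for field, allowed in CONTROLLED_VOCAB.items()
        if field in record and record[field] not in allowed
    )


# Hypothetical curated record; "Causal" is not an allowed Relationship term.
record = {
    "Editing region": "3'UTR",
    "Editing level": "Increased",
    "Relationship": "Causal",
}
print(invalid_fields(record))
```
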
High-throughput sequencing data has greatly enriched the study of RNA editing. EDK2.0 newly provides 48 original RNA editing datasets processed through a unified analysis pipeline. For each dataset, users can obtain integrative analysis results, including differentially edited genes and differential editing sites. For each differentially edited gene, RNA editing and expression levels in disease and healthy conditions are displayed.
Quantitative trait loci (QTL) analysis has been successfully used to identify cis-regulatory mechanisms of quantifiable phenotypes, including RNA editing. Here we collect 3,052 differential RNA editing sites associated with 4,834 cis-RNA editing QTLs (edQTLs) across 1,495 disease traits from 1,194 publications (Li Q. et al., Nature, 2022) and integrate 28,160 associations between differential RNA editing sites and edQTLs in EDK. The disease information for each differential editing site can also be found in the "View Detail" column.
RNA binding proteins (RBPs) play a crucial role in modulating transcription and translation and are often dysregulated in human cancers and other diseases. RNA editing sites located within the RNA binding target sequences of RBPs can potentially affect the interaction between the RBPs and their target genes. Here we collect 71,284 differential RNA editing sites associated with 75 RBPs based on the ENCORI/starBase database (Li JH. et al., Nucleic Acids Research, 2014) and integrate 458,187 associations between differential RNA editing sites and RBPs in EDK. The disease information for each differential editing site can also be found in the "View Detail" column.
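
Associating editing sites with RBPs comes down to asking whether a site's position falls inside a binding region. The following is a minimal sketch of that containment test, not the procedure EDK actually used: the function name, the 1-based inclusive coordinate convention, and the `RBP_A`/`RBP_B` data are assumptions for illustration.

```python
def sites_in_binding_regions(sites, regions):
    """Pair each editing site with every RBP whose binding region contains it.

    sites:   iterable of (chromosome, position)
    regions: iterable of (rbp, chromosome, start, end), assumed 1-based inclusive
    """
    pairs = []
    for chrom, pos in sites:
        for rbp, region_chrom, start, end in regions:
            if chrom == region_chrom and start <= pos <= end:
                pairs.append(((chrom, pos), rbp))
    return pairs


# Made-up coordinates for illustration only.
sites = [("chr1", 150), ("chr1", 900), ("chr2", 150)]
regions = [
    ("RBP_A", "chr1", 100, 200),
    ("RBP_B", "chr1", 140, 160),
    ("RBP_A", "chr2", 500, 600),
]
pairs = sites_in_binding_regions(sites, regions)
print(pairs)
```

Because one site can lie within the binding regions of several RBPs, the number of site-RBP associations can exceed the number of sites. The nested loop is kept for readability; a genome-scale run would normally use an interval index instead.
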
RNA secondary structure can be affected by RNA editing, causing local or global conformational changes. Here we collect 18 differential RNA editing sites associated with changes in RNA secondary structure from 2 publications (Wan Y. et al., Nature, 2014; Lin J. et al., NAR Genomics and Bioinformatics, 2020) and integrate 21 associations between differential RNA editing sites and RNA secondary structure in EDK. The disease information for each differential editing site can also be found in the "View Detail" column.
With the development of high-throughput sequencing technology, more and more RNA editing sites are being detected from RNA-seq data. To help users check editing sites of interest, we developed "Editing Site Identification". Besides EDK, it draws on two further data sources, REDIportal and DARNED, to broaden coverage of RNA editing sites in normal and case conditions. "Editing Site Identification" determines whether user-uploaded sites are known RNA editing sites and offers basic information about them. It also indicates any link between each site and diseases in EDK.
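
Conceptually, identification is a lookup of each uploaded site against the known sites of each source, followed by a lookup of disease links in EDK. The following is a minimal sketch under that assumption: the `identify` function, the `(chromosome, position)` key, and all data are hypothetical, and the real tool may match on more attributes (for example strand or editing type).

```python
# Made-up site sets standing in for the three sources named above.
SOURCES = {
    "EDK": {("chr1", 1000)},
    "REDIportal": {("chr1", 1000), ("chr2", 300)},
    "DARNED": {("chr2", 300)},
}

# Made-up disease links for sites present in EDK.
EDK_DISEASES = {("chr1", 1000): ["Atherosclerosis"]}


def identify(site):
    """Report which sources contain the site and any disease links recorded in EDK."""
    found_in = [name for name, known in SOURCES.items() if site in known]
    return {
        "site": site,
        "is_known_editing_site": bool(found_in),
        "sources": found_in,
        "diseases": EDK_DISEASES.get(site, []),
    }


for uploaded in [("chr1", 1000), ("chr2", 300), ("chr9", 1)]:
    print(identify(uploaded))
```
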
To discover diseases that might be affected by a cluster of editing sites of interest, we developed "Disease Prediction", a tool based on the hypergeometric test. "Disease Prediction" provides a list of predicted associations derived from the differential RNA editing sites (including hit ratio, average risk score, and max risk score), which helps users assess the strength of the relationship between a cluster of editing sites and specific diseases. It also provides associations between these sites and diseases based on literature curation.
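
A hypergeometric test asks how surprising it is that so many of the submitted sites are associated with a given disease, compared with drawing the same number of sites at random. The following is a minimal sketch of the standard upper-tail p-value, not EDK's exact scoring: the function name, the choice of background population, and the example counts are assumptions, and the hit ratio and risk scores are not reproduced because the text does not define them.

```python
from math import comb


def hypergeom_upper_tail(population, associated, drawn, overlap):
    """P(X >= overlap) for a hypergeometric variable X.

    population: total number of background sites (N)
    associated: background sites associated with the disease (K)
    drawn:      number of user-submitted sites (n)
    overlap:    submitted sites associated with the disease (k)
    """
    upper = min(associated, drawn)
    total = comb(population, drawn)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    favourable = sum(
        comb(associated, i) * comb(population - associated, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Made-up counts: 10,000 background sites, 200 linked to the disease,
# 50 submitted sites of which 6 are linked to the disease.
p_value = hypergeom_upper_tail(10_000, 200, 50, 6)
print(f"p = {p_value:.3g}")
```

A small p-value suggests the submitted cluster is enriched for sites linked to that disease. When many diseases are tested at once, a multiple-testing correction would normally be applied on top.
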
Because frequently edited genes have been linked to a range of diseases, we developed "Gene-Disease Network Construction" to explore the associations between one or more edited genes and diseases. The edge width represents the number of differential editing sites on the gene. Blue represents associations from integrative analysis, green represents associations from literature curation, and orange represents associations supported by both.
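
The edge encoding above can be sketched as a small data transformation: count differential editing sites per gene-disease pair for the width, and pick the color from the evidence sources. This is an illustrative sketch only: the `build_edges` function and all data are hypothetical, and the width of 0 for literature-only edges is a placeholder, since the text does not say how such edges are drawn.

```python
EDGE_COLORS = {
    frozenset({"analysis"}): "blue",
    frozenset({"literature"}): "green",
    frozenset({"analysis", "literature"}): "orange",
}


def build_edges(analysis_sites, literature_pairs):
    """Build gene-disease edges with a width and a color.

    analysis_sites:   iterable of (gene, disease, site_id) from integrative analysis
    literature_pairs: set of (gene, disease) from literature curation
    """
    widths = {}
    for gene, disease, _site_id in analysis_sites:
        widths[(gene, disease)] = widths.get((gene, disease), 0) + 1

    edges = {}
    for pair in set(widths) | set(literature_pairs):
        evidence = set()
        if pair in widths:
            evidence.add("analysis")
        if pair in literature_pairs:
            evidence.add("literature")
        edges[pair] = {
            # Placeholder assumption: literature-only edges get width 0.
            "width": widths.get(pair, 0),
            "color": EDGE_COLORS[frozenset(evidence)],
        }
    return edges


# Made-up input for illustration only.
analysis_sites = [
    ("GENE_A", "Disease X", "site1"),
    ("GENE_A", "Disease X", "site2"),
    ("GENE_B", "Disease X", "site3"),
]
literature_pairs = {("GENE_A", "Disease X"), ("GENE_C", "Disease Y")}
edges = build_edges(analysis_sites, literature_pairs)
for pair in sorted(edges):
    print(pair, edges[pair])
```
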
To enable comparative analysis among different diseases, we developed "Cross-Disease Analysis" based on integrative analysis. "Cross-Disease Analysis" reports disease-specific and disease-common differentially edited genes, as well as disease-specific and disease-common differential editing sites, among the selected diseases. It also provides detailed information on these edited genes and editing sites. Up to five diseases can be selected at once.
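
Disease-common and disease-specific sets can be expressed with ordinary set operations. The sketch below assumes that "common" means shared by all selected diseases and "specific" means found in exactly one of them; the text does not spell out these definitions. The `cross_disease` function, the lower bound of two diseases, and the data are likewise hypothetical.

```python
def cross_disease(genes_by_disease):
    """Split genes into those common to all diseases and those specific to one.

    genes_by_disease: dict mapping disease name -> set of differentially edited genes
    """
    # The upper bound of five comes from the text; the lower bound of two is assumed.
    if not 2 <= len(genes_by_disease) <= 5:
        raise ValueError("select between two and five diseases")

    common = set.intersection(*genes_by_disease.values())
    specific = {}
    for disease, genes in genes_by_disease.items():
        others = set().union(
            *(g for other, g in genes_by_disease.items() if other != disease)
        )
        specific[disease] = genes - others
    return common, specific


# Made-up gene sets for illustration only.
genes_by_disease = {
    "Disease X": {"G1", "G2", "G3"},
    "Disease Y": {"G2", "G3", "G4"},
    "Disease Z": {"G3", "G5"},
}
common, specific = cross_disease(genes_by_disease)
print(sorted(common), {d: sorted(g) for d, g in specific.items()})
```

The same function applies unchanged to editing sites if the sets hold site identifiers instead of gene names.
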
To provide a comprehensive annotation of RNA editing sites, we developed "Editing Site Annotation" based on featured annotations, including cis-RNA editing QTL associations, editing-RNA binding protein associations, and editing-RNA secondary structure associations.
The statistics module summarizes what has been updated from EDK V1.0 to V2.0. It also lists all diseases associated with RNA editing in EDK and offers statistics on the associations between diseases and RNA editing sites, viruses, and enzymes. Additionally, users can access the top-ranked differentially edited genes and diseases based on integrative analysis.
An overarching goal of the Editome Disease Knowledgebase (EDK) is to create RNA editing resources for the research community. EDK strives to provide high-quality, accessible RNA editing-disease associations based on literature curation and integrative analysis. Moreover, the original RNA editing profiles are made available via open access whenever possible.