The ChickenGTEx Project

Large-scale genome-wide association studies (GWAS) have demonstrated that the majority of the variants related to complex traits and adaptive evolution are located in non-coding genomic regions, which are believed to regulate gene activity/structure and contribute to phenotypic variation. To sustain food and agriculture production while minimizing negative environmental effects, it is indispensable to systematically investigate the functional consequences of genetic variants on the transcriptome of whole-body systems in farm animals under various biological contexts (such as tissues, cell types, developmental periods, sexes, environmental exposures, and genetic backgrounds). The international Farm Animal Genotype-Tissue Expression (FarmGTEx) project was launched to build a comprehensive open-access atlas of regulatory variants in domestic animal species. The project can facilitate our understanding of the genetic regulatory circuitry underlying economically valuable traits and animal welfare, and pave the way for the next generation of the livestock breeding industry, i.e., precision breeding.

The chicken serves not only as the most abundant source of protein-rich food for human nutrition and health but also as a fundamental model organism. Due to its distinguishing characteristics in evolution, physiology, and genetics, it provides novel insights into various aspects of fundamental biology, including embryonic development, virology, immunology, avian-specific features like flight, vertebrate genome evolution, natural and human-imposed selection, and human biomedicine. Chickens have undergone significant genetic and phenotypic divergence as a result of long-term intensive selection. Although the chicken was the first farm animal and the first avian species to be sequenced, the regulatory mechanisms by which non-coding variants affect transcriptomes and phenotypic variation have not yet been systematically characterized.

The Chicken Genotype-Tissue Expression (ChickenGTEx) project, as an essential part of the international Farm Animal GTEx project, aims to build the ChickenGTEx Atlas, which serves as a comprehensive catalogue of regulatory effects of genomic variants on chicken transcriptomic and phenotypic diversities across different biological contexts, spanning diversified genetic backgrounds, sexes, tissues, cell types, and chromatin statuses. The ChickenGTEx Atlas will serve as a valuable reference for fundamental biology (e.g., genetic regulations related to embryonic development, immune response, and tissue/cell type specificity), bird or vertebrate evolution, comparative transcriptomes, applied genetics (e.g., chicken precision breeding, and food security), and human biomedicine.

The Portal Summary

In the current version (pilot) of ChickenGTEx, the following information is provided:

Whole-genome sequencing (WGS) data: Based on 2,869 WGS data from 152 chicken breeds, the ChickenGTEx Atlas provides a high-quality multiple-breed genotype imputation panel, which is a state-of-the-art panel for conducting further genetic studies.
Molecular phenotypes: Based on 7,015 bulk RNA-Seq data of 52 chicken tissues, the ChickenGTEx Atlas profiles five distinct transcriptomic phenotypes, including protein-coding gene expression, exon expression, long non-coding RNA expression, 3’UTR alternative polyadenylation (APA), and alternative splicing variations across 28 tissues with sufficient sample size (ranging from 44 to 741).
Regulatory variants associated with molecular phenotypes (cis-molQTLs): By integrating 2,869 WGS data and 7,015 bulk RNA-Seq data, the ChickenGTEx Atlas has identified approximately 1.5 million genomic variants that are significantly associated with five molecular phenotypes across 28 tissues, including 2,971,719 cis-eQTLs for protein-coding gene (eGenes) expression, 12,593,157 cis-exQTLs for exon (exGenes) expression, 2,394,728 cis-lncQTLs for long non-coding RNA (lncGenes) expression, 970,651 cis-3’aQTLs for 3’UTR APA (3’aGenes), and 3,218,604 cis-sQTLs for alternative splicing variations (sGenes).
Putative causal variants: Based on fine-mapping analysis, the ChickenGTEx Atlas provides putative causal variants that influence five molecular phenotypes across 28 tissues, including 92,617 fine-mapped eQTLs for eGenes, 542,741 fine-mapped exQTLs for exGenes, 60,875 fine-mapped lncQTLs for lncGenes, 31,344 fine-mapped 3’aQTLs for 3’aGenes, and 78,652 fine-mapped sQTLs for sGenes. The resource facilitates the understanding of which genomic variants affect gene expression regulation and how they do so.
Single-cell expression: By analyzing 203,540 single-cell RNA-Seq datasets spanning 10 different tissues (liver, breast muscle, bursa, heart, spleen, lung, retina, amygdala, cerebellum, and mesencephalon), the ChickenGTEx Atlas accommodates 99 cell type clusters and 212,507 single-cell gene expression profiles. The resource profiles the cell type composition of tissues and the gene expression hierarchy across cell types, and enables the deconvolution of bulk RNA-Seq data.
Context-dependent molQTLs: Through investigating the regulatory effects of genetic variants in the context of sexes, cell types, and tissues, the ChickenGTEx Atlas provides 1,138 sex-biased eQTLs, 703 cell type-biased eQTLs, and 59 tissue-biased eQTLs. This resource elucidates the context-specificity of the regulatory effects for genomic variants.
Chromatin status: The ChickenGTEx Atlas includes 257 epigenomes across 23 tissues, consisting of the profiles of 4 histone modifications (H3K4me1, H3K4me3, H3K27ac, and H3K27me3), 1 transcription factor (CTCF), and 2 chromatin accessibility assays (ATAC-seq and DNase-seq), as well as 15 chromatin states inferred from these epigenomes. This resource facilitates the interpretation of the regulatory mechanisms by which genomic variants act on molecular phenotypes.
Regulatory variants relevant to complex traits: By integrating the regulatory variants with GWAS associations for 108 complex traits of economic importance (involving growth, development, egg production, and feed intake and efficiency) in transcriptome-wide association studies (TWAS), the ChickenGTEx Atlas encompasses 96,386 significant gene-trait associations. These associations help interpret the genetic mechanisms behind complex traits, i.e., genetic regulation in causal tissues.

In summary, the ChickenGTEx Atlas represents the most extensive compilation of genetic regulatory effects on chicken transcriptomes across different tissues and cell types to date. The portal can be illustrated by the following diagram:

The Outline of The Analysis

The analysis outline is depicted in the following figure.

The ChickenGTEx GitHub repository

https://github.com/FarmGTEx/ChickenGTEx-Pipeline-v0

Gaps

In the current (pilot) phase of ChickenGTEx, by integrating all the publicly available RNA-Seq and WGS datasets in chickens, we have identified the gaps listed below. If you are interested in filling these gaps, or other gaps that may not be listed here, please feel free to join us. Globally coordinated efforts are required to fully develop this valuable resource.

Generating RNA-Seq and WGS data from under-represented tissues and cell types. Here we grouped tissues into three categories according to their sample size in the current pilot phase: 1) top-represented tissues with a sample size > 500, such as muscle, embryo, spleen, neural crest, and liver; 2) medium-represented tissues with a sample size > 200 and < 500, including blood, brain, leukocytes, skin, and heart; 3) under-represented tissues with a sample size < 200 or those that have not been studied yet.
Consider various biological contexts, such as sexes, developmental periods, genetic backgrounds, environments, and physiological responses, to develop the context-dependent ChickenGTEx.
SV-based ChickenGTEx: Incorporate structural variants (SVs) into ChickenGTEx by building a pan-genome reference panel.
Benchmark and develop bioinformatics pipelines, statistical methods, and computational tools.
Functionally annotate/validate regulatory variants at large scale.
Applications of ChickenGTEx resources: genomic prediction, biotechnology development, population genetics, cross-species comparisons, etc.
Define novel molecular phenotypes (e.g., DNA methylation and RNA binding proteins) using new biotechnologies.
Define novel phenotypes of complex traits (e.g., the metabolites).
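
The three sample-size categories in the first gap above can be sketched as a small function. This is only an illustration: the function name and category labels are hypothetical, and because the text does not say where tissues with exactly 200 or exactly 500 samples fall, the sketch assumes they go into the lower category.

```python
def tissue_category(sample_size: int) -> str:
    """Return the representation category of a tissue from its sample size.

    Thresholds follow the text: > 500 is top-represented, between 200 and 500
    is medium-represented, and < 200 is under-represented.
    ASSUMPTION: sizes of exactly 500 and exactly 200 are not covered by the
    text; here they are placed in the lower category.
    """
    if sample_size < 0:
        raise ValueError("sample size cannot be negative")
    if sample_size > 500:
        return "top-represented"
    if sample_size > 200:
        return "medium-represented"
    return "under-represented"


if __name__ == "__main__":
    # Illustrative sample sizes only, not actual per-tissue counts.
    for n in (44, 300, 741):
        print(n, tissue_category(n))
```

A tissue that has not been studied at all has a sample size of 0 and therefore also falls into the under-represented category, matching the third group in the text.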

Database Usage

1. Home Page

You can fill in the entry of interest in the input box to search and view the corresponding data and information that you need.
For example, you can click on "gene" and search for a gene by its Ensembl ID or name, such as ENSGALG00000000003 or PANX2, to look up its location, length, expression across different tissues, and QTL information. By clicking on "Tissue" or "variant", you can search for tissue- and variant-level information in the same way. You can also click directly on any of the four sections below, "WGS Data", "Gene expression", "Publications", and "News and Events", to jump to the overview page of the corresponding section.
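
If you prepare gene queries in a script before entering them, the two accepted forms can be told apart with a simple pattern check. The sketch below is an assumption-laden illustration, not the portal's own search logic: the function name and return labels are hypothetical, and it relies only on the fact that chicken Ensembl gene IDs consist of the prefix ENSGALG followed by 11 digits.

```python
import re

# Chicken Ensembl gene IDs: the prefix "ENSGALG" followed by 11 digits.
ENSEMBL_CHICKEN_GENE = re.compile(r"ENSGALG\d{11}")


def classify_gene_query(query: str) -> str:
    """Classify a search string as an Ensembl gene ID or a gene name.

    Hypothetical helper for illustration; anything that is not a chicken
    Ensembl gene ID is treated as a gene name such as PANX2.
    """
    cleaned = query.strip()
    if not cleaned:
        raise ValueError("empty query")
    if ENSEMBL_CHICKEN_GENE.fullmatch(cleaned.upper()):
        return "ensembl_gene_id"
    return "gene_name"


if __name__ == "__main__":
    for q in ("ENSGALG00000000003", "PANX2"):
        print(q, "->", classify_gene_query(q))
```

Using `fullmatch` rather than `search` means that a string which merely contains an Ensembl-like fragment is still treated as a gene name.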

2. Data

2.1 Data Overview

In the "Data Overview" section, you can view a comprehensive summary of all available data. By selecting options on the left, such as "Molecular Phenotype", you'll be redirected to its respective overview. Hovering over each bar on the right-side bar chart, for example, the "Molecular Phenotype" for the Spleen tissue, will display detailed information specific to the Molecular Phenotype of the Spleen. Below, you have options to choose how many entries to display per page: 10, 25, 50, or 100.

2.2 Data Resource

This showcases the projects from which the website's data is sourced.

2.3 Data Download

Here, you can select and download the datasets of your interest.

3. Molecular Phenotype Page

Hover over the "Molecular Phenotype" option at the top of the Home page to access various categories. By clicking on the respective categories - Protein Coding Gene (PCG) Expression, Exon Expression, lncRNA Expression, 3'UTR Alternative Polyadenylation, or Alternative Splicing, you can view the relevant information.

For instance, within the "Protein Coding Gene (PCG) Expression" section, you can choose how many entries to view per page: 10, 25, 50, or 100. If you are looking for specific information, enter either an Ensembl ID or a RefSeq ID in the search bar. For more detail on an individual gene, click on the links under the "Tissue Expression" and "Gene Specificity" columns. These links show the corresponding Bulk Tissue Gene Expression and Specificity parameter data.

4. QTL Page

Navigate to the specific QTL types by clicking on "QTL" from the home page, where you can select from Cis-molQTL, Fine-mapping molQTL, or Context-dependent molQTL.

Cis-molQTL: For instance, by selecting "eQTLs" under Cis-molQTL, you'll be directed to a new page. On the left-hand side, you'll find a "Filter" column where you can refine your view by "Gene" or "Tissue". Additionally, in the table's second row, you have search bars under the columns "Gene ID", "Accession", "Variant ID", and "Allele Freq" for specific information filtering.
Fine-mapping molQTL: Using "eQTLs" as an example, after selecting "eQTLs" under Fine-mapping molQTL, you'll see a similar layout. The left side offers a "Filter" column to refine by "Gene" or "Tissue". Within the table, specific columns come with search bars allowing for column-specific data filtering.
Context-dependent Cis-eQTL: You can tailor the table's displayed information by selecting one of the three options on the left: "Sex-biased eQTLs", "Cell type-biased eQTLs", or "Tissue-biased eQTLs".

5. Epigenetic Data Page

By clicking on "Epigenetic Data" on the home page, you'll be taken to a new page. Refine the displayed data using the search bar located at the top right of the table. By clicking on the blue "view" link within the table, you'll be presented with the IGV view for that specific data entry.

6. Single Cell Page

Click on the "Single Cell" tab at the top of the Home page to navigate to the Single Cell Page. This page showcases an annotated map of Cell Clusters. Directly below the tissue name, you'll find options to zoom into the map, reset the view, or download the visual. To dive deeper, simply click on the blue tissue name. This will lead you to a specific page where you can search for your gene of interest using the provided search bar. Results will display both dot plots and violin plots for the gene expression in the selected tissue. Hovering over the charts will reveal detailed gene expression data.

7. TWAS Page

Navigate to the "Twas" page by clicking on the "Twas" tab located at the top of the Home page. To search for any of the following information types across all data, utilize the search bar located at the top-right of the table: molQTL, Tissue, Trait Category, Gene ID, RefSeq ID, and Gene Name. Additionally, you can filter specific column data by entering search terms in the search bars provided in the second row of each column in the data table.

8. IGV Page

On the IGV Page, click "Visualization" to personalize the display of project details in the table. On the left side of the page, you have options to choose from Core GTEx Data, Enhancer Epigenetic, or Chromosome State. After making your selections, further refine by picking a tissue type and then click on the "Apply change" button to display the IGV view tailored to your chosen details. You also have the capability to select specific chromosomes and their positional information to exhibit the corresponding view. Above the view, you can opt to display the center line, cursor guide, and track label. Furthermore, there's an option available for downloading the current view.

License

The ChickenGTEx Atlas and all contents contained in the portal are made available under the following license: Prior to downloading and using data from the ChickenGTEx-Portal servers, users are required to accept the following Terms and Conditions of Use:

The data of ChickenGTEx Atlas are available free for public use at this portal (no charges, usage fees or royalties). For any commercial use, please contact us for commercial licensing terms.
The data of ChickenGTEx Atlas are produced with a reasonable standard of care, but the FarmGTEx-ChickenGTEx Consortium makes no warranties express or implied, including no warranty of merchantability or fitness for particular purpose, regarding the accuracy or completeness of the data.
Users agree to hold the ChickenGTEx Project and the FarmGTEx-ChickenGTEx Consortium harmless from any liability resulting from errors in the data.
The FarmGTEx-ChickenGTEx Consortium disclaims any liability for any consequences due to use, misuse, or interpretation of information contained or not contained in the data.
Cite the publications of the ChickenGTEx Atlas and acknowledge the ChickenGTEx-Portal as the source of the data where appropriate in the manuscript by including the phrase “The data used for the analyses described in this manuscript were obtained from the ChickenGTEx-Portal (MM/DD/YY, http://chickengtex.farmgtex.org/).” in a clear and conspicuous manner.
The FarmGTEx-ChickenGTEx Consortium does not provide legal advice regarding copyright, fair use, or other aspects of intellectual property rights.
The FarmGTEx-ChickenGTEx Consortium reserves the right to the final interpretation of the above terms and the ChickenGTEx resources. The FarmGTEx-ChickenGTEx Consortium will take reasonable steps to inform users of news and events of the ChickenGTEx Project via the portal or email (news@farmgtex.org, click here to subscribe).
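
For authors who assemble manuscript text programmatically, the acknowledgement phrase required by the terms above can be generated as in the following sketch. The function name is hypothetical, and the sketch assumes that MM/DD/YY in the phrase stands for the date on which the data were obtained from the portal.

```python
from datetime import date

PORTAL_URL = "http://chickengtex.farmgtex.org/"


def acknowledgement(access_date: date) -> str:
    """Build the acknowledgement sentence requested in the terms of use.

    ASSUMPTION: MM/DD/YY is the date the data were obtained from the portal.
    """
    stamp = access_date.strftime("%m/%d/%y")  # e.g. 06/27/23
    return (
        "The data used for the analyses described in this manuscript "
        f"were obtained from the ChickenGTEx-Portal ({stamp}, {PORTAL_URL})."
    )


if __name__ == "__main__":
    print(acknowledgement(date.today()))
```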

Citation

Dailu Guan†, Zhonghao Bai†, Xiaoning Zhu†, Conghao Zhong†, Yali Hou†, The ChickenGTEx Consortium, Fangren Lan, Shuqi Diao, Yuelin Yao, Bingru Zhao, Di Zhu, Xiaochang Li, Zhangyuan Pan, Yahui Gao, Yuzhe Wang, Dong Zou, Ruizhen Wang, Tianyi Xu, Congjiao Sun, Hongwei Yin, Jinyan Teng, Zhiting Xu, Qing Lin, Shourong Shi, Dan Shao, Fabien Degalez, Sandrine Lagarrigue, Ying Wang, Mingshan Wang, Minsheng Peng, Dominique Rocha, Mathieu Charles, Jacqueline Smith, Kellie Watson, Albert Johannes Buitenhuis, Goutam Sahana, Mogens Sandø Lund, Wesley Warren, Laurent Frantz, Greger Larson, Susan J. Lamont, Wei Si, Xin Zhao, Bingjie Li, Haihan Zhang, Chenglong Luo, Dingming Shu, Hao Qu, Wei Luo, Zhenhui Li, Qinghua Nie, Xiquan Zhang, Zhe Zhang, Zhang Zhang, George E. Liu, Hans Cheng, Ning Yang*, Xiaoxiang Hu*, Huaijun Zhou*, Lingzhao Fang*. The ChickenGTEx pilot analysis: a reference of regulatory variants across 28 chicken tissues. bioRxiv. 2023.
https://www.biorxiv.org/content/10.1101/2023.06.27.546670v1

Copyright © the FarmGTEx-ChickenGTEx Consortium.
To provide feedback or ask a question, contact us at chickengtex(AT)farmgtex.org