DogGD
How to use Dog Expression Data?

Dog Expression houses gene expression profiles derived entirely from RNA-Seq data analysis on tissues from Canis. Dog Expression features the integration and visualization of gene expression profiles based on curated and quality-controlled RNA-Seq data encompassing diverse tissues. The complete landscape of gene expression for tissues, breeds, and cell lines is provided, as well as the differential gene expression associated with diseases.

iDog has constructed a pipeline to process RNA-Seq data. The analysis code for transcriptomics and for differentially expressed genes in disease is freely available on GitHub (https://github.com/Br1anChou/idog).

Raw reads were filtered using fastp (v 0.23.2) with the parameters ‘-g -q 5 -u 50 -n 5’. The filtered reads were then aligned to the genome using STAR (v 2.7.3a) with the following parameters: ‘--outFilterMultimapNmax 20 --alignSJoverhangMin 8 --alignSJDBoverhangMin 1 --outFilterMismatchNmax 999 --outFilterMismatchNoverReadLmax 0.04 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 100000 --outSAMunmapped Within --outFilterType BySJout --outSAMattributes NH HI AS NM MD --outSAMtype BAM SortedByCoordinate --quantMode TranscriptomeSAM --sjdbScore 1’. Transcripts were quantified using Kallisto (v0.46.0) with the parameters ‘--fusion --plaintext’, while gene-level quantifications were generated using the RSEM program (v 5.32.1) with the parameters ‘--estimate-rspd --seed 12345 --forward-prob 0.5’. TPM (transcripts per million) is used as the normalized value for transcript and gene abundance to remove the effects of varying sequencing depth and gene length. For samples from the same tissue across different projects, a log2(TPM+1) transformation was applied, and the results were visualized using box plots.
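The TPM normalization and log2(TPM+1) transformation described above can be illustrated with a minimal Python sketch. This is not the iDog pipeline code (RSEM and Kallisto report TPM directly); the function names and the toy counts and lengths below are hypothetical and serve only to show the arithmetic.

```python
import math


def tpm(counts, effective_lengths):
    """Convert per-gene counts to TPM (illustrative helper, not from iDog)."""
    # Step 1: divide each count by the gene length in kilobases,
    # which removes the gene-length effect.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, effective_lengths)]
    # Step 2: rescale so the values in one sample sum to one million,
    # which removes the sequencing-depth effect.
    scale = sum(rpk) / 1e6
    if scale == 0:
        return [0.0 for _ in rpk]
    return [r / scale for r in rpk]


def log2_tpm_plus_one(values):
    """Apply the log2(TPM+1) transformation used before box plotting."""
    return [math.log2(v + 1.0) for v in values]


# Toy data (assumed values): three genes in a single sample.
counts = [100.0, 200.0, 0.0]
lengths = [1000.0, 2000.0, 500.0]

tpm_values = tpm(counts, lengths)
log_values = log2_tpm_plus_one(tpm_values)
print(tpm_values)
print(log_values)
```

Because TPM values in every sample sum to the same total, they are easier to compare across samples than raw counts; the +1 pseudocount keeps genes with zero expression defined on the log scale.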
Differential expression analysis was performed in R using the negative binomial generalized linear model provided by DESeq2 to test for differential expression of expected counts. Genes with counts below 10 were removed, and a variance-stabilizing transformation (VST) was applied to eliminate the dependence of the variance on the mean. After model fitting, the coefficients and their standard errors were estimated for each sample group. The estimated fold change (FC) was used as input for hypothesis testing, and the significance of differential expression was assessed using the Wald test. Multiple testing was corrected using the Benjamini-Hochberg false discovery rate (FDR). Additionally, the transformed count matrix was used to calculate sample-to-sample distances.
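To make the Benjamini-Hochberg correction mentioned above concrete, here is a minimal Python sketch of the standard procedure. It is an illustration only: the actual analysis relies on DESeq2 in R, which may additionally apply independent filtering before adjustment, and the function name and example p-values below are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = pvalues[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Toy Wald-test p-values for four genes (assumed values).
pvals = [0.01, 0.04, 0.03, 0.20]
padj = benjamini_hochberg(pvals)
print(padj)
```

A gene is then called significant when its adjusted p-value falls below the chosen FDR threshold.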
1. The whole landscape of gene expression

Gene expression is provided for all tissues, along with gene specificity.

2. The differential gene expression

We analyze differential gene expression for 19 diseases, including anemia and B-cell lymphoma. A volcano plot, a tissue relevance map and a box plot are provided.

3. Search data

Users can search for genes by several criteria, including name, chromosome region and GO annotation term. A batch search that takes a gene list as input is also available.

The gene search list web page is as follows.