#### README ####

####################
Download--Fasta
####################

#######################
Fasta DNA dumps
#######################

-----------
FILE NAMES
------------
The files are consistently named following this pattern:
   <species>.fna.gz

<species>: The abbreviated name of a species.
fna: All files in these directories are FASTA database files.
gz: All files are compressed with GNU Zip for storage efficiency.

EXAMPLES
  The genomic sequence of Populus trichocarpa:
    PopTri.fna.gz

####################
Download--Protein
####################

####################
Fasta Peptide dumps
####################
These files hold the protein translations of PPGR genes.

-----------
FILE NAMES
------------
The files are consistently named following this pattern:
   <species>.pep.gz

<species>: The abbreviated name of a species.
pep: All files in these directories are FASTA database files.
gz: All files are compressed with GNU Zip for storage efficiency.

EXAMPLES
  The protein sequence of Populus trichocarpa:
    PopTri.pep.gz

####################
Download--Annotation
####################

####################
GFF FLATFILE dumps
####################
Gene annotation is provided in GFF3 format.

The 'type' of gene features is:
 * "gene" for protein-coding genes
 * "ncRNA_gene" for RNA genes
 * "pseudogene" for pseudogenes

The 'type' of transcript features is:
 * "mRNA" for protein-coding transcripts
 * a specific type of RNA transcript such as "snoRNA" or "lnc_RNA"
 * "pseudogenic_transcript" for pseudogenes

All transcripts are linked to "exon" features.
Protein-coding transcripts are linked to "CDS" features.

Attributes for feature types:
(square brackets indicate data which is not available for all features)
 * gene types:
   * biotype: biotype, e.g. "gene", "pseudogene"
   * gene_id: gene stable ID
   * [Name]: Gene name
   * [description]: Gene description
 * transcript types:
   * Parent: Gene identifier, format ""
   * biotype: biotype, e.g. "mRNA", "transcript"
   * transcript_id: transcript stable ID
 * exon
   * Parent: Transcript identifier, format ""
   * exon_id: exon stable ID
 * CDS
   * Parent: Transcript identifier, format ""
   * CDS_id: CDS stable ID

-----------
FILE NAMES
------------
The files are consistently named following this pattern:
   <species>.gff.gz

<species>: The abbreviated name of a species.
gff: All files in these directories are in GFF3 format.
gz: All files are compressed with GNU Zip for storage efficiency.

EXAMPLES
  The annotation file of Populus trichocarpa:
    PopTri.gff.gz