IC4R009-RNA-Seq-2015-25713814

Project Title

De Novo Assembly and Characterization of Oryza officinalis Leaf Transcriptome by Using RNA-Seq

The Background of This Project

Figure 1. Length distribution of the assemblies
  • O. officinalis is a perennial wild rice distributed in South and Southeast Asia, South China, Papua New Guinea, and Australia. Further investigation of the genetic basis of O. officinalis will not only provide more opportunities to discover valuable genes that may improve the quality of cultivated rice, but will also serve as a bridge to extend the study to other allopolyploids or diploids that contain CC genomes. In this study, the researchers used next-generation sequencing technology to investigate the leaf transcriptome of the wild rice O. officinalis, which has the CC genome type.

Plant Culture & Treatment

  • Seeds for three biological replicates of O. officinalis (Acc. number 104973) from the International Rice Research Institute (IRRI, Manila, Philippines) were dehulled and heated at 50 ℃ for five days to break dormancy, and were subsequently immersed in warm water at 30 ℃ for three days to germinate.
  • The germinated seeds were planted in three small pots at 24 ℃ for two weeks, and the seedlings were then transplanted into three large pots (30 × 30 cm) with normal soil in the greenhouse of Qufu Normal University (light period: 12 h; day/night temperature: 28 ℃/22 ℃; moisture: 40%). Young flag leaves from each biological replicate were harvested 60 days after germination and were mixed together in equal quantities for RNA extraction.
Figure 6. Biological functions of the top five highly expressed unigenes in O. officinalis leaf transcriptome.

Illumina RNA-Sequencing

  • Total RNA was extracted from leaf tissues using the Trizol method (Invitrogen). RNA concentration and quality were assessed by analyzing 1 μL of the RNA sample on an Agilent Technology 2100 Bioanalyzer. The RNA library was constructed using a TruSeq RNA Sample Preparation Kit (RS-122-2001, Illumina) according to the manufacturer’s protocols.
  • Cluster formation and sequencing were performed on the Illumina HiSeq2000 platform following the manufacturer's cBot and sequencing protocols.

Research Findings

  • In this study, approximately 23 million high-quality reads with nucleotide sequences totaling 2,131,363,516 bp were obtained, and each read was 100 bp in length. Because no O. officinalis reference genome was available, all of the high-quality reads were assembled de novo using Trinity software [16]. The nonredundant assembly resulted in 68,132 unigenes with a total length of 83,266,858 bp and an average length of 1,222 bp. Individual assembly lengths ranged from 201 bp to 13,067 bp. The largest share of the assemblies (36%) were 200–500 bp, and 20% of the assemblies were longer than 2,000 bp. The remaining assemblies fell into the 500–1,000 bp, 1,000–1,500 bp, and 1,500–2,000 bp ranges and represented 19%, 14%, and 11% of the assemblies, respectively.
  • Functional annotation of the unigenes was performed using BLAST comparisons against several databases. Of the 68,132 unigenes, significant hits at the nucleotide level were obtained for 65,303 (96%) unigenes using the annotated sequences deposited in the nonredundant nucleotide database. When comparing protein-coding sequences only, 77% of the unigenes had significant BLAST results against the nonredundant protein database, and 48% met a slightly stricter standard against the SWISS-PROT protein database.
  • Of the 68,132 unigenes, 23,568 were assigned to at least one of the three Gene Ontology (GO) domains (18,364 for molecular function, 30,646 for biological process, and 33,608 for cellular component) and 34 GO subcategories (Figure 3). In addition, a relatively high proportion of unigenes were grouped into the following GO annotations: "binding" (14,485 unigenes), "catalytic activity" (12,396 unigenes), "metabolic process" (12,287 unigenes), "cellular process" (10,746 unigenes), "cell" (6,595 unigenes), and "cell part" (6,595 unigenes).
  • Approximately 70% (47,564) of the unigenes were matched to homologs in the KEGG database, and 22% (10,476) of those could be mapped to at least one biological pathway. These pathways were mainly associated with five categories: “metabolism” (9,420 unigenes), “genetic information processing” (6,884 unigenes), “environmental information processing” (1,227 unigenes), “cellular processes” (1,300 unigenes), and “human diseases” (1,166 unigenes). The specific pathways, including “spliceosome,” “purine metabolism,” “ribosome,” “RNA transport,” “starch and sucrose metabolism,”
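
The summary figures in the findings above can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the original study: the constant and function names are illustrative assumptions, and only the counts quoted in this page are used as input.

```python
# Counts quoted in the Research Findings on this page.
TOTAL_UNIGENES = 68_132          # nonredundant unigenes
TOTAL_ASSEMBLY_BP = 83_266_858   # summed length of all unigenes
NT_HITS = 65_303                 # unigenes with nucleotide-level BLAST hits
KEGG_MATCHED = 47_564            # unigenes with KEGG homologs
KEGG_PATHWAY_MAPPED = 10_476     # KEGG-matched unigenes mapped to a pathway

# Shares (in percent) of the five assembly length bins reported above.
LENGTH_BIN_SHARES = [36, 19, 14, 11, 20]


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def summarize():
    """Recompute the derived figures from the raw counts."""
    return {
        "mean_unigene_length_bp": TOTAL_ASSEMBLY_BP / TOTAL_UNIGENES,
        "nt_hit_percent": percent(NT_HITS, TOTAL_UNIGENES),
        "kegg_matched_percent": percent(KEGG_MATCHED, TOTAL_UNIGENES),
        "kegg_pathway_percent_of_matched": percent(KEGG_PATHWAY_MAPPED, KEGG_MATCHED),
        "length_bin_share_total": sum(LENGTH_BIN_SHARES),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.1f}")
```

Rounded to whole numbers, the recomputed values agree with the average length and percentages quoted above. Note that the pathway percentage is taken relative to the KEGG-matched unigenes, not to all unigenes.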

Labs working on this Project

  • School of Life Science, Qufu Normal University, Qufu, Shandong 273165, China.