Gene Expression Nebulas
A data portal of transcriptomic profiles analyzed by a unified pipeline across multiple species

PRJNA309972: Batch effects and the effective design of single-cell gene expression studies

Source: NCBI / GSE77288
Submission Date: Jan 27 2016
Release Date: Jul 08 2016
Update Date: May 15 2019

Summary: Single-cell RNA sequencing (scRNA-seq) can be used to characterize variation in gene expression levels at high resolution. However, the sources of experimental noise in scRNA-seq are not yet well understood. We investigated the technical variation associated with sample processing using the single-cell Fluidigm C1 platform. To do so, we processed three C1 replicates from three human induced pluripotent stem cell (iPSC) lines. We added unique molecular identifiers (UMIs) to all samples to account for amplification bias. We found that the major source of variation in the gene expression data was driven by genotype, but we also observed substantial variation between the technical replicates. We observed that the conversion of reads to molecules using the UMIs was impacted by both biological and technical variation, indicating that UMI counts are not an unbiased estimator of gene expression levels. Based on our results, we suggest a framework for effective scRNA-seq studies.
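
The reads-to-molecules conversion mentioned in the summary can be illustrated with a minimal sketch. It assumes the simplest possible rule, one molecule per distinct (gene, UMI) pair within a cell; the study's actual pipeline is not described in this record, and real pipelines typically also account for mapping position and UMI sequencing errors. The function name `count_molecules`, the gene labels, and the UMI sequences are hypothetical.

```python
from collections import defaultdict


def count_molecules(reads):
    """Return {gene: (read_count, molecule_count)} for one cell.

    Assumption: a molecule is defined as a distinct (gene, UMI) pair.
    Reads sharing both gene and UMI are treated as amplification copies.
    """
    umis_per_gene = defaultdict(set)
    reads_per_gene = defaultdict(int)
    for gene, umi in reads:
        umis_per_gene[gene].add(umi)   # duplicates collapse inside the set
        reads_per_gene[gene] += 1      # every read still counts as a read
    return {
        gene: (reads_per_gene[gene], len(umis))
        for gene, umis in umis_per_gene.items()
    }


# Hypothetical toy input: (gene, UMI) pairs observed in a single cell.
toy_reads = [
    ("GENE_A", "AACGT"),
    ("GENE_A", "AACGT"),  # amplification copy of the read above
    ("GENE_A", "TTGCA"),
    ("GENE_B", "CCGAT"),
    ("GENE_B", "CCGAT"),
    ("GENE_B", "CCGAT"),
]

counts = count_molecules(toy_reads)
for gene, (n_reads, n_molecules) in sorted(counts.items()):
    print(gene, n_reads, n_molecules)
```

In this toy input both genes have three reads but different molecule counts, which shows how the read-to-molecule ratio can differ between genes or samples.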

Overall Design: We combined the 96 single-cell samples from each C1 chip into their own master mix and sequenced across three lanes of a HiSeq 2500 (3 individuals x 3 replicates x 96 wells x 3 lanes = 2592 files). We prepared two separate library preparations for each bulk sample, combined them all into one master mix, and sequenced across four lanes (3 individuals x 3 replicates x 2 library preparations x 4 lanes = 72 files).
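
The file counts quoted in the design follow directly from multiplying the design factors. A short check, using only the numbers stated above (the dictionary keys are illustrative labels, not portal field names):

```python
from math import prod

# Design factors as stated in the Overall Design text.
single_cell_design = {"individuals": 3, "replicates": 3, "wells": 96, "lanes": 3}
bulk_design = {"individuals": 3, "replicates": 3, "library_preparations": 2, "lanes": 4}

# One file per combination of factor levels.
single_cell_files = prod(single_cell_design.values())
bulk_files = prod(bulk_design.values())

print(single_cell_files)  # 2592
print(bulk_files)         # 72
```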

GEN Datasets:
GEND000202 GEND000344
Protocol
Growth Protocol: Undifferentiated feeder-free iPSCs generated from Yoruba LCLs were grown in E8 medium (Life Tech) (G. Chen et al. 2011) on Matrigel-coated tissue culture plates with daily media feeding at 37 °C with 5% (vol/vol) CO2. For standard maintenance, cells were split every 3-4 days using cell release solution (0.5 mM EDTA and NaCl in PBS) at roughly 80% confluence. For the single-cell suspension, iPSCs were individualized by Accutase Cell Detachment Solution (BD) for 5-7 minutes at 37 °C and washed twice with E8 medium immediately before each experiment. Cell viability and cell counts were then measured by the Automated Cell Counter (Bio-Rad) to generate resuspension densities of 2.5 × 10^5 cells/mL in E8 medium for C1 cell capture.
Treatment Protocol: -
Extract Protocol: Single-cell loading and capture were performed following the Fluidigm manual. A bulk sample, a 40 µl aliquot of ~10,000 cells, was collected in parallel with each C1 chip using the same reaction mixes following the C1 protocol of "Tube Controls with Purified RNA
Library Construction Protocol: For sequencing library preparation, fragmentation and isolation of 5′ fragments were performed according to the UMI protocol (Islam et al. 2014). Instead of using commercially available Tn5 transposase, Tn5 protein stock was freshly purified in house using the IMPACT system (pTXB1, NEB) following the protocol previously described (Picelli et al. 2014). The activity of Tn5 was tested and shown to be comparable with the EZ-Tn5-Transposase (Epicentre). Importantly, all the libraries in this study were generated using the same batch of purified Tn5 protein. For each of the bulk samples, two libraries were generated using two different indices in order to get sufficient material.
Sequencing
Molecule Type: poly(A)+ RNA
Library Layout: SINGLE
Library Strand: Reverse
Platform: ILLUMINA
Instrument Model: Illumina HiSeq 2500
Strand-Specific: Specific
Samples
Data Resource | GEN Sample ID | GEN Dataset ID | Project ID | BioProject ID | Sample ID | Sample Name | BioSample ID | Sample Accession | Experiment Accession | Release Date | Submission Date | Update Date | Species | Race | Ethnicity | Age | Age Unit | Gender | Source Name | Tissue | Cell Type | Cell Subtype | Cell Line | Disease | Disease State | Development Stage | Mutation | Phenotype | Case Detail | Control Detail | Growth Protocol | Treatment Protocol | Extract Protocol | Library Construction Protocol | Molecule Type | Library Layout | Strand-Specific | Library Strand | Spike-In | Strategy | Platform | Instrument Model | Cell Number | Reads Number | Gbases | AvgSpotLen1 | AvgSpotLen2 | Uniq Mapping Rate | Multiple Mapping Rate | Coverage Rate
Publications
Batch effects and the effective design of single-cell gene expression studies.
Scientific Reports. 2017-01-03 [PMID: 28045081]