Browse - GSA - CNCB-NGDC

Experiment information

Accession

CRX045801

Organism

Title

CX10_1

BioProject

BioSample

Platform

Illumina MiSeq

Library

Library name	Construction protocol	Strategy	Source	Selection	Layout
	Raw sequence data were obtained on an Illumina MiSeq (2×250 bp) and processed with an in-house pipeline (http://zhoulab5.rccc.ou.edu:8080/root) built on the Galaxy platform and incorporating various software tools. The pipeline includes the following steps: 1. Split libraries: demultiplexes FASTQ sequence data in which barcodes and sequences are stored in two separate FASTQ files (common on Illumina runs). 2. Trim primers: removes primers from the beginning of sequences. 3. Btrim: to minimize sequencing errors and ensure sequence quality, both forward and reverse reads were trimmed by quality score using Btrim; sequences were trimmed where the average quality score of 5 consecutive bases was less than 20. Sequences shorter than 100 bases or containing undetermined bases ('N') were removed. 4. FLASH: paired-end reads with sufficient overlap (at least 50 bases between forward and reverse reads) were merged into full-length sequences with FLASH v1.2.5. Reads that could not be joined were removed. 5. Trim by sequence length: only sequences longer than the minimum length (247 bp) and no longer than the maximum length (258 bp) were kept. 6. UPARSE: an operational taxonomic unit (OTU) table without singletons was generated by UPARSE at a 97% similarity level.	AMPLICON	METAGENOMIC	PCR	SINGLE
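
The quality and length thresholds in steps 3 and 5 of the construction protocol can be illustrated with a minimal Python sketch. This is not the submitter's pipeline and does not reproduce Btrim's actual algorithm; it assumes a simple left-to-right scan that cuts the read at the first 5-base window whose mean quality falls below 20. All function and constant names are hypothetical, and the strict lower bound in the length filter follows a literal reading of "longer than the minimum length".

```python
from typing import List

WINDOW = 5            # window size stated in the protocol (step 3)
MIN_AVG_QUAL = 20     # mean quality threshold stated in the protocol (step 3)
MIN_READ_LEN = 100    # trimmed reads shorter than this are removed (step 3)
MIN_MERGED_LEN = 247  # merged-sequence lower bound (step 5)
MAX_MERGED_LEN = 258  # merged-sequence upper bound, inclusive (step 5)


def trim_by_window(seq: str, quals: List[int]) -> str:
    """Cut the read at the first window whose mean quality is below the threshold.

    Assumption: a simplified stand-in for Btrim, not its real trimming logic.
    """
    for start in range(len(seq) - WINDOW + 1):
        window = quals[start:start + WINDOW]
        if sum(window) / WINDOW < MIN_AVG_QUAL:
            return seq[:start]
    return seq


def passes_read_filter(seq: str) -> bool:
    """Keep trimmed reads of at least 100 bases with no undetermined base 'N'."""
    return len(seq) >= MIN_READ_LEN and "N" not in seq.upper()


def passes_length_filter(merged: str) -> bool:
    """Keep merged sequences longer than 247 bp and at most 258 bp.

    Assumption: the lower bound is exclusive, read literally from the protocol.
    """
    return MIN_MERGED_LEN < len(merged) <= MAX_MERGED_LEN
```

Under these assumptions, a merged sequence at the planned read length of 253 bp passes the length filter, while one of 259 bp does not.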

Processing

Planned read length (bp): 253

Release date

2019-12-31

Run

Run accession	Run data file information
Run accession	File name	File size (MB)
CRR050734	CRR050734.fastq.gz	1.7

Submitter

Ming Xue (xuemtc@163.com)

Organization

Guangdong Ocean University

Date submitted

2019-03-27

Related experiments

Experiments(95)