GAEP: a comprehensive genome assembly evaluating pipeline.

Yong Zhang, Hong-Wei Lu, Jue Ruan
Author Information
  1. Yong Zhang: Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China.
  2. Hong-Wei Lu: State Key Laboratory of Rice Biology and Breeding, China National Rice Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou, Zhejiang 311401, China.
  3. Jue Ruan: Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China. Electronic address: ruanjue@caas.cn.

Abstract

With the rapid development of sequencing technologies, especially the maturation of third-generation sequencing, the number and quality of published genome assemblies have increased significantly. The emergence of these high-quality genomes has raised the requirements for genome evaluation. Although numerous computational methods have been developed to evaluate assembly quality from various perspectives, choosing among them is often arbitrary, which makes it inconvenient to compare assembly quality fairly. To address this issue, we have developed the Genome Assembly Evaluating Pipeline (GAEP), a comprehensive pipeline that assesses genome quality from multiple perspectives, including continuity, completeness, and correctness. Additionally, GAEP includes new functions for detecting misassemblies and evaluating assembly redundancy, which perform well in our testing. GAEP is publicly available at https://github.com/zy-optimistic/GAEP under the GPL-3.0 license. With GAEP, users can quickly obtain accurate and reliable evaluation results, facilitating the comparison and selection of high-quality genome assemblies.

Keywords

MeSH Term

Software
High-Throughput Nucleotide Sequencing
Genome
Sequence Analysis, DNA
Genomics
Computational Biology
Humans
