GAAP is a cGOF (core-gene-defined Genome-organization-framework) Assisted Assembly Pipeline. It scaffolds and extends the scaffolds and contigs from a de novo assembly of a single paired-end library, using core gene clusters derived from multiple related reference genomes.
GAAP is composed of two separate yet sequential sections:
1) cGOF_identification, which extracts the sequences, order, and orientation of cGOF segments from the references; this step is run once per species.
2) Scaffolding, which uses segments of cGOF genes as anchors to order the target scaffolds and contigs, uses paired-end read mapping for local scaffolding of the ordered scaffolds/contigs to recover more contigs, and then matches the closest organized reference to construct a pseudogenome; this step is run once per target.
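
The anchor-based ordering idea in step 2 can be illustrated with a small Python sketch. This is not GAAP's actual code: the function name `order_contigs`, the input layout, and the median-rank heuristic are all assumptions made for illustration. It places each contig by the median framework rank of the cGOF genes found on it, and guesses orientation from whether those ranks rise or fall along the contig.

```python
from statistics import median

def order_contigs(cgof_order, anchors):
    """Order and orient contigs by their cGOF anchor genes (illustrative only).

    cgof_order: gene IDs in framework order (hypothetical input format).
    anchors: dict mapping contig name -> list of (position_on_contig, gene_id).
    Returns a list of (contig, orientation) in framework order.
    """
    rank = {gene: i for i, gene in enumerate(cgof_order)}
    placed = []
    for contig, hits in anchors.items():
        # Framework ranks of anchor genes, read left to right along the contig.
        ranks = [rank[g] for _, g in sorted(hits) if g in rank]
        if not ranks:
            continue  # no cGOF anchor: leave this contig unplaced
        # Ranks falling along the contig suggest it is reverse-oriented.
        orientation = "-" if len(ranks) > 1 and ranks[0] > ranks[-1] else "+"
        placed.append((median(ranks), contig, orientation))
    placed.sort()
    return [(contig, orientation) for _, contig, orientation in placed]

# Toy data: gene and contig names are made up.
cgof_order = ["g1", "g2", "g3", "g4", "g5", "g6"]
anchors = {
    "ctgB": [(100, "g4"), (900, "g3")],
    "ctgA": [(50, "g1"), (700, "g2")],
    "ctgC": [(10, "g5"), (500, "g6")],
    "ctgD": [(5, "x9")],  # carries no cGOF gene
}
result = order_contigs(cgof_order, anchors)
print(result)
```

In this toy case `ctgD` stays unplaced because it carries no anchor gene; contigs like this are the kind the paired-end read mapping in step 2 is meant to recover.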
Documentation and usage information can be found here.
The framework and algorithm of GAAP are shown in the figure below.
Please send bug reports or suggestions to the author (yuanlinas@163.com). Thank you.