After you upload sequence mutations and the corresponding sample metadata (see the help block for the file formats), a haplotype network is constructed with McAN (minimum-cost arborescence-based haplotype network). Community lineages are then determined from the constructed network using Newman’s method.
The tool currently supports network construction for SARS-CoV-2 and other species. For SARS-CoV-2, you can also select a specific dataset from the Resource for Coronavirus 2019 (RCoV19) as the background data.
We recommend supplying mutation data in genovar format, a tab-delimited text file format. Each row represents one sample and contains the sample name, the accession ID, and the mutation details, with individual mutations separated by ‘;’.
Name | Accession | Mutations |
---|---|---|
hCoV-19/human/USA/TX-DSHS-000508/2020 | EPI_ISL_2264424 | 490(SNP:T->A);3177(SNP:C->T);6040(SNP:C->T);6843(SNP:C->T);8782(SNP:C->T);8950(SNP:C->T);12478(SNP:G->A);18736(SNP:T->C);24034(SNP:C->T);26729(SNP:T->C);26801(SNP:C->T);28077(SNP:G->C);28144(SNP:T->C);28896(SNP:C->G);29451(SNP:C->T);29700(SNP:A->G) |
hCoV-19/human/USA/TX-DSHS-000511/2020 | EPI_ISL_2264432 | 3003(SNP:A->T);8782(SNP:C->T);10811(SNP:C->T);10813(SNP:T->A);17747(SNP:C->T);17858(SNP:A->G);18060(SNP:C->T);24694(SNP:A->T);28144(SNP:T->C) |
hCoV-19/human/USA/TX-DSHS-000513/2020 | EPI_ISL_2264434 | 3003(SNP:A->T);8782(SNP:C->T);10811(SNP:C->T);10813(SNP:T->A);17747(SNP:C->T);17858(SNP:A->G);18060(SNP:C->T);24694(SNP:A->T);28144(SNP:T->C) |
hCoV-19/human/USA/TX-DSHS-000515/2020 | EPI_ISL_2264437 | 241(SNP:C->T);3037(SNP:C->T);8664(SNP:C->T);14408(SNP:C->T);15026(SNP:C->T);15264(SNP:T->C);23403(SNP:A->G);27575(SNP:C->T) |
hCoV-19/human/USA/TX-DSHS-000502/2020 | EPI_ISL_2264410 | 241(SNP:C->T);1059(SNP:C->T);3037(SNP:C->T);3068(SNP:G->A);9169(SNP:C->T);14408(SNP:C->T);23403(SNP:A->G);25563(SNP:G->T) |
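To check a genovar file before uploading it, a row can be split into its three tab-separated columns and each mutation entry parsed. The following is a minimal sketch, not the tool's own parser: the names `Mutation` and `parse_genovar_line` are hypothetical, and the entry pattern `position(TYPE:REF->ALT)` is inferred from the example rows above, which show only SNP entries.

```python
import re
from typing import List, NamedTuple, Tuple


class Mutation(NamedTuple):
    position: int  # position as written in the entry
    kind: str      # mutation type label, e.g. "SNP"
    ref: str       # reference allele
    alt: str       # alternate allele


# Assumed entry shape, based on the examples: "490(SNP:T->A)".
_ENTRY_RE = re.compile(r"^(\d+)\(([^:()]+):(.*?)->(.*)\)$")


def parse_genovar_line(line: str) -> Tuple[str, str, List[Mutation]]:
    """Split one tab-delimited genovar row into (name, accession, mutations)."""
    name, accession, mutation_field = line.rstrip("\n").split("\t")
    mutations = []
    for entry in mutation_field.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing ';'
        match = _ENTRY_RE.match(entry)
        if match is None:
            raise ValueError(f"Unrecognized mutation entry: {entry!r}")
        position, kind, ref, alt = match.groups()
        mutations.append(Mutation(int(position), kind, ref, alt))
    return name, accession, mutations
```

Raising on an unrecognized entry, rather than skipping it, makes formatting problems visible before the file is submitted.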
VCF is a text file format. It contains meta-information lines, a header line, and then data lines, each containing information about one position in the genome. The format can also carry genotype information on samples for each position.
##fileformat=VCFv4.1
#CHROM | POS | ID | REF | ALT | QUAL | FILTER | INFO | FORMAT | SampleA | SampleB | SampleC |
---|---|---|---|---|---|---|---|---|---|---|---|
RCoV2 | 490 | . | T | A | . | . | . | GT | AA | TT | TT |
RCoV2 | 3003 | . | A | T | . | . | . | GT | AA | TT | TT |
RCoV2 | 8782 | . | C | T | . | . | . | GT | TT | TT | TT |
RCoV2 | 10811 | . | C | T | . | . | . | GT | CC | TT | TT |
RCoV2 | 17747 | . | C | T | . | . | . | GT | CC | TT | TT |
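The VCF layout described above (meta-information lines starting with `##`, one header line starting with `#CHROM`, then tab-delimited data lines) can be read with a few lines of Python. This is a rough sketch for inspecting a file before upload; `read_vcf` is a hypothetical helper, and it keeps the per-sample genotype strings exactly as written without interpreting them.

```python
from typing import Dict, Iterable, List, Tuple


def read_vcf(lines: Iterable[str]) -> Tuple[List[str], List[str], List[Dict]]:
    """Parse VCF text lines into (meta_lines, sample_names, records)."""
    meta: List[str] = []
    samples: List[str] = []
    records: List[Dict] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            # Meta-information line, e.g. "##fileformat=VCFv4.1".
            meta.append(line[2:])
        elif line.startswith("#CHROM"):
            # Header line: eight fixed columns, FORMAT, then sample names.
            samples = line.lstrip("#").split("\t")[9:]
        else:
            fields = line.split("\t")
            records.append({
                "chrom": fields[0],
                "pos": int(fields[1]),
                "ref": fields[3],
                "alt": fields[4],
                # Genotype strings are stored uninterpreted, keyed by sample.
                "genotypes": dict(zip(samples, fields[9:])),
            })
    return meta, samples, records
```

Passing an open file object works as well as a list of strings, since the function only iterates over lines.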
Sample metadata must be provided as a tab-delimited text file, for example:
Accession | Sampling Date | Sampling Location |
---|---|---|
EPI_ISL_2249479 | 2020-04-13 | United States |
EPI_ISL_2254726 | 2020-04-24 | Slovakia |
EPI_ISL_2270090 | 2020-05-11 | Switzerland |
EPI_ISL_2274027 | 2020-05-14 | Haiti |
EPI_ISL_2278820 | 2020-04-27 | Sweden |
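Because the metadata is linked to the mutation data through the accession ID, it can help to confirm that every sample has a metadata row before uploading. The sketch below assumes the file has a header row with the column names shown in the example (`Accession`, `Sampling Date`, `Sampling Location`) and dates in `YYYY-MM-DD` form; `read_metadata` and `missing_metadata` are hypothetical helper names.

```python
import csv
from datetime import date
from typing import Dict, Iterable, List


def read_metadata(lines: Iterable[str]) -> Dict[str, dict]:
    """Read tab-delimited metadata lines into a dict keyed by accession."""
    reader = csv.DictReader(lines, delimiter="\t")
    metadata = {}
    for row in reader:
        accession = row["Accession"].strip()
        metadata[accession] = {
            # fromisoformat raises ValueError on a malformed date.
            "date": date.fromisoformat(row["Sampling Date"].strip()),
            "location": row["Sampling Location"].strip(),
        }
    return metadata


def missing_metadata(accessions: Iterable[str], metadata: Dict[str, dict]) -> List[str]:
    """Return, in sorted order, the accessions that have no metadata row."""
    return sorted(set(accessions) - set(metadata))
```

An empty list from `missing_metadata` means every accession in the mutation file has a matching metadata entry.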