Currently, GenBaseTools provides two separate sequence validation tools: one for common sequences and one for COVID-19 sequences.
Usage: gbt <COMMAND>

Commands:
  seqvalcom    Validate common sequences
  seqvalcovid  Validate COVID19 sequences
  help         Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version

Use the following command to validate common sequences:
gbt seqvalcom common_seq.fsa -o val_out
Use the following command to validate COVID-19 sequences:
gbt seqvalcovid covid_seq.fsa -o val_out
LOG->ERROR: Found 7 errors
LOG->ERROR: Found 0 warnings
val_out.error.txt
val_out.warning.txt
Error Type | Message |
---|---|
Nucleotide | Found invalid char '@' at Line 2, Column 18 |
Nucleotide | Found invalid char '@' at Line 2, Column 19 |
Nucleotide | Found invalid char '@' at Line 2, Column 20 |
Nucleotide | Found invalid '>' at Line 2, Column 34 in sequence(seqid:'>ssss'). This symbol is not allowed in the sequence. Please check whether the new-line character is missing. |
Nucleotide | Found invalid '>' at Line 1164, Column 24 in sequence(seqid:'>Beijing-AAA-2022'). This symbol is not allowed in the sequence. Please check whether the new-line character is missing. |
Defline | Found duplicated sequence id: '>ssss' |
Defline | Found duplicated sequence id: '>asdf' |
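
To make the error categories in the table above more concrete, here is a minimal Python sketch of the kinds of checks they describe: invalid characters in a sequence line, a stray `>` inside a sequence (usually a missing newline before the next defline), and duplicated sequence IDs. This is not how GenBaseTools is implemented. The function name `validate_fasta`, the allowed alphabet (IUPAC nucleotide codes plus `-`), and the rule that the ID is the first whitespace-delimited token of the defline are all assumptions made for illustration.

```python
# Illustrative sketch only; NOT the GenBaseTools implementation.
# Assumption: allowed characters are the IUPAC nucleotide codes plus '-' (gap).
ALLOWED = set("ACGTURYSWKMBDHVN-")


def validate_fasta(text):
    """Return a list of (error_type, message) tuples for a FASTA string."""
    errors = []
    seen_ids = set()
    current_id = None
    for line_no, line in enumerate(text.splitlines(), start=1):
        if line.startswith(">"):
            # Defline. Assumption: the ID is the first whitespace-delimited token.
            current_id = line.split()[0]
            if current_id in seen_ids:
                errors.append(
                    ("Defline", f"Found duplicated sequence id: '{current_id}'")
                )
            seen_ids.add(current_id)
            continue
        # Sequence line: check every character, reporting 1-based positions.
        for col_no, ch in enumerate(line, start=1):
            if ch.isspace():
                continue
            if ch == ">":
                # A '>' in the middle of a sequence usually means a missing newline.
                errors.append(
                    (
                        "Nucleotide",
                        f"Found invalid '>' at Line {line_no}, Column {col_no} "
                        f"in sequence(seqid:'{current_id}')",
                    )
                )
            elif ch.upper() not in ALLOWED:
                errors.append(
                    (
                        "Nucleotide",
                        f"Found invalid char '{ch}' at Line {line_no}, Column {col_no}",
                    )
                )
    return errors


if __name__ == "__main__":
    # Hypothetical input with two '@' characters, a missing newline, and a repeated ID.
    sample = ">ssss\nACGT@@A>ssss\nACGT\n>asdf\nACGT\n>asdf\nAC\n"
    for error_type, message in validate_fasta(sample):
        print(f"{error_type} | {message} |")
```

The sketch reports positions as 1-based line and column numbers, matching the style of the messages in the table. A real validator such as `gbt` may apply additional or different rules.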