CandiHap: a haplotype analysis toolkit for natural variation studies

For Linux systems (command line)

Getting started

The CandiHap command-line analysis consists of three main steps. The test data files can be freely downloaded at
Put the test.gff, test.vcf, and genome.fa files in the same directory, then run:

     # 1. Annotate the VCF with ANNOVAR:
     gffread  test.gff   -T -o test.gtf
     gtfToGenePred -genePredExt test.gtf  si_refGene.txt --format refGene --seqfile  genome.fa  si_refGene.txt --outfile si_refGeneMrna.fa  test.vcf  ./  --vcfinput --outfile  test  --buildver  si  --protocol refGene --operation g -remove

     # 2. Convert the ANNOVAR txt output to hapmap format:
     perl  test.vcf  test.si_multianno.txt
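
Annotation in step 1 depends on the chromosome names in the VCF matching those in the GFF and in the genome FASTA (for example `9` versus `chr9`). The following is a minimal sketch of a consistency check, not part of CandiHap or ANNOVAR; the function names `seq_names`, `fasta_names`, and `check_names` are hypothetical helpers introduced here for illustration.

```shell
# Hypothetical helper: list sequence names used in a VCF or GFF file
# (column 1 of every tab-separated, non-comment line).
seq_names() {
    grep -v '^#' "$1" | awk -F'\t' 'NF > 1 { print $1 }' | sort -u
}

# Hypothetical helper: list sequence names in a FASTA file
# (first word of each header line, without the leading '>').
fasta_names() {
    grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }' | sort -u
}

# Report every VCF chromosome name that is absent from the GFF or the FASTA.
# Prints nothing when all names are found.
check_names() {
    # $1 = VCF, $2 = GFF, $3 = FASTA
    for name in $(seq_names "$1"); do
        seq_names "$2"   | grep -qxF -- "$name" || echo "$name: not in $2"
        fasta_names "$3" | grep -qxF -- "$name" || echo "$name: not in $3"
    done
}
```

With the test files in the current directory, `check_names test.vcf test.gff genome.fa` would be expected to print nothing if the names agree.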

Put the Phenotype.txt, Your.hmp, and genome.gff files in the same directory, then run:

     # 3. Run CandiHaplotypes:
     perl  -m Your.hmp  -f genome.gff  -p Phenotype.txt  -g Your_gene_ID
e.g. perl  -m haplotypes.hmp  -f test.gff  -p Phenotype.txt  -g Si9g49990

To analyze all genes in the LD region around a position, run:

     perl  -f genome.gff  -m ann.hmp  -p Phenotype.txt   -l LDkb  -c Chr:position
e.g. perl  -f test.gff  -m haplotypes.hmp  -p Phenotype.txt  -l 50kb  -c 9:54583294
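
To preview which genes such a run might cover, the `-c Chr:position` and `-l LDkb` values can be turned into a coordinate window and compared against the GFF. This is a rough sketch only: it assumes the LD span is applied symmetrically on both sides of the position, which may differ from how CandiHap handles it internally, and the functions `ld_window` and `genes_in_window` are hypothetical names, not CandiHap commands.

```shell
# Hypothetical helper: turn "Chr:position" plus an LD span such as "50kb"
# into "chr start end" (ASSUMES a symmetric window, clamped at 1).
ld_window() {
    chr="${1%%:*}"          # text before the colon
    pos="${1##*:}"          # text after the colon
    kb="${2%kb}"            # strip the "kb" suffix
    span=$((kb * 1000))
    start=$((pos - span))
    [ "$start" -lt 1 ] && start=1
    end=$((pos + span))
    echo "$chr $start $end"
}

# Hypothetical helper: print the attribute column of every "gene" feature
# in a GFF3 file that overlaps the window.
genes_in_window() {
    # $1 = GFF file, $2 = Chr:position, $3 = LD span
    win=$(ld_window "$2" "$3")
    # Re-set positional parameters to: file, chr, start, end
    set -- "$1" $win
    awk -F'\t' -v chr="$2" -v s="$3" -v e="$4" \
        '$1 == chr && $3 == "gene" && $5 >= s && $4 <= e { print $9 }' "$1"
}
```

For the example above, `ld_window 9:54583294 50kb` prints `9 54533294 54633294`, and `genes_in_window test.gff 9:54583294 50kb` lists the gene records overlapping that interval.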

