Haplotype block partitioning and tagSNP selection under the perfect phylogeny model

Document Type: Research Paper


1 Faculty of Mathematical Sciences, Shahid Beheshti University, G.C., P.O. Box 198396-3113, Tehran, I.R. Iran.

2 School of Mathematics, Statistics and Computer Sciences, Center of Excellence in Biomathematics, College of Science, University of Tehran, P.O. Box 6455-14155, Tehran, I.R. Iran.

3 School of Computer Science, Institute for Research in Fundamental Sciences, P.O. Box 19395-5746, Tehran, I.R. Iran.


Single Nucleotide Polymorphisms (SNPs) are the most usual form of polymorphism in human genome.
Analyses of genetic variations have revealed that individual genomes share common SNP-haplotypes. The
particular pattern of these common variations forms a block-like structure on human genome. In this work,
we develop a new method based on the Perfect Phylogeny Model to identify haplotype blocks using
samples of individual genomes. We introduce a rigorous definition of the quality of the partitioning of haplotypes into blocks and devise a greedy algorithm for finding the proper partitioning in case of perfect and
semi-perfect phylogeny. It is shown that the minimum number of tagSNPs in a haplotype block of Perfect
Phylogeny can be obtained by a polynomial time algorithm. We compare the performance of our algorithm
on haplotype data of human chromosome 21 with other previously developed methods through simulations.
The results demonstrate that our algorithm outperforms the conventional implementation of the Four
Gamete Test approach which is the only available method for haplotype block partitioning based on
Perfect Phylogeny.