Signal processing approaches as novel tools for the clustering of N-acetyl-β-D-glucosaminidases


1 Department of Plant Protection, Faculty of Agriculture, Ferdowsi University of Mashhad, P.O. Box 1163, Mashhad, I.R. Iran.

2 Faculty of Agriculture, Shahrood University of Technology, P.O. Box 316, Shahrood, I.R. Iran.

3 School of Mining, Petroleum and Geophysics Engineering, Shahrood University of Technology, P.O. Box 316, Shahrood, I.R. Iran.


The clustering of proteins, and of enzymes in particular, is one of the most popular topics in bioinformatics. An increasing number of chitinase genes from different organisms, together with their sequences, have been
identified. So far, various mathematical algorithms have been used for the clustering of chitinase genes, but
most of them seem to be confusing and sometimes insufficient. In the present study, as a first step, the different
amino acids occurring in a panoply of chitinases, used as a model protein and obtained from the NCBI GenBank,
were digitized. The digitized data were normalized to the signal energy. The normalized data were decomposed at the
first level, using the mother wavelet bior 5.5, into approximation (a1) and detail (d1) coefficients. The corresponding
coefficients were obtained, and the cross-correlation between the normalized, a1 and d1 coefficients of the amino acid
sequences was calculated. The maximum correlation was selected as the similarity index, and the corresponding cladogram
trees were made. The results of this study showed that a more reliable cladogram tree
can be produced, and better discrimination obtained, from the d1 coefficients than from the normalized
sequences or the a1 coefficients. With the suggested approach, the cladogram tree made from the d1
coefficients not only had greater validity but also improved on the drawback of the classic cladogram tree.
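
The pipeline summarized above (digitize, normalize to signal energy, decompose at the first level, take the maximum cross-correlation as the similarity index) can be sketched roughly as follows. This is only an illustration under several assumptions that the abstract does not specify: the residue-to-number mapping `CODE` is hypothetical, "normalized to the signal energy" is read as scaling to unit energy, the cross-correlation is divided by the geometric mean of the two energies, and the two toy sequences are made up rather than taken from GenBank. To keep the sketch dependency-free it substitutes a single-level Haar transform for the bior 5.5 wavelet used in the study; with PyWavelets installed, `pywt.dwt(x, 'bior5.5')` would be the closer equivalent.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical digitization (an assumption, not the study's mapping):
# each residue is mapped to its 1-based position in AMINO_ACIDS.
CODE = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def digitize(seq):
    """Map a one-letter protein sequence to numbers, skipping unknown symbols."""
    return [float(CODE[aa]) for aa in seq.upper() if aa in CODE]


def normalize_energy(x):
    """Scale a signal so that the sum of its squared samples equals 1."""
    energy = sum(v * v for v in x)
    if energy == 0:
        return list(x)
    scale = math.sqrt(energy)
    return [v / scale for v in x]


def haar_dwt(x):
    """Single-level Haar DWT (stand-in for bior 5.5); returns (a1, d1)."""
    x = list(x)
    if len(x) % 2:
        x.append(x[-1])  # pad odd lengths by repeating the last sample
    root2 = math.sqrt(2.0)
    a1 = [(x[i] + x[i + 1]) / root2 for i in range(0, len(x), 2)]
    d1 = [(x[i] - x[i + 1]) / root2 for i in range(0, len(x), 2)]
    return a1, d1


def max_cross_correlation(x, y):
    """Largest cross-correlation over all lags, divided by sqrt(Ex * Ey)."""
    ex = sum(v * v for v in x)
    ey = sum(v * v for v in y)
    if ex == 0 or ey == 0:
        return 0.0
    best = -math.inf
    for lag in range(-(len(y) - 1), len(x)):
        lo = max(0, lag)
        hi = min(len(x), len(y) + lag)
        best = max(best, sum(x[i] * y[i - lag] for i in range(lo, hi)))
    return best / math.sqrt(ex * ey)


def similarity(seq_a, seq_b, band="d1"):
    """Similarity index for band 'normalized', 'a1' or 'd1'."""
    signals = []
    for seq in (seq_a, seq_b):
        x = normalize_energy(digitize(seq))
        a1, d1 = haar_dwt(x)
        signals.append({"normalized": x, "a1": a1, "d1": d1}[band])
    return max_cross_correlation(signals[0], signals[1])


if __name__ == "__main__":
    # Made-up toy sequences, for illustration only.
    s1, s2 = "MKLLSAVALT", "MKFLSAVGLT"
    for band in ("normalized", "a1", "d1"):
        print(band, round(similarity(s1, s2, band), 4))
```

Computing `similarity` for every pair of sequences would give a similarity matrix per band, which could then be passed to any standard hierarchical clustering routine to draw the corresponding cladogram trees.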