In the standard genetic code, which is used by a wide variety of organisms, every amino acid (aa) except Met and Trp is encoded by two to six synonymous codons. However, synonymous codons are not used at equal frequencies within an organism, a phenomenon designated codon usage bias (1-3). Codon usage bias among synonymous codons has been documented for numerous genes in various species (3-9). Synonymous codon usage bias has been reported to be associated with various biological factors (6, 10-13). Further analysis found that synonymous codon usage patterns are related to position along a coding sequence (14), the balance of strong versus weak base-pair bonding (15, 16), maintenance of DNA and RNA secondary structure (17), and translational efficiency and fidelity (6).
Aujeszky’s disease, caused by Pseudorabies virus (PRV; also known as Suid herpesvirus 1, SuHV-1), is a frequently fatal disease with a global distribution that mainly affects swine and incidentally affects other domestic and wild animals (18-22). PRV is a swine alphaherpesvirus that belongs to the genus Varicellovirus, subfamily Alphaherpesvirinae (20, 23-26).
The PRV UL31 gene is an 816-base pair sequence that encodes a putative polypeptide of 271 aa residues, designated the UL31 protein. With regard to the role of the UL31 gene product in the herpesvirus life cycle, the homologues of PRV UL31 in herpes simplex virus 1 (HSV-1; UL31) (27-36) and Epstein-Barr virus (EBV; BFLF2) (37-39) have been extensively studied; however, the precise function of the PRV UL31 gene, as well as its codon usage bias, is poorly understood.
3. Materials and Methods
3.1. Virus Species and Gene Sequences
The nucleotide sequences of the PRV Becker strain UL31 gene (GenBank accession no. JF797219) and the UL31-like genes of 48 reference herpesviruses were obtained from GenBank.
3.2. Molecular Phylogenetic Tree of UL31-Like Proteins of the 49 Reference Herpesviruses
To compare the UL31-like proteins of the 49 reference herpesviruses, multiple sequence alignment and phylogenetic analysis (rooted tree) were carried out using DNAStar (version 7.0, DNAStar, Inc.) (40).
3.3. Codon Usage Analysis of the PRV Becker Strain UL31 Gene and other 48 Reference Herpesviruses
For each gene, codon usage was estimated using the CAI, CHIPS and CUSP programs of EMBOSS. The effective number of codons (ENc), the G+C content at synonymous third codon positions (GC3s) and the relative synonymous codon usage (RSCU) were analyzed (41, 43). ENc values range from 20 (when only one codon is used per aa) to 61 (when all synonymous codons are used with equal frequency). ENc is therefore a useful measure of general codon usage bias: the lower the ENc, the higher the codon bias. GC3s is a useful measure of the extent of base composition bias and represents the frequency of G+C at the synonymous third position of codons, excluding Met, Trp and the stop codons. A heat map representing the clustering of RSCU values was generated with the CIMminer software tool (44), with each row representing a specific codon and each column representing a different species. Clustering was based on Euclidean distance and the average linkage method. Trend curves were fitted using logarithmic functions: y = -18.564 ln(x) + 36.503, y = 1.8179 ln(x) + 33.257 and y = 0.4539 ln(x) - 2.4428 were used to calculate the points for ENc-GC3s, ENc-Length and GC3s-Length, respectively.
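The RSCU and GC3s definitions above can be illustrated with a minimal Python sketch. It is not the EMBOSS implementation used in this study; the function names and the short test sequence are hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codons(seq):
    """Count in-frame codons of a coding sequence (RNA 'U' is read as 'T')."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def rscu(counts):
    """RSCU = observed count / mean count of the synonymous codons of that aa."""
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:  # skip stop codons, Met and Trp
            continue
        total = sum(counts[c] for c in codons)
        if total == 0:  # amino acid absent from the sequence
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def gc3s(counts):
    """Fraction of G or C at the third position of synonymous codons."""
    synonymous = [c for c, aa in CODON_TABLE.items() if aa not in "*MW"]
    total = sum(counts[c] for c in synonymous)
    gc = sum(counts[c] for c in synonymous if c[2] in "GC")
    return gc / total if total else float("nan")


# Hypothetical toy sequence, not the UL31 gene.
toy_counts = count_codons("ATGGCCGCCGCGGCTAAATAA")
print(rscu(toy_counts)["GCC"], gc3s(toy_counts))
```

An RSCU value above 1 means a codon is used more often than expected under equal usage of its synonyms, and a value below 1 means it is used less often.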
3.4. Statistical Analysis
The correlations between codon usage variation among the PRV UL31 gene and the UL31-like genes of the 48 reference herpesviruses and four indicators (CAI, ENc, GC3s and gene length) were estimated using the SPSS 12.0 software package.
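As an illustration of this kind of correlation analysis, the sketch below computes a Pearson correlation coefficient between two indicators. The text does not state which coefficient was computed in SPSS, so the choice of Pearson's r is an assumption, and the function name and input values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only, not data from Table 1.
enc_values = [38.0, 45.0, 52.0, 59.0]
gc3s_values = [0.95, 0.80, 0.60, 0.45]
print(pearson_r(enc_values, gc3s_values))
```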
4.1. Molecular Phylogenetic Tree of the UL31-Like Proteins in PRV Becker Strain and the Reference Herpesviruses
A phylogenetic tree based on the deduced PRV UL31 protein and the UL31-like proteins of the reference herpesviruses was generated. The proteins could be preliminarily separated into different subfamilies, i.e. Alphaherpesvirinae, Betaherpesvirinae and Gammaherpesvirinae (20, 21). The UL31 proteins of the PRV Becker, Bartha, Kaplan and Ea strains are first placed in a monophyletic clade, which then clusters with Bovine herpesvirus 1 (BoHV-1) and BoHV-5 of the genus Varicellovirus, subfamily Alphaherpesvirinae; subsequently, they cluster with other members of the reference species.
4.2. Codon Usage Analysis of the UL31 Gene in PRV Becker Strain and the Reference Herpesviruses
Codon usage in the PRV UL31 gene and its homologous genes is highly nonrandom. However, the codon usage bias parameters of the UL31 gene differ somewhat among the PRV Becker, Kaplan, Bartha and Ea strains. As shown in Table 1, the codon adaptation index (CAI) values of the herpesviruses vary from 0.602 to 0.842, with a mean of 0.720 and a standard deviation (SD) of 0.062, and their ENc values range from 37.345 to 59.619, with a mean of 45.644 and an SD of 9.958. Compared with the other species, the ENc values of the PRV strains are much lower (ENc < 40); the codon usage bias of the UL31-like genes of the 49 reference species, especially PRV, is therefore relatively high. If the codon usage pattern of a gene is shaped only by G+C compositional constraint, the gene will lie on the continuous curve that represents random codon usage (45). The ENc values of each UL31-like gene in the 49 reference herpesviruses are plotted against their corresponding GC3s in Figure 1.
The plots of gene length against ENc (Figure 1 B) and against GC3s (Figure 1 C) show the distribution for each gene. Among the UL31-like genes of the 49 reference herpesviruses, shorter and longer genes appear to have similar variance in ENc and GC3s values. This suggests that gene length may not play a role in shaping the codon usage bias of the 49 reference species. Similar results were also reported for P. aeruginosa, duck plague virus and SARS coronavirus (5, 46, 47).
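The continuous curve for random codon usage is not written out in the text. A commonly used form, from Wright's effective number of codons method, is ENc = 2 + s + 29 / (s^2 + (1 - s)^2), where s is GC3s. The sketch below assumes that this is the curve meant by reference (45); the function name is hypothetical.

```python
def expected_enc(gc3s):
    """Expected ENc when codon usage depends only on GC3s (assumed Wright form).

    gc3s is a fraction between 0 and 1.
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


# Points of the expected curve at a few GC3s values.
for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(s, round(expected_enc(s), 3))
```

A gene whose observed ENc falls well below this curve at its GC3s value is usually interpreted as being subject to forces other than base composition alone.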
4.3. Variation in the PRV Becker Strain UL31 Gene Codon Usage and aa Composition
The CAI, ENc and related measures present the overall codon bias of the PRV UL31 gene, while Table 2 shows the codon preference of the UL31 gene in the PRV Becker strain. Moreover, Cys, Asp, Glu, His, Ile, Lys, Asn, Gln and Tyr also show a high level of variation in codon usage bias, even though they have only two-fold or three-fold coding degeneracy. Altogether, although the most and the least frequently used codons differ among the aa, the PRV Becker strain UL31 gene shows a marked preference for one or more particular codons for each aa. However, a similar bias also exists at the first codon position, indicating that the actual situation is more complicated.
Fract is the proportion of a given codon among all synonymous codons encoding the same amino acid. Frequency is the number of occurrences of each codon per 1000 codons in the coding sequence of the individual gene.
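A minimal sketch of how these two table quantities relate to raw codon counts is given below. It assumes that the frequency column is expressed per 1000 codons; the function name and the counts are hypothetical and do not come from Table 2.

```python
def fraction_and_per_thousand(counts, family):
    """Return {codon: (fract, per_thousand)} for one synonymous codon family.

    counts: codon -> number of occurrences in the whole coding sequence
    family: list of synonymous codons for a single amino acid
    """
    total_codons = sum(counts.values())
    family_total = sum(counts.get(c, 0) for c in family)
    result = {}
    for codon in family:
        n = counts.get(codon, 0)
        fract = n / family_total if family_total else 0.0
        per_thousand = 1000.0 * n / total_codons if total_codons else 0.0
        result[codon] = (fract, per_thousand)
    return result


# Hypothetical counts for a 10-codon sequence.
toy_counts = {"AAA": 1, "AAG": 3, "GCC": 6}
print(fraction_and_per_thousand(toy_counts, ["AAA", "AAG"]))
```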
4.4. Phylogenetic Persistence in Codon Usage Bias of the PRV Becker Strain UL31 Gene
To provide a visual representation of the variation in codon bias (48-50), we carried out a cluster analysis (Figure 2) of the codon usage patterns of the PRV Becker strain UL31 gene and the UL31-like genes of the 48 reference herpesviruses, based on their RSCU values. The figure shows that the PRV Becker, Kaplan, Bartha and Ea strains differ from the other herpesviruses. They first cluster together and form a separate branch; they then cluster with BoHV-1 and BoHV-5 of the genus Varicellovirus, subfamily Alphaherpesvirinae, and with Cercopithecine herpesvirus 2 (CeHV-2) of the genus Simplexvirus, subfamily Alphaherpesvirinae; subsequently, they cluster with other members of the reference species. Overall, this result indicates the internal relationship between the codon usage patterns of PRV and those of other herpesviruses, particularly the alphaherpesviruses, and suggests that the codon usage pattern of PRV differs from those of other members of the reference species: the more distant the genetic relationship, the greater the expected variation in codon usage bias, and vice versa. Consequently, we conclude that the codon usage pattern of PRV is fairly close to that of the members of the genus Varicellovirus of the Alphaherpesvirinae.
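Clustering of RSCU profiles by Euclidean distance with average linkage can be sketched in plain Python as follows. This is an illustrative implementation, not the CIMminer code; the function names, species labels and RSCU vectors are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length RSCU vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(vectors):
    """Agglomerative clustering with average linkage.

    vectors: name -> list of RSCU values (same codon order for every name).
    Returns the merges in order as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(name,) for name in vectors]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Average of all pairwise distances between the two clusters.
            d = sum(euclidean(vectors[i], vectors[j]) for i in a for j in b)
            d /= len(a) * len(b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges


# Hypothetical two-codon RSCU profiles for three made-up genes.
profiles = {"geneA": [2.0, 0.0], "geneB": [1.9, 0.1], "geneC": [0.5, 1.5]}
for left, right, dist in average_linkage(profiles):
    print(left, right, round(dist, 3))
```

Genes with similar RSCU profiles merge early at small distances, which corresponds to neighbouring branches in a cluster diagram.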
4.5. Comparison of the UL31 Gene Codon Usage in PRV Becker Strain with those of E. coli, Yeast and Human
Generally, the codon usage bias of a gene remains conserved, to a certain extent, across species. Here, the codon usage of the PRV Becker strain UL31 gene was compared with those of E. coli, yeast and human to determine which would be the most appropriate host for optimal expression. Table 2 shows that 33 codons have a PRV-to-yeast ratio higher than 2 or lower than 0.50, 24 codons have a PRV-to-E. coli ratio higher than 2 or lower than 0.50, and 22 codons have a PRV-to-human ratio higher than 2 or lower than 0.50, indicating that large differences in codon preference exist with respect to all three hosts. Although there were slightly fewer differences in codon usage between PRV and human, the difference is unlikely to be statistically significant, and experimental studies would be necessary to identify the most suitable expression system for this virus.
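The ratio criterion used in this comparison can be sketched as follows. The function name and the frequency values are hypothetical, and skipping codons with a zero or missing host frequency is an assumption, since the text does not say how such cases were handled.

```python
def divergent_codons(virus_freq, host_freq, high=2.0, low=0.5):
    """List codons whose virus-to-host frequency ratio is > high or < low.

    virus_freq, host_freq: codon -> frequency per 1000 codons.
    """
    divergent = []
    for codon, v in virus_freq.items():
        h = host_freq.get(codon)
        if not h:  # ratio undefined when the host value is missing or zero
            continue
        ratio = v / h
        if ratio > high or ratio < low:
            divergent.append(codon)
    return divergent


# Hypothetical per-thousand frequencies, not values from Table 2.
virus = {"GCC": 60.0, "GCA": 2.0, "AAG": 30.0}
host = {"GCC": 28.0, "GCA": 16.0, "AAG": 32.0}
print(divergent_codons(virus, host))
```

Counting the returned codons for each candidate host gives the kind of tally reported above (33, 24 and 22 codons).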
In this study, the synonymous codon usage data showed certain differences among herpesviruses from different species, and the results revealed that: a. the PRV Becker strain UL31 gene and the UL31-like genes of the 48 reference herpesviruses adopt comparatively similar codon usage patterns; and b. the PRV Becker strain UL31 gene prefers codons with C or G at the third codon position. Furthermore, the bias towards C and G is consistent with the high G+C content of the PRV Becker strain UL31 gene. Since the UL31 gene of the PRV Becker strain is GC-rich, it is reasonable that C- and/or G-ending codons are prevalent in the gene. With regard to codon usage variation, Table 1 shows that the UL31 genes of alphaherpesviruses whose natural hosts are mammals, such as PRV, BoHV-1, BoHV-5, Human herpesvirus 1 (HHV-1), HHV-2, CeHV-1, CeHV-2, CeHV-16 and Saimiriine herpesvirus 1 (SaHV-1), have a stronger correlation than the UL31 genes of the reference alphaherpesviruses with avian hosts, such as GaHV-1, GaHV-2, GaHV-3, Meleagrid herpesvirus 1 (MeHV-1) and Anatid herpesvirus 1 (AnHV-1). Clarifying the fundamental mechanisms underlying codon usage patterns is critical for understanding the evolution of species (51, 52). The phylogenetic tree (Figure 3) and the cluster analysis results (Figure 2) show that PRV is evolutionarily closer to BoHV-1 and BoHV-5 than to GaHV-1, PsHV-1 and others. Likewise, its codon usage pattern is closer to those of BoHV-1 and BoHV-5 than to those of other members of the reference species. Accordingly, we conclude that species has a certain effect on codon usage preference, but that this effect is less substantial than the influence of gene function, and that the codon usage bias of the PRV UL31 gene is closely connected with its gene function.
Bioinformatics analysis reveals that the PRV UL31 protein is a member of the PHA03328 superfamily (data not shown), which comprises the nuclear egress lamina protein UL31 and is conserved throughout the herpesviruses. Although the biological characteristics of most herpesviral UL31 homologues are poorly understood at present, a common property is the interaction between UL31 and UL34 and their co-localization at the nucleus or nuclear rim, which occurs in different herpesvirus subfamilies, such as the alphaherpesviruses HSV-1 (32) and HSV-2 (53), the betaherpesvirus murine cytomegalovirus (MCMV) (54), and the gammaherpesviruses EBV (38) and Kaposi's sarcoma-associated herpesvirus (KSHV) (55). Another interesting feature is their significance for primary envelopment and nuclear egress in all herpesvirus subfamilies (32, 35, 39, 56). Therefore, given the crucial roles played by the counterparts of PRV UL31 in HSV, MCMV, EBV and KSHV during infection, and given their phylogenetic conservation, PRV UL31 may play a similar role in the process of infection. However, the actual biological roles of UL31 in the PRV life cycle are not yet known, and these aspects await further clarification of its functions in viral replication and of the interactions between PRV and its host.
Among the codon usage patterns of E. coli, yeast and human, no clear determination of the most appropriate host could be made. Although the codon usage of PRV matched that of human slightly better than that of the other hosts, the differences were not significant. Nevertheless, in a recent study, we successfully expressed the PRV UL31 protein in the human embryonic kidney 293T expression system (unpublished data).
Taken together, analysis of the codon usage pattern of the PRV UL31 gene and comparison of codon preference between the PRV UL31 gene and other species can offer a foundation for understanding the mechanisms underlying biased usage of synonymous codons.
There is no acknowledgment.
Implication for health policy/practice/research/medical education: These results may further our understanding of the evolution, pathogenesis and function of PRV, and may contribute to herpesvirus research or even to studies of other viruses.
Authors’ contributions: MSC and MLL contributed equally to this study; both carried out most of the experiments and wrote the manuscript. MSC and MLL critically revised the manuscript and the experimental design. JYZ, JHC, BYW and ZL helped with the experiments. All the authors read and approved the final manuscript.
Funding/Support: This work was supported by grants from the Natural Science Foundation of Guangdong Province (S2013040016596); Science and Technology New Star in Zhu Jiang, Guangzhou City (2013J2200018); National Natural Science Foundation of China (31200120); Medical Scientific Research Foundation of Guangdong Province, China (B2012165); Foundation for Distinguished Young Talents in Higher Education of Guangdong, China; First Batch of Youth Learning Backbone Teacher in Guangzhou Medical University; and Students’ extracurricular scientific and technological activities in Guangzhou Medical University (2012A039 and 2012C007).
Financial Disclosure: The authors declare that they have no competing interests.