The red algae (Rhodophyta), which are distributed across diverse environments, are essential for understanding plant evolution and stress adaptation ( 1 ). Red algae include multicellular and unicellular species, aquatic and semiterrestrial species, species with sexual and asexual reproduction, and species with alternating and nonalternating generations. Currently, Rhodophyta is divided into two subphyla, seven classes, 760 genera and approximately 4410 species ( 2 ). A growing number of studies have shown that some red algae have unique biological functions that are exploited in industry ( 3 ), medicine ( 4 ) and the food sector ( 5 ). The multicellular marine red alga Chondrus crispus is rich in carrageenan and other polysaccharides and has been found to reduce blood fat and cholesterol to some extent ( 6 ). Both Porphyridium purpureum and Galdieria sulphuraria are unicellular microalgae. The natural phycobiliproteins and polyunsaturated fatty acids of the former have been applied in the food-processing and healthcare fields ( 7 ). The latter’s metal ion absorption from wastewater and unique abiotic stress response mechanisms have made it a focus of research on extremophiles ( 8 ).
Endogenously produced microRNA (miRNA), with a length of ~21 nt, regulates gene expression posttranscriptionally ( 9 ). Studies have demonstrated that miRNA, which is derived from a non-protein-coding MIRNA (MIR) gene, acts as a negative regulator of target gene expression ( 10 ). Within the cell, two processing steps involving Dicer-like enzymes, Argonaute family proteins and other factors convert the primary miRNA (pri-miRNA) into the precursor miRNA (pre-miRNA) and then into the approximately 21-nt mature miRNA ( 11 ). The mature miRNA, incorporated into an RNA-induced silencing complex, then inhibits expression of the corresponding target gene via cleavage or translation inhibition ( 12 ). Studies have shown that miRNA has diverse functions in growth, development, metabolism and evolution ( 13 ).
We previously identified 503 miRNAs belonging to 469 MIR families in the 3 representative red algae described above by high-throughput sequencing combined with bioinformatics analysis ( 14 - 16 ). However, detailed molecular information, including pri-miRNA sequences, pre-miRNA structures, MIRs and their promoters, is still unknown. Genome-wide characterization of miRNAs and MIRs in the 3 red algae has not been carried out. The phylogenetic relationships and evolutionary patterns of miRNAs in red algae have not been uncovered. The main inhibition type of red algal miRNA targets and its experimental validation need to be further explored, and potential key target genes and their pathways in the 3 red algae remain to be mined.
We summarized and revealed, for the first time, the unique molecular and evolutionary characteristics of miRNAs in the 3 red algae, which will provide an important reference for miRNAome research on other algae. Experimental validation of their putative targets, together with prediction of the functions and pathways of key target genes, will lay a foundation for uncovering the regulatory mechanisms of algal miRNAs. The results will help in utilizing red algal resources at the molecular level in the future.
3. Materials and Methods
3.1. Prediction of Pri-MiRNA, MIR and MIR Promoter
Based on the available reference genome information and our previous sRNA high-throughput sequencing data for C. crispus, G. sulphuraria and P. purpureum, we obtained their intergenic (upstream and downstream regions of the miRNA ≤ 2000 bp) and intronic miRNAs. The pri-miRNAs were then screened according to the following filtering conditions: (i) a hairpin secondary structure could be constructed with a deletion of no more than 4 nt by running Mfold 6.5; (ii) the minimum free energy of the structure was ≤ -30 kcal/mol; and (iii) the length of the structure was at least 200 bp. Based on the upstream sequence (≤ 1800 bp) of each pri-miRNA, the MIR, MIR promoter, TATA box and transcription start sites (TSSs) of MIRs were predicted by running Promoter Scan and Eponine software.
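The three filtering conditions above can be sketched as a simple predicate. This is only an illustration under stated assumptions: the `HairpinCandidate` record, its field names and the toy values are hypothetical and do not reproduce the Mfold output format or the scripts used in this study.

```python
from dataclasses import dataclass


@dataclass
class HairpinCandidate:
    """Hypothetical record for one folded candidate (not an Mfold output format)."""
    name: str
    length_bp: int          # length of the folded structure
    mfe_kcal_mol: float     # minimum free energy reported by the folding tool
    max_deletion_nt: int    # largest deletion found in the hairpin


def passes_pri_mirna_filter(c, max_deletion_nt=4, max_mfe=-30.0, min_length_bp=200):
    """Return True if the candidate meets conditions (i)-(iii) from the text."""
    return (
        c.max_deletion_nt <= max_deletion_nt   # (i) deletion of no more than 4 nt
        and c.mfe_kcal_mol <= max_mfe          # (ii) MFE <= -30 kcal/mol
        and c.length_bp >= min_length_bp       # (iii) length of at least 200 bp
    )


# Toy candidates, invented for illustration only.
candidates = [
    HairpinCandidate("cand_A", 250, -45.2, 2),   # meets all three conditions
    HairpinCandidate("cand_B", 180, -50.0, 1),   # too short
    HairpinCandidate("cand_C", 320, -12.5, 0),   # free energy too high
    HairpinCandidate("cand_D", 400, -60.0, 6),   # deletion too long
]
kept = [c.name for c in candidates if passes_pri_mirna_filter(c)]
print(kept)
```

The thresholds are exposed as keyword arguments so that the same predicate could be reused with other cutoffs.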
3.2. Genome-Wide Collinearity Analysis
The previously identified sRNAs, miRNAs and MIRs were mapped onto the reference genomes of C. crispus, P. purpureum and G. sulphuraria using Bowtie 2.0 and visualized with Circos 0.55. Based on sequence alignment analysis, the homologous relationships of MIRs in the 3 red algae were determined by BLASTN searches.
3.3. Statistical Analysis
The miRNAs, including their common sequences, length distribution and nucleotide bias, were analyzed statistically with SPSS and SAS software. The number of MIR156 family members in 11 representative plant species, including the 3 red algae, was obtained by querying the miRBase 22.0 database. Significance was assessed by p values in SAS software. Perl and R were run on Windows and Linux systems.
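As a rough illustration of the descriptive statistics named above (length distribution, first-base bias and AU content), the following Python sketch may help. It is not the SPSS or SAS procedure used here; the function name `summarize_mirnas` and the toy sequences are assumptions, and the AU percentage is computed over all positions.

```python
from collections import Counter


def summarize_mirnas(seqs):
    """Summarize length distribution, first-base usage and AU content."""
    # Normalize to upper-case RNA letters and drop empty entries.
    seqs = [s.upper().replace("T", "U") for s in seqs if s]
    if not seqs:
        raise ValueError("no sequences given")
    length_counts = Counter(len(s) for s in seqs)
    first_base_counts = Counter(s[0] for s in seqs)
    total_bases = sum(len(s) for s in seqs)
    au_bases = sum(s.count("A") + s.count("U") for s in seqs)
    return {
        "length_counts": dict(length_counts),
        "first_base_counts": dict(first_base_counts),
        "au_percent": 100.0 * au_bases / total_bases,
    }


toy = ["UAGC", "UUAA", "GCGCU"]   # toy sequences, not data from this study
summary = summarize_mirnas(toy)
print(summary)
```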
3.4. Conservation and Divergence Analysis
Taking MIR156 as an example, multiple sequence alignment of pre-miR156s and mature miR156s was performed using Clustal W 2.0. The identity percentages were analyzed by a Kolmogorov-Smirnov test in GeneDoc 2.7. The consensus secondary structure of pre-miR156s in plants, including the 3 red algae, was reanalyzed online using Rfam 12.0.
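To make the notion of identity percentage concrete, here is a minimal sketch for two rows of a multiple alignment. The definition used (columns where both rows are gaps are skipped, a gap against a base counts as a mismatch) is an assumption for illustration; GeneDoc may define identity differently.

```python
def percent_identity(row_a, row_b):
    """Percent identity between two aligned sequences of equal length."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(row_a.upper(), row_b.upper()):
        if x == "-" and y == "-":
            continue              # column is a gap in both rows: skip it
        compared += 1
        if x == y:                # a gap against a base never matches
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned rows, invented for illustration.
print(percent_identity("UGACAGAAGA", "UGACAGAAGA"))
print(percent_identity("UGAC-G", "UGACAG"))
```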
3.5. Phylogenetic Analysis
Taking miR156 as an example, the data set for phylogenetic analysis comprised all 140 miR156 precursors in plants, including the 3 red algae. The codon positions included were 1st+2nd+3rd+noncoding. All nucleotide positions containing gaps or missing data were eliminated. A phylogenetic tree was constructed based on the Bayes model of MEGA 5.0. The bootstrap consensus values were calculated from 1,000 replicates.
3.6. Evolutionary Rate Analysis
Taking miR156 as an example, the sequences of miR156s and their targets in plants and the 3 red algae were converted to the required data format according to the instructions of the MEGA 5.0 and Clustal W 2.0 software. Because the relative evolutionary rates of different regions of pre-miRNAs are heterogeneous, the nucleotide divergence (K) of the miR156s processed from the 5’ and 3’ ends was estimated using the baseml module in PAML. The Ka and Ks of miR156 targets were calculated with the yn00 program in PAML. Neutrality tests (Tajima’s D and Fu’s Fs) were performed in Arlequin 3.11.
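For readers unfamiliar with the neutrality statistic, Tajima's D compares the mean number of pairwise differences with the number of segregating sites. The sketch below follows the standard published formula and assumes gap-free aligned sequences; it is an illustration only, not the Arlequin implementation, and the toy alignments are invented.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, gap-free sequences of equal length."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is zero below that)")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    # S: number of segregating (variable) alignment columns.
    s_sites = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if s_sites == 0:
        raise ValueError("no segregating sites; D is undefined")
    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)
    ) / n_pairs
    # Constants from Tajima's derivation.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s_sites / a1) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))


# Toy alignments: only singleton variants versus intermediate-frequency variants.
singletons = ["AAAA", "TAAA", "ATAA", "AATA"]
balanced = ["AA", "AA", "TT", "TT"]
print(tajimas_d(singletons), tajimas_d(balanced))
```

An excess of rare variants, as in the first toy alignment, gives D < 0, which is the pattern interpreted as negative selection or population expansion.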
3.7. Cleavage Inhibition and RLM-RACE Validation
The cleavage inhibition type of miRNA targets was predicted by psRNATarget, requiring no mismatches at positions 9 to 11 of the miRNA::target pair ( 17 ). Conversely, translation inhibition was predicted when at least one mismatch occurred in this complementary region. Total RNA of the 3 red algae was isolated using TRIzol reagent. After the RNA was purified and ligated to the RNA oligo adapter, RNA ligase-mediated rapid amplification of cDNA ends (RLM-RACE) PCR was performed using the FirstChoice® RLM-RACE kit. cDNA synthesis was performed with an oligo(dT) primer and reverse transcriptase according to the instructions of the One Step RT-PCR kit. The first round of amplification was performed on the cDNA using the P1 and P2 primers to generate nonspecific 5’ RACE products. The second round of amplification was performed with the P3 and P4 primers, with three replicates, to generate the final 5’ RACE products. Sequencing was performed by Sangon Biotech Co., Ltd.
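The central-mismatch rule described above can be written as a small classifier. This is a simplified sketch, not psRNATarget itself: the function name is hypothetical, the target site is assumed to be supplied 3'→5' so that position i pairs with position i of the miRNA, and G:U wobble pairs are counted as mismatches.

```python
# Watson-Crick pairs only; treating G:U wobble as a mismatch is an assumption.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def predict_inhibition(mirna_5to3, target_3to5, start=9, end=11):
    """Classify a miRNA::target pair by pairing at miRNA positions start..end."""
    mirna = mirna_5to3.upper().replace("T", "U")
    target = target_3to5.upper().replace("T", "U")
    if len(mirna) < end or len(target) < end:
        raise ValueError("sequences are shorter than the central region")
    central_paired = all(
        (mirna[i - 1], target[i - 1]) in WATSON_CRICK
        for i in range(start, end + 1)      # positions are 1-based
    )
    return "cleavage" if central_paired else "translation"


# Toy example: a 20-nt miRNA and a perfectly complementary site.
complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
mirna = "UGACAGAAGAGAGUGAGCAC"
perfect_site = "".join(complement[b] for b in mirna)
mismatch_at_10 = perfect_site[:9] + "G" + perfect_site[10:]
print(predict_inhibition(mirna, perfect_site))
print(predict_inhibition(mirna, mismatch_at_10))
```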
3.8. GO and KEGG Pathway Enrichment Analysis
GO terms assigned by Blast2GO 3.1 were tested for enrichment with the GO Enrichment Analysis Software Toolkit. The enriched terms (p ≤ 0.05) were categorized into three functional groups with GO Consortium and AmiGO: biological processes (BP), molecular functions (MF) and cellular components (CC). The GO terms common to the 3 red algae were then screened, and the top 17 significant terms (p ≤ 0.01) were retained. Similarly, the top 24 significantly enriched common KEGG pathways (p ≤ 0.01) in the 3 red algae were illustrated with KegSketch software. KO terms were obtained in KOBAS 2.0.
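Term enrichment of this kind is commonly assessed with a one-sided hypergeometric test. The sketch below shows that calculation under the assumption that such a test is appropriate here; the function name, argument names and toy counts are hypothetical, and the exact statistics applied by the tools named above may differ.

```python
from math import comb


def hypergeom_enrichment_p(k, n, annotated, population):
    """P(X >= k) for X ~ Hypergeometric(population, annotated, n).

    k          target genes carrying the term
    n          total target genes
    annotated  genes in the background carrying the term
    population total genes in the background
    """
    upper = min(n, annotated)
    total = comb(population, n)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, n - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Toy counts, invented for illustration.
p = hypergeom_enrichment_p(k=5, n=5, annotated=5, population=10)
print(p, p <= 0.01)
```

With the thresholds used in this section, a term would be retained when its p value is at most 0.05 (first screen) or 0.01 (final list).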
4.1. Molecular and Collinear Characterization of MiRNAs in the 3 Red Algae
Five common miRNAs, namely miR2916, miR4414b, miR6173, miR6194 and miR7536a, were detected among the 503 miRNAs (429 conserved and 74 novel miRNAs) (Fig. 1A). Lengths of 21 and 24 nt were the most frequent (149) and least frequent (5) miRNA sizes, respectively (Fig. 1B). Except for the first base, which was biased towards U, the base composition of these miRNAs was balanced, with an AU percentage of 52.42% (Fig. 1C). By examining all miRNA precursor structures in the 3 red algae, we classified them into 4 types (Table S1A). Type I was the standard stem-loop structure with one closed effective loop at the top and one stem without any bulges or branched loops (Fig. S1A). The other types (II-IV), with diverse loops, stems or bulges at different positions of the precursor, are allosteric structures (Figs. S1B-D), which would lead to diverse mature miRNAs and targets ( 18 ). Furthermore, 19, 50 and 7 pri-miRNAs, with lengths ranging from 201 to 2886 nt, were predicted in C. crispus, P. purpureum and G. sulphuraria, respectively. Additionally, 47 pri-miRNAs were located in the intronic regions of coding genes, and the others were located in intergenic regions (Table S1B). Altogether, 34 MIR promoters and 27 TATA boxes were predicted in the 3 red algae (Table S1C).
Genome-wide collinearity analysis showed that 4.9 M, 5.4 M and 2.4 M sRNAs were detected in the reference genomes of C. crispus (104.8 M), P. purpureum (19.2 M) and G. sulphuraria (13.6 M), respectively (Fig. 1D). However, many more miRNAs (254) were identified in P. purpureum, which suggests that many miRNAs remain unidentified in C. crispus and G. sulphuraria. Moreover, one region of high miRNA expression (84.8~85.6 M) was detected in the C. crispus reference genome, and three such regions (11.2~12.8 M, 15.2~16.8 M and 19.2~20.0 M) were detected in the P. purpureum reference genome. No regions with significantly high miRNA expression were found in the G. sulphuraria reference genome. In addition, 186 homologous MIRs were detected between C. crispus and G. sulphuraria (Table S1D), 72 between C. crispus and P. purpureum (Table S1E), and 64 between P. purpureum and G. sulphuraria (Table S1F).
4.2. Conservation and Divergence of 3 Red Algae MIR Families by Comparing Their MIR156s across Diverse Plant Species
Altogether, 469 MIR families were predicted in the 3 red algae, and the majority of them had only one member (92.92% on average) (Table S2A). Based on the conserved chloroplast gene rbcL, an ML phylogenetic tree ( 19 ) of 11 representative plant species, including the 3 red algae, was constructed and combined with 18 randomly selected MIR families (Fig. 2A). The distribution of family members varied among these species and was unrelated to the clustering of their host species. For example, MIR156, which was detected in all typical plant species with 99 family members in total, had 3 family members in both Physcomitrella patens (a transitional species between aquatic and land plants) and P. purpureum, although the latter is phylogenetically closer to G. sulphuraria. Moreover, of two important food crops, Glycine max had the most MIR156 members (28), whereas Sorghum bicolor had only nine.
Based on 294 known mature miR156 sequences reported in 47 plant species (Table S2B), we found that most of them were processed from the 5’ end of the precursors (84.01%), and 98 sequences were validated. Based on the Kolmogorov-Smirnov statistical test ( 19 ), the proportion of MIR156 precursors with >22% sequence identity was approximately 0.375 (Fig. 2B), whereas for the aligned mature sequences the proportion with ~90% sequence identity was ~0.375 (Fig. 2C). This result indicates that mature miR156 sequences are more conserved than their precursors in plants, including red algae. Furthermore, the consensus structures of the mature and precursor sequences of MIR156 showed their conserved and diverged nucleotide regions (Fig. S2A, B). Moreover, alignment of the mature sequences demonstrated that miR156-5p is more conserved than miR156-3p (Fig. S2C).
4.3. Phylogeny and Evolution of 3 Red Algae MiR156s Across Diverse Plant Species
A MrBayes tree ( 20 ) of pre-miR156s in diverse plant species, including the 3 red algae (Table S3A), was divided into group 1 and group 2 (Fig. 3A). Group 1 was small, comprising ten pre-miR156s. Ahy-pre-miR156b formed one cluster with a long branch, implying a faster evolutionary rate than that of the other nine pre-miR156s in group 1. Group 2 was large, containing all remaining plant pre-miR156s, and was divided into four subgroups, each comprising a variable number of pre-miR156s and branching multiple times. Group 2 (I), a large subgroup with 84 pre-miR156s, produced many branches, which included three pre-miR156s from G. sulphuraria and C. crispus. In group 2 (II), four pre-miR156s from P. purpureum and G. sulphuraria and another 6 pre-miR156s from five different species formed one branch. Group 2 (III), which consisted of 18 pre-miR156s, formed 4 branches that shared one common ancestor. Similarly, in group 2 (IV), the precursors gma-pre-miR156d/i and mtr-pre-miR156e shared a common ancestor with cpa-pre-miR156f and nta-pre-miR156g.
We selected miR156s (including miR156-5p/3p and pre-miR156s) and their targets as examples for estimating evolutionary rates ( 21 ). The overall trends of nucleotide divergence (K) ( 22 ) of mature miR156s and their precursors were similar between red algae and other plants and were consistent with negative selection pressure (Tajima’s D and Fu’s Fs significantly <0) (Fig. 3B, Table S3B). However, the sequences processed from the 3’ end varied greatly, with faster evolutionary rates (higher K values) in the 3 red algae than in other plants (lower K values). Moreover, the level of synonymous substitution (Ks) in miR156 targets was lower than that of nonsynonymous substitution (Ka) in both red algae and other plants (Fig. 3C), which indicates that some positive selection events occurred during the evolution of miR156 targets in the plant kingdom. However, both Tajima’s D and Fu’s Fs ( 23 ) of miR156 targets in the 3 red algae were less than zero, indicating that some negative selection events also occurred.
4.4. Prediction and Validation of MiRNA Target Inhibition Type in the 3 Red Algae
Based on the 18723 non-redundant miRNA targets that we previously predicted in the 3 red algae, we inferred that cleavage was the main type of miRNA target inhibition (68% on average) (Fig. 4A).
We selected ccr-miR156, ppu-miR156g and gsu-miR156a and their corresponding putative targets to validate the cleavage inhibition type in the 3 red algae. The RLM-RACE results showed that after the first PCR amplification with a pair of adapter primers (P1 and P2; Table S4), no specific miR156-cleaved products could be detected (Fig. 4B, Fig. S3). After the second PCR amplification with a 5’ nested adapter forward primer (P3) and a gene-specific reverse primer (P4) (Table S4), we obtained the expected specific products (721 bp for ccr-miR156, 399 bp for gsu-miR156a and 267 bp for ppu-miR156g) (Figs. 4C-E). Moreover, sequence alignment combined with the sequencing data of the products confirmed that the 3 target mRNAs (CHC_T00006037001 for ccr-miR156, XM_005703296.1 for gsu-miR156a and HS828387.1 for ppu-miR156g) were cleaved by miR156s at the 10th position (Figs. 4F-H).
4.5. The Common GO and KEGG Pathway Enrichment
A total of 17 significantly enriched GO terms (p-value ≤ 0.01) common to the miRNA target genes of the 3 red algae were screened (Fig. 5A, Figs. S4A-C). By category, the numbers of terms ranked MF (8) > CC (6) > BP (3). The most highly enriched target genes (number of genes=10034, p-value=0.0013) functioned in plastid formation (GO:0009536) (Table S5A). Furthermore, 24 significantly enriched KEGG pathways (p-value ≤ 0.01) in which the miRNA target genes of all 3 red algae were involved were screened (Fig. 5B, Fig. S4D). Among these pathways, metabolic pathways (ko01100) was highly enriched (30262 genes in total, p-value=0.0078), especially in P. purpureum (Table S5B).
Although a number of miRNAs were identified in the 3 red algae, many potential miRNAs have not yet been found, as suggested by the small number of common miRNAs and by the distribution of miRNAs on the reference genomes. Moreover, the prediction of most pre-miRNAs (73.68%) in intergenic regions indicates that the 3’ UTR is not the main region to which miRNAs map in red algae. The many homologous MIRs detected on the reference genomes indicate that duplication of non-coding genes might have occurred during the evolution of red algae.
MIR family divergence was reflected in two aspects. On the one hand, member numbers varied widely, with the most members (146) in MIR169 and the fewest (2) in MIR5562. On the other hand, the precursors were more diverse than the mature sequences, with miR156-5p playing the leading role in posttranscriptional gene control. As an essential regulator of plant growth and development ( 24 ), mature miR156s are more conserved than pre-miR156s (which show complex phylogenetic relationships and different evolutionary rates) in different plants, including red algae. However, the evolutionary rates of red algal precursors were faster than those of higher plants. Although the evolutionary parameters predicted for miR156 targets were inconsistent, our results suggest that the evolution of miR156s and pre-miR156s in plants, including the 3 red algae, has been stable under negative selection pressure. However, neutral evolution could not be excluded, given the small number of species and the lack of significant test results.
Three putative miR156 targets and their cleavage sites in the 3 red algae were validated, which supports cleavage as the main type of miRNA target inhibition in red algae. Next, we will determine detailed target gene information and validate the functions of these genes in the red algae. Additionally, the highly enriched target genes (GO:0009536), which are closely related to plastid synthesis, will provide important information for the synthesis and application of red algal pigments, especially for P. purpureum, whose B-phycobiliprotein is an important food colorant and a new tool for fluorescence labeling ( 25 ). Because KEGG lacks explicit metabolic annotation for red algae, the significantly enriched metabolic pathways (ko01100), which have no detailed reference map, need to be further studied at the metabolic level.
We comprehensively profiled the unique molecular features of miRNAs and MIR families in C. crispus, G. sulphuraria and P. purpureum by comparing their sequence information with that of other plants. Taking red algal MIR156s as an example, including the family members, miR156 precursors, mature miR156s and miR156 targets, we further revealed the phylogenetic and evolutionary characteristics of red algal miRNAs. MiR156 targets were validated in the 3 red algae, and the common GO terms and KEGG pathways of target genes were enriched. These results will lay a foundation for elucidating unique characteristics in algae and provide insights that can be used to further explore and utilize red algae resources.
The authors would like to thank AJE for editorial assistance with the English. All data used in this study are available in the supplementary files. This work was supported by the National Natural Science Foundation of China (Grant No. 31670208), the Applied Basic Research Programs of Shanxi Province of China (Grant No. 201801D221242), the Technological Innovation Programs of Higher Education Institutions in Shanxi of China (Grant No. 2019L0041) and the Shanxi “1331 Project”.
Data Availability Statement
The raw sRNA sequencing data have been deposited in the NCBI SRA database under the following accession numbers: SRP066538/SRS1172958/SRX1445023/SRR3228749 for C. crispus, SRP071761/SRR3228731/SRS1338707/SRX1631643 for P. purpureum and SRP071954/SRR3234543/SRS1359098/SRX1640152 for G. sulphuraria.
- Brawley SH, Blouin NA, Ficko-Blean E, Wheeler GL, Lohr M, Goodson HV, et al. Insights into the red algae and eukaryotic evolution from the genome of Porphyra umbilicalis (Bangiophyceae, Rhodophyta). Proc Natl Acad Sci U S A. 2017; 114(31):E6361-E6370.
- Yoon HS, Müller K, Sheath RG, Ott FD, Bhattacharya D. Defining the major lineages of red algae (Rhodophyta). J Phycol. 2006; 42(2):482-492.
- Ju X, Igarashi K, Miyashita S, Mitsuhashi H, Inagaki K, Fujii S, et al. Effective and selective recovery of gold and palladium ions from metal wastewater using a sulfothermophilic red alga, Galdieria sulphuraria. Bioresour Technol. 2016; 211:759-764.
- Singh S, Arad SM, Richmond A. Extracellular polysaccharide production in outdoor mass cultures of Porphyridium sp. in flat plate glass reactors. J Appl Phycol. 2000; 12(3):269-275.
- Azanza A. Advances in cultivation technology of commercial eucheumatoid species: a review with suggestions for future research. Aquaculture. 2002; 206:257-277.
- Kulshreshtha G, Burlot AS, Marty C, Critchley A, Hafting J, Bedoux G, et al. Enzyme-assisted extraction of bioactive material from Chondrus crispus and Codium fragile and its effect on herpes simplex virus (HSV-1). Mar Drugs. 2015; 13(1):558-580.
- Dufossé L, Galaup P, Yaron A, Arad SM, Blanc P, Murthy K, et al. Microorganisms and microalgae as sources of pigments for food use: a scientific oddity or an industrial reality? Trends Food Sci Technol. 2005; 16(9):389-406.
- Martinez-Garcia M, Stuart MC, van der Maarel MJ. Characterization of the highly branched glycogen from the thermoacidophilic red microalga Galdieria sulphuraria and comparison with other glycogens. Int J Biol Macromol. 2016; 89:12-18.
- Ma Y, Yu Z, Han G, Li J, Anh V. Identification of pre-microRNAs by characterizing their sequence order evolution information and secondary structure graphs. BMC Bioinformatics. 2018; 19(Suppl 19):521.
- Liu T, Fang C, Ma Y, Shen Y, Li C, Li Q, et al. Global investigation of the co-evolution of MIRNA genes and microRNA targets during soybean domestication. Plant J. 2016; 85(3):396-409.
- Yu Y, Jia T, Chen X. The ‘how’ and ‘where’ of plant microRNAs. New Phytol. 2017; 216(4):1002-1017.
- Thomou T, Mori MA, Dreyfuss JM, Konishi M, Sakaguchi M, Wolfrum C, et al. Adipose-derived circulating miRNAs regulate gene expression in other tissues. Nature. 2017; 542(7642):450-455.
- Chung BY, Deery MJ, Groen AJ, Howard J, Baulcombe DC. Endogenous miRNA in the green alga Chlamydomonas regulates gene expression through CDS-targeting. Nat Plants. 2017; 3(10):787-794.
- Gao F, Nan F, Feng J, Lv J, Liu Q, Xie S. Identification of conserved and novel microRNAs in Porphyridium purpureum via deep sequencing and bioinformatics. BMC Genomics. 2016; 17(1):612.
- Gao F, Nan F, Song W, Feng J, Lv J, Xie S. Identification and characterization of miRNAs in Chondrus crispus by high-throughput sequencing and bioinformatics analysis. Sci Rep. 2016; 6:26397.
- Gao F, Nan F, Feng J, Lv J, Liu Q, Xie S. Identification and characterization of microRNAs in Eucheuma denticulatum by high-throughput sequencing and bioinformatics analysis. RNA Biol. 2016; 13(3):343-352.
- Dai X, Zhao PX. psRNATarget: a plant small RNA target analysis server. Nucleic Acids Res. 2011; 39(Web Server issue):W155-159.
- Ganie SA, Debnath AB, Gumi AM, Mondal TK. Comprehensive survey and evolutionary analysis of genome-wide miRNA genes from ten diploid Oryza species. BMC Genomics. 2017; 18(1):711.
- Barik S, Kumar A, Sarkar Das S, Yadav S, Gautam V, Singh A, et al. Coevolution pattern and functional conservation or divergence of miR167s and their targets across diverse plant species. Sci Rep. 2015; 5:14611.
- Darriba D, Flouri T, Stamatakis A. The state of software for evolutionary biology. Mol Biol Evol. 2018; 35(5):1037-1046.
- Morea EG, da Silva EM, e Silva GF, Valente GT, Barrera Rojas CH, Vincentz M, et al. Functional and evolutionary analyses of the miR156 and miR529 families in land plants. BMC Plant Biol. 2016; 16:40.
- Zhao M, Meyers BC, Cai C, Xu W, Ma J. Evolutionary patterns and coevolutionary consequences of MIRNA genes and microRNA targets triggered by multiple mechanisms of genomic duplications in soybean. Plant Cell. 2015; 27(3):546-562.
- Zuykova EI, Bochkarev NA, Taylor DJ, Kotov AA. Unexpected endemism in the Daphnia longispina complex (Crustacea: Cladocera) in Southern Siberia. PLoS One. 2019; 14(9):e0221527.
- Liu J, Cheng X, Liu P, Sun J. miR156-Targeted SBP-box transcription factors interact with DWARF53 to regulate TEOSINTE BRANCHED1 and BARREN STALK1 expression in bread wheat. Plant Physiol. 2017; 174(3):1931-1948.
- Voznesenskiy SS, Popik AY, Gamayunov EL, Orlova TY, Markina ZV, Postnova IV, et al. One-stage immobilization of the microalga Porphyridium purpureum using a biocompatible silica precursor and study of the fluorescence of its pigments. Eur Biophys J. 2018; 47(1):75-85.