Screening and evaluation of key genes in contributing to pathogenesis of hepatic fibrosis based on microarray data
European Journal of Medical Research volume 25, Article number: 43 (2020)
Hepatic fibrosis (HF), which is characterized by the excessive accumulation of extracellular matrix (ECM) in the liver, usually progresses to liver cirrhosis and then death. To screen differentially expressed (DE) long non-coding RNAs (lncRNAs) and mRNAs, explore their potential functions to elucidate the underlying mechanisms of HF.
The microarray of GSE80601 was downloaded from the Gene Expression Omnibus database, which is based on the GPL1355 platform. Screening for the differentially expressed LncRNAs and mRNAs was conducted between the control and model groups. Then, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed to analyze the biological functions and pathways of the DE mRNAs. Additionally, the protein–protein interaction (PPI) network was delineated. In addition, utilizing the Weighted Gene Co-expression Network Analysis (WGCNA) package and Cytoscape software, we constructed lncRNA-mRNA weighted co-expression networks.
A total of 254 significantly differentially expressed lncRNAs and 472 mRNAs were identified. GO and KEGG analyses revealed that DE mRNAs regulated HF by participating in the GO terms of metabolic process, inflammatory response, response to wounding and oxidation–reduction. DE mRNAs were also significantly enriched in the pathways of ECM-receptor interaction, PI3K-Akt signaling pathway, focal adhesion (FA), retinol metabolism and metabolic pathways. Moreover, 24 lncRNAs associated with 40 differentially expressed genes were observed in the modules of lncRNA-mRNA weighted co-expression network.
This study revealed crucial information on the molecular mechanisms of HF and laid a foundation for subsequent genes validation and functional studies, which could contribute to the development of novel diagnostic markers and provide new therapeutic targets for the clinical treatment of HF.
Hepatic fibrosis (HF), a common health issues worldwide, usually progresses to liver cirrhosis, primary liver cancer and then results in death [1,2,3]. Despite the impact of HF, it is regretful about note that an optimal treatment has yet to be identified. The good news is that advances in molecular biotechnology and high-throughput technology have positioned us on the frontier of understanding the possible molecular mechanism of HF at the gene and protein levels [4,5,6]. However, a comprehensive understanding of the pathogenesis underlying HF still remains to be elucidated, due to its complexity, especially the complicated regulatory mechanisms of gene expression. Hence, understanding the underlying pathophysiology of HF, and the screening of molecular markers in the development of therapeutic targets remains critical.
More and more evidence has indicated that, rather than being transcriptional noise, lots of non-coding RNAs (ncRNAs) can affect the expression levels of target genes [7, 8]. Long non-coding RNAs (lncRNAs), one of a ncRNAs, which play critical roles in transcription, splicing and translation . However, up to now, compared with mRNAs, the function of lncRNAs has not been well annotated [10, 11]. Therefore, to research the roles of lncRNAs by studying the related to targeting genes such as mRNAs, whose functions have been known, have been certificated an efficient way in many kinds of diseases [12,13,14].
In the current study, we used microarray data to identify novel HF-related lncRNAs and mRNAs. Furthermore, bioinformatics technologies, such as Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses, protein–protein interaction (PPI) network and weighted gene co-expression network analysis (WGCNA), were applied to analyze the differentially expressed lncRNAs (DELs) and mRNAs (DEMs). The findings may provide in-depth molecular insight into the pathophysiology of HF.
Materials and methods
The gene expression data of GSE80601 was downloaded from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) database, which is based on the Affymetrix Gene Chip Mouse Exon 1.0 ST Array. It has a total of 10 data in the dataset, including 5 samples in the control group and 5 samples in the HF group, which were induced by carbon tetrachloride .
Gene chip probe re-annotation
A number of lncRNAs represented on the Affymetrix microarray were identified based on the lncRNAs classification pipeline constructed in previous research . First, we gained the latest version of the NetAffx Annotation File (MoEx-1_0-st-v1 Probeset Annotations, CSV Format, Release 36 93 MB, 7/6/16) from the Affymetrix official website. The annotation file was mapped to the MoEx-1_0-st-v1 probe sets ID. Second, among the probe sets from the Refseq database, the IDs beginning with ‘NP’ were wiped off, while the transcript IDs labeled with ‘NR’ were retained. For the probe sets from the Ensembl database, Affymetrix microarray IDs and the corresponding gene type were converted to Ensembl IDs by using the online software BioMart. The probe sets from NONCODE were reserved. Only genes that were annotated as ‘lincRNA’, ‘sense_intronic’, ‘processed_transcript’, ‘antisense’, ‘sense overlapping’, ‘3prime_overlapping_ncRNA’, and ‘misc_RNA’ were retained. Next, probe set IDs annotated as ‘microRNA’, ‘snoRNAs’, ‘pseudogenes’ and other small RNAs were removed.
Based on the annotation of GPL1355 platform, we converted the expression data of probes to the corresponding gene symbols. The average expression value, used for genes corresponding to multi-probes, was calculated using Aggregate function of R, the missing values of probes were added via the KNN method of Impute package of R . Next, using the preprocessCore package of R, quantile normalization was carried out . We used the above steps to get the expression matrix.
Identification of DELs and DEMs
After the raw data from the mRNA-lncRNA Affymetrix microarray were selected, we screened the differential probes by applying the Limma (linear models for microarray data) package in R via T-test with the screening criteria: fold Change (FC) > 1.5 and P value < 0.05. Then, the differential probes representing the DELs and DEMs were re-annotated. Based on these two steps, the DELs and DEMs were obtained.
GO and pathway enrichment analysis
GO biological processes term and KEGG pathway for DEMs were performed using Cytoscape-bingo and Database for Annotation, Visualization and Integration Discovery (DAVID, https://david.ncifcrf.gov/), respectively. Then, A FDR was calculated to correct the p-value and FDR < 0.05 was selected as the threshold.
Construction of PPI network
The online String database (Search Tool for the Retrieval of Interacting Genes, http://string-db.org/) was used to analyze the interactions of proteins. Genes without connection to other genes were removed, with a cut-off criterion of the combined score > 0.90. Then, the PPI network was delineated by Cytoscape (http://cytoscapeweb.cytoscape.org/).
Construction of lncRNA-mRNA WGCNA
The R package WGCNA was applied for network constructions . The steps of network construction were as follows: (1) Network construction: the lncRNA/mRNA weighted co-expression network is fully specified by its adjacency matrix amn, where amn encodes the network connection strength between nodes m and n. To calculate the adjacency matrix, the default approach defines the co-expression similarity Smn as the absolute value of the correlation coefficient between nodes of m and n: Smn = |cor am, an)|. The weighted adjacency amn between two genes is proportional to their similarity on a logarithmic scale with b30.8, log (amn) = b× log (Smn). Adjacency functions were obtained by approximate scale-free topology criterion. Then, we converted the adjacency matrix to the topology matrix. (2) Module detection: the modules with a minimum of 30 lncRNA/genes were identified using Dynamic Tree Cut method and Static Tree Cut method.
Identification of DELs and DEMs
According to our filtering criteria FC > 1.5 and P < 0.05, a total of 254 DELs, including 107 significantly up- and 147 down-regulated lncRNAs; 472 DEMs, including 308 significantly up- and 164 down-regulated mRNAs were screened. Hierarchical clustering was conducted to evaluate the altered lncRNAs and mRNAs expression pattern among samples (Fig. 1).
GO and KEGG analyses
GO and KEGG pathway enrichment analyses were performed to determine the functions of the identified DEMs. GO analysis revealed that a total of 1117 GO terms were regulated by the up-regulated mRNAs, while the down-regulated mRNAs were enriched in 376 GO terms. The top 30 GO terms by the up- and down-regulated mRNAs were displayed in Fig. 2. The up GO-net and down GO-net were shown in Fig. 3. Pathway analysis demonstrated that 128 and 66 pathways were regulated by the up-regulated and down-regulated mRNAs, respectively. The top 30 significant pathways were shown in Fig. 4.
After removing isolated genes, the PPI network of significantly up- and down- regulated genes was delineated using the STRING database. As shown in Fig. 5, the PPI network contained 284 nodes and 1269 edges. Genes with a high degree of connectivity in the PPI network may be potential targets for disease . Genes in the top 10 for the highest degree of connectivity were displayed in Table 1, suggesting that these genes may play crucial roles in the origin and development of HF.
Construction of lncRNA-mRNA WGCNA
The lncRNA-mRNA co-expression network was established to investigate the association between DELs and DEMs. Via WGCNA package in R, we constructed a Cluster Dendrogram (Fig. 6a), then two weighted co-expression sub-networks were identified. The lncRNA-mRNA weighted co-expression network was constructed based on the genes with the top 30 connectivity degrees in two modules and P-value < 0.01. Among these, 24 lncRNAs and 40 mRNAs were involved in the two modules (Fig. 6b and c). The lncRNAs in the blue and turquoise modules were displayed in Tables 2 and 3, respectively.
GO and KEGG analyses of the blue and turquoise modules
Furthermore, we performed GO and KEGG pathway annotations for the blue and turquoise modules. GO analysis revealed a total of 365 and 245 enriched terms in the blue and the turquoise modules, respectively. The top 30 terms are presented in Fig. 7a and c. KEGG analysis revealed that 67 and 48 pathways were enriched in the two modules, respectively. The top 30 pathways are presented in Fig. 7b and d.
Hepatic fibrosis, a reversible lesion, is a common progresses for acute or chronic liver diseases to liver cirrhosis, which is irreversible [21,22,23,24]. Therefore, it is crucial to explore the comprehensive mechanisms of HF to reverse its progress promptly. Over the past several years, as novel regulators in the cellular and biological processes, lncRNAs have attracted much close attention [25,26,27]. However, as previously mentioned, the number of meaningful key lncRNAs identified in HF tissues is still not sufficient.
In this study, we screened 254 DELs, which consisted of 107 up- and 147 down-regulated lncRNAs. Meanwhile, 472 DEMs were examined, which consisted of 308 up- and 164 down-regulated mRNAs. Furthermore, to investigate the functions of HF-related genes, we annotated these genes into GO and KEGG pathway analysis, we found that the DE genes were remarkably enriched in the GO terms involving TGF-β signaling pathway, ECM-receptor interaction, PPAR signaling pathway, etc. It is well known that activated hepatic stellate cells (HSCs) are famous for their role in liver fibrosis. Several studies have shown TGF-β plays a critical role in HSCs activation, and the TGF-β signaling pathway could be a potential therapeutic target for HF [28,29,30]. This classic signaling pathway is activated by TGF-β binding to its receptors located on the cell membrane. The downstream proteins, namely Smad2 and Smad3, are activated by phosphorylation, which further promotes the transcription of genes encoding ECM components, then accelerates the development of liver fibrosis . Similarly, KEGG analysis revealed that the DE genes were mainly enriched in the metabolic process, which were closely related to previous studies. For example, our previous studies indicated that dysregulations of cytochrome P450, sphingolipid, glucose and water electrolyte, fatty acid, amino acid and energy metabolism might be involved in the pathogenesis of HF in rats [32, 33].
The WGCNA is a bioinformatics technology for describing the correlation patterns among genes across microarray samples. It is a powerful systems analysis technology that focuses on the coherent function of gene network modules, which aims at identifying higher-order relationships among gene products . WGCNA analysis of lncRNAs and mRNAs has been successfully utilized to screen functionally enriched modules involved in complex diseases [35,36,37]. Via WGCNA, two lncRNA-mRNA weighted co-expression sub-networks were identified, which included 24 key lncRNAs and 40 mRNAs.
Although several LncRNAs that could affect hepatic fibrosis were already reported including lnc LFAR1 , p21 , MALAT1 , MEG3  and so on. However, the functional characterization of the 24 key lncRNAs is still in its infancy. To infer potential function of LncRNAs, according to ceRNA theory, we studied the related to mRNAs, whose functions have been annotated. For example, lnc Gm38888, NONMMUT006555, NONMMUT074618 were screened out and could interact with Col6a1 in the blue module. It is worth noting that Col6a1, the main component of HF identified in our study, is a type VI collagen gene which has been widely reported to be closely associated with HF. Accumulation of type VI collagen may lead to the distorted architecture and functional damage to the liver in HF [41, 42]. Moreover, KEGG analysis showed that Col6a1 is enriched in ECM-receptor interaction, which has been reported to play a key role in HF, as discussed above.
For the turquoise module, lnc Gm39636, Gm26618, NONMMUT015498, etc., which interact with Itga8. In particular, Itga8 is a specific cell surface marker of mesothelial cell (MC)-derived HSCs and MCs. Itga8+ (MCs) is the maintain phenotype of hepatoblasts in liver development and myofibroblasts increase the expression of Itga8 during liver injury . Recently research suggested that MCs played important roles in liver development, fibrosis, and regeneration . KEGG analysis showed Itga8 is enriched in the pathways of ECM-receptor interaction, FA and PI3K-Akt signaling pathway, which have been demonstrated to be closely related with HF.
In summary, HF was a complicated process with some lncRNAs, mRNAs involved, which were the key genes in contributing to pathogenesis of HF. This study revealed crucial information on the molecular mechanisms of HF and laid a foundation for subsequent gene validation and functional studies, which could contribute to the development of novel diagnostic markers and provide new therapeutic targets for the clinical treatment of HF. However, there were still some limitations in this study. Firstly, the key genes need to be verified in liver tissue specimens of liver fibrosis by RT-PCR. Secondly, further experiments are required to validate their effects and mechanisms in HF. The last but not least, the homology of the key genes needs to be validated for using as potential biomarkers and therapeutic targets in clinical applications.
Availability of data and materials
The datasets used and/or analyzed during the present study are available from the corresponding author on reasonable request.
Li XQ, Ren ZX, Li K, et al. Key anti-fibrosis associated long noncoding RNAs identified in human hepatic stellate cell via transcriptome sequencing analysis. Int J Mol Sci. 2018;19:675.
Molokanova O, Schonig K, Weng SY, et al. Inducible knockdown of procollagen I protects mice from liver fibrosis and leads to dysregulated matrix genes and attenuated inflammation. Matrix Biol. 2018;66:34–49.
Mu M, Zuo S, Wu RM, et al. Ferulic acid attenuates liver fibrosis and hepatic stellate cell activation via inhibition of TGF-beta/Smad signaling pathway. Drug Des Devel Ther. 2018;12:4107–15.
Zhou L, Liu S, Han M, et al. miR-185 inhibits fibrogenic activation of hepatic stellate cells and prevents liver fibrosis. Mol Ther Nucleic Acids. 2018;10:91–102.
He Y, Jin L, Wang J, Yan Z, Chen T, Zhao Y. Mechanisms of fibrosis in acute liver failure. Liver Int. 2015;35:1877–85.
Ge S, Xiong Y, Wu X, et al. Role of growth factor receptor-bound 2 in CCl4-induced hepatic fibrosis. Biomed Pharmacother. 2017;92:942–51.
Ye N, Rao S, Du T, et al. Intergenic variants may predispose to major depression disorder through regulation of long non-coding RNA expression. Gene. 2017;601:21–6.
Szczesniak MW, Makalowska I. lncRNA-RNA interactions across the human transcriptome. PLoS ONE. 2016;11:e0150353.
Andergassen D, Muckenhuber M, Bammer PC, et al. The Airn lncRNA does not require any DNA elements within its locus to silence distant imprinted genes. PLoS Genet. 2019;15:e1008268.
Szafranski P, Dharmadhikari AV, Brosens E, et al. Small noncoding differentially methylated copy-number variants, including lncRNA genes, cause a lethal lung developmental disorder. Genome Res. 2013;23:23–33.
Li F, Huang C, Li Q, Wu X. Construction and comprehensive analysis for dysregulated long non-coding RNA (lncRNA)-associated competing endogenous RNA (ceRNA) network in gastric cancer. Med Sci Monit. 2018;24:37–49.
Li M, Xie Z, Cai Z, et al. lncRNA-mRNA expression profiles and functional networks of mesenchymal stromal cells involved in monocyte regulation. Stem Cell Res Ther. 2019;10:207.
Liu XD, Xie DF, Wang YL, Guan H, Huang RX, Zhou PK. Integrated analysis of lncRNA-mRNA co-expression networks in the alpha-particle induced carcinogenesis of human branchial epithelial cells. Int J Radiat Biol. 2019;95:144–55.
Yang S, Ning Q, Zhang G, Sun H, Wang Z, Li Y. Construction of differential mRNA-lncRNA crosstalk networks based on ceRNA hypothesis uncover key roles of lncRNAs implicated in esophageal squamous cell carcinoma. Oncotarget. 2016;7:85728–40.
Zhang K, Han X, Zhang Z, et al. The liver-enriched lnc-LFAR1 promotes liver fibrosis by activating TGFbeta and Notch pathways. Nat Commun. 2017;8:144.
Gao JR, Qin XJ, Jiang H, Gao YC, Guo MF, Jiang NN. Potential role of lncRNAs in contributing to pathogenesis of chronic glomerulonephritis based on microarray data. Gene. 2018;643:46–54.
Liao SG, Lin Y, Kang DD, et al. Missing value imputation in high-dimensional phenomic data: imputable or not, and how? BMC Bioinform. 2014;15:346.
Heng L, Jia Z, Bai J, et al. Molecular characterization of metastatic osteosarcoma: differentially expressed genes, transcription factors and microRNAs. Mol Med Rep. 2017;15:2829–36.
Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinform. 2008;9:559.
Hsin KY, Ghosh S, Kitano H. Combining machine learning systems and multiple docking simulation packages to improve docking prediction reliability for network pharmacology. PLoS ONE. 2013;8:e83922.
Su TH, Kao JH, Liu CJ. Molecular mechanism and treatment of viral hepatitis-related liver fibrosis. Int J Mol Sci. 2014;15:10578–604.
Seki E, Schwabe RF. Hepatic inflammation and fibrosis: functional links and key pathways. Hepatology. 2015;61:1066–79.
Wang Y, Ma J, Chen L, Xie XL, Jiang H. Inhibition of focal adhesion kinase on hepatic stellate-cell adhesion and migration. Am J Med Sci. 2017;353:41–8.
Poordad FF. Presentation and complications associated with cirrhosis of the liver. Curr Med Res Opin. 2015;31:925–37.
Wang P, Xue Y, Han Y, et al. The STAT3-binding long noncoding RNA lnc-DC controls human dendritic cell differentiation. Science. 2014;344:310–3.
Chen X, Chen Z, Yu S, et al. Long noncoding RNA LINC01234 functions as a competing endogenous RNA to regulate CBFB expression by sponging miR-204-5p in gastric cancer. Clin Cancer Res. 2018;24:2002–14.
Krzyzanowski PM, Muro EM, Andrade-Navarro MA. Computational approaches to discovering noncoding RNA. Wiley Interdiscipl Rev. 2012;3:567–79.
Chen N, Geng Q, Zheng J, He S, Huo X, Sun X. Suppression of the TGF-beta/Smad signaling pathway and inhibition of hepatic stellate cell proliferation play a role in the hepatoprotective effects of curcumin against alcohol-induced hepatic fibrosis. Int J Mol Med. 2014;34:1110–6.
Wang K, Tang Y, Yan F, Zhu J, Li J. Potent inhibition of TGF-beta signaling pathway regulator Abl: potential therapeutics for hepatic fibrosis. J Recept Signal Transduct Res. 2015;35:410–9.
Xu A, Li Y, Zhao W, et al. PHP14 regulates hepatic stellate cells migration in liver fibrosis via mediating TGF-beta1 signaling to PI3Kgamma/AKT/Rac1 pathway. J Mol Med (Berl). 2018;96:119–33.
Zhou L, Dong X, Wang L, et al. Casticin attenuates liver fibrosis and hepatic stellate cell activation by blocking TGF-beta/Smad signaling pathway. Oncotarget. 2017;8:56267–80.
Jiang H. Song J-m, Gao P-f, Qin X-j, Xu S-z and Zhang J-f: metabolic characterization of the early stage of hepatic fibrosis in rat using GC-TOF/MS and multivariate data analyses. Biomed Chromatogr. 2017;31:e3899.
Jiang H, Qin XJ, Li WP, Ma R, Wang T, Li ZQ. Effects of Shu Gan Jian Pi formula on rats with carbon tetrachloride induced liver fibrosis using serum metabonomics based on gas chromatography time of flight mass spectrometry. Mol Med Rep. 2017;16:3901–9.
Zhao W, Langfelder P, Fuller T, Dong J, Li A, Hovarth S. Weighted gene coexpression network analysis: state of the art. J Biopharm Stat. 2010;20:281–300.
Chen Y, Ni H, Zhao Y, et al. Potential role of lncRNAs in contributing to pathogenesis of intervertebral disc degeneration based on microarray data. Med Sci Monit. 2015;21:3449–58.
Gao JR, Qin XJ, Jiang H, Gao YC, Guo MF, Jiang NN. Potential role of lncRNAs in contributing to pathogenesis of chronic glomerulonephritis based on microarray data. Gene. 2017;643:46–54.
Zhang H, Guo L, Zhang Z, et al. Co-expression network analysis identified gene signatures in osteosarcoma as a predictive tool for lung metastasis and survival. J Cancer. 2019;10:3706–16.
Yu F, Lu Z, Chen B, Dong P, Zheng J. Identification of a novel lincRNA-p21-miR-181b-PTEN signaling cascade in liver fibrosis. Mediat Inflamm. 2016;2016:9856538.
Leti F, Legendre C, Still CD, et al. Altered expression of MALAT1 lncRNA in nonalcoholic steatohepatitis fibrosis regulates CXCL5 in hepatic stellate cells. Transl Res. 2017;190(25–39):e21.
Chen MJ, Wang XG, Sun ZX, Liu XC. Diagnostic value of LncRNA-MEG3 as a serum biomarker in patients with hepatitis B complicated with liver fibrosis. Eur Rev Med Pharmacol Sci. 2019;23:4360–7.
Veidal SS, Karsdal MA, Vassiliadis E, et al. MMP mediated degradation of type VI collagen is highly associated with liver fibrosis–identification and validation of a novel biochemical marker assay. PLoS ONE. 2011;6:e24753.
Takahara T, Sollberg S, Muona P, Uitto J. Type VI collagen gene expression in experimental liver fibrosis: quantitation and spatial distribution of mRNAs, and immunodetection of the protein. Liver. 1995;15:78–86.
Ogawa T, Li Y, Lua I, Hartner A, Asahina K. Isolation of a unique hepatic stellate cell population expressing integrin alpha8 from embryonic mouse livers. Dev Dyn. 2018;247(6):867–81.
Lua I, Asahina K. The role of mesothelial cells in liver development, injury, and regeneration. Gut Liver. 2016;10:166–76.
We are grateful to Mr. Qiang Fan (Ao Ji Bio-tech Co. Ltd. Shanghai, China) for providing help with data analysis.
This study was financially supported by the National Natural Science Foundation of China (Grant No. 81973648).
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wu, F., Ning, L., Zhou, R. et al. Screening and evaluation of key genes in contributing to pathogenesis of hepatic fibrosis based on microarray data. Eur J Med Res 25, 43 (2020). https://doi.org/10.1186/s40001-020-00443-0
- Hepatic fibrosis
- Key genes
- Functional enrichment analysis
- Protein–protein interaction