†These authors contributed equally.
Background: Uterine corpus endometrial carcinoma (UCEC) is among the most common malignant tumors of the female reproductive system. Because UCEC is highly heterogeneous, patients’ postoperative survival differs greatly. Mitochondrial activity has been shown to differ substantially between UCEC and normal endometrium. This study aimed to create better tools for predicting UCEC patient survival in order to support more accurate and effective treatment strategies. Methods: UCEC RNA sequencing data were obtained from the Cancer Genome Atlas project and comprised 539 UCEC samples and 35 tumor-adjacent tissue samples. Differentially expressed genes (DEGs) were identified with the R package ‘limma’. The mitochondrial protein genes were subjected to Cox regression analysis using the least absolute shrinkage and selection operator (LASSO). Differences in biological processes between the patient groups were examined through gene set variation analysis (GSVA). Results: Gene set enrichment analysis (GSEA) revealed that mitochondria-related pathways were more active in endometrial cancer than in tumor-adjacent tissue. Through LASSO Cox and multivariate Cox screening, we obtained 14 mitochondrial protein genes (CKMT1B, CYP27A1, GPX1, GPX4, GRPEL2, HPDL, MALSU1, MRPS5, NDUFC1, OPA3, OXSM, POLRMT, SAMM50, TOMM40L) related to patient prognosis. Based on the expression levels of these 14 genes in each patient, we developed a new scoring algorithm. Compared with the traditional TNM classification system, the algorithm predicted patient prognosis more accurately. Moreover, a nomogram was constructed by combining the scoring algorithm with the patient’s clinical features. Conclusions: The scoring algorithm based on mitochondrial gene expression can assist clinicians in predicting the postoperative survival of patients, allowing them to devise more precise treatment programs.
Uterine corpus endometrial carcinoma (UCEC) is among the most common malignant tumors of the female reproductive system [1]. Its incidence has risen in recent years, which has been linked to changes in lifestyle and living environment as well as the irregular use of hormones [2]. The therapeutic options currently considered for UCEC include chemotherapy, hormone therapy, and radiotherapy, as well as surgery and targeted therapy [3]. UCEC shows extensive tumor heterogeneity, and specific subtypes, such as recurrent and endometrial serous cancer, carry a poor prognosis [4]. Treatments tailored to the individual patient would effectively improve quality of life and survival time. Precise postoperative treatment, however, presupposes an effective tool for prognosis prediction, and such tools for UCEC are still lacking in clinical practice.
Matter-energy conversion occurs predominantly in mitochondria. Malignant transformation therefore causes marked changes in the dynamics, metabolic mode, and transport mode of the organelle, along with its response to oxidative stress [5]. In cancer cells, the intensity of glucose metabolism is greatly enhanced, resulting in the production of more intermediate metabolites [6]. Because cancer cells have an increased energy demand, mitochondrial oxidative phosphorylation (OXPHOS) is also significantly enhanced [7]. The increased mitochondrial metabolism in cancer cells creates an overabundance of reactive oxygen species (ROS), which kills normal cells and promotes tumor development [8]. Mitochondrial dysfunction is considered one of the key contributors to the onset and progression of cancer; hence, a tumor’s progression and malignancy can be assessed from this perspective.
The RNA expression profiles of UCEC and tumor-adjacent endometrium were compared in this study. The mitochondrial localization proteins were assessed, and the data indicated that both the expression of the genes encoding these proteins and mitochondrial metabolic activity were considerably elevated in UCEC. In addition, patient prognosis was found to be associated with a large number of mitochondrial protein genes. This indicates that a new prognostic prediction algorithm can be developed from the expression levels of mitochondrial protein genes. The purpose of this study is to help doctors provide precise and personalized medical care for postoperative UCEC patients.
The UCEC RNA sequencing (RNA-seq) dataset used in this research was obtained from the Cancer Genome Atlas project (TCGA) [9]. Data without clinical stage and prognostic follow-up were excluded. As a result, RNA-seq data and corresponding clinical information from 539 UCEC tissues and 35 tumor-adjacent tissues were included in the study. The GDC data transfer tool was used to download the normalized expression profile data of the UCEC samples in Fragments per Kilobase per Million (FPKM) format. The data were then assembled into an expression matrix. The patients’ general characteristics are shown in Table 1.
Clinical characteristics | Value |
---|---|
Total case number | 539 |
Median age (range) | 64 (31–90) |
Tumor stage (%) | |
I | 336 (62.3) |
II | 51 (9.5) |
III | 123 (22.8) |
IV | 29 (5.4) |
Pathological grade (%) | |
G1 | 97 (18.0) |
G2 | 110 (20.4) |
G3 | 311 (57.7) |
Undefined | 21 (3.9) |
Histology (%) | |
Endometrioid | 403 (74.7) |
Mixed serous and endometrioid | 22 (4.1) |
Serous | 114 (21.2) |
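The FPKM unit in which the expression data were downloaded can be illustrated with a short sketch. It uses the standard definition of FPKM (fragments mapped to a gene, scaled by gene length in kilobases and by library size in millions of mapped fragments). The function name and the example numbers are illustrative assumptions and do not come from the TCGA pipeline.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragments: fragments mapped to the gene in one sample
    gene_length_bp: gene (or transcript) length in base pairs
    total_mapped_fragments: library size of the sample
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bp -> kb) * 1e6 (fragments -> millions of fragments)
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


# Made-up example: 500 fragments on a 2 kb gene in a 25-million-fragment library.
print(fpkm(500, 2000, 25_000_000))
```

Because FPKM already corrects for gene length and sequencing depth, values for the same gene can be placed side by side in one expression matrix across samples.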
The UCEC and control samples were compared using the R package ‘limma’ to identify differentially expressed genes (DEGs). Gene set enrichment analysis (GSEA) was performed with the R package ‘clusterProfiler’ [10]. For this study, the five pathways most closely associated with mitochondria were chosen. The hallmark pathway gene sets REACTIVE OXYGEN SPECIES, HYPOXIA, GLYCOLYSIS, FATTY ACID METABOLISM, and OXIDATIVE PHOSPHORYLATION, as well as the MITOCHONDRIAL PART pathway, were retrieved from the Molecular Signatures Database (MSigDB) [11]. Mitochondrial protein genes were obtained from the MitoCarta3.0 web server (https://www.broadinstitute.org/mitocarta) [12], which contains data on 1136 human genes encoding mitochondrial localization proteins.
The mitochondrial protein genes were subjected to Cox regression analysis using the least absolute shrinkage and selection operator (LASSO), and genes not linked to prognosis were filtered out. The predictive risk score formula was built from the genes retained by LASSO at the penalty parameter selected by cross-validation. A multivariate Cox regression model was then applied for further screening of the target genes. Patient survival was estimated with the Kaplan-Meier method, and its statistical significance was assessed with the log-rank test. Whether the risk score acts as an independent prognostic factor was examined by univariate and multivariate Cox regression analyses of the risk score-based prediction model. Furthermore, time-dependent receiver operating characteristic (TDROC) curves were generated with the ‘survivalROC’ package to examine the predictive ability of the risk score at 1, 3, and 5 years [13].
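The Kaplan-Meier estimate used for the survival comparison can be sketched in a few lines. This is a minimal illustration of the product-limit estimator under the usual right-censoring assumption, not the code used in the study; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time of each patient
    events: 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= removed
    return curve


# Toy data: five patients, two of them censored (event = 0).
for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 0, 1]):
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk up to their last follow-up but never trigger a drop in the curve, which is why the estimator remains usable with incomplete follow-up.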
Differences in biological processes were analyzed through GSVA [14]. The gene set file ‘c2.cp.kegg.v7.3.symbols.gm’ was retrieved from MSigDB, and the analysis was run with the ‘GSVA’ package in R (version 4.1.2, R Foundation for Statistical Computing, Vienna, Austria). FDR < 0.05 was chosen as the significance criterion.
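A false discovery rate threshold of this kind is usually applied to adjusted p-values. The sketch below assumes the Benjamini-Hochberg procedure, which the text does not name explicitly, so it should be read as one common way to obtain FDR-adjusted values rather than as the method of this study. The p-values in the example are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four pathways.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

In this toy example only the first pathway passes the 0.05 cut-off after adjustment, although three raw p-values were below 0.05.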
Normally distributed data were compared with the unpaired Student’s t-test or one-way analysis of variance (ANOVA), whereas data that did not follow a normal distribution were compared with the Mann-Whitney U test or the Kruskal-Wallis test. The predictive nomogram was constructed with the R package ‘rms’, following the guide by Iasonos et al. [15]. The data were visualized and subjected to statistical tests using GraphPad Prism 6.0 (GraphPad Software Inc., La Jolla, CA, USA).
To determine whether mitochondrial activity in endometrial cancer differs from that in control endometrium, a GSEA pathway analysis was conducted. The TCGA-UCEC mRNA sequencing data were obtained, and the five biological pathways most relevant to mitochondrial activity were selected from MSigDB: mitochondrial proteins, fatty acid metabolism, glycolysis, reactive oxygen species, and oxidative phosphorylation. As shown in Fig. 1, the mitochondrial protein, glycolysis, and oxidative phosphorylation pathways are significantly upregulated in endometrial cancer (adjusted p-value
Fig. 1. Endometrial cancer and tumor-adjacent controls were examined for differences in mitochondrial activity through GSEA. Five mitochondrial activity-associated gene sets were assessed. A curve above an enrichment score of 0 indicates activation of the gene set in endometrial cancer, whereas a curve below 0 indicates activation in the control uterine epithelium. UCEC, uterine corpus endometrial carcinoma; NES, normalized enrichment score; GSEA, gene set enrichment analysis; p.adjust, adjusted p-value.
LASSO Cox regression analysis was used to reduce the dimensionality of the 1136 mitochondrial protein genes from the ‘MITOCHONDRIAL PART’ gene set. Fig. 2a shows the convergence of the regression coefficients. Ten-fold cross-validation with random sampling showed that the model comprising twenty-three genes performed best. These twenty-three genes were then entered into a multivariate Cox regression model for further screening, which identified the fourteen genes most significant for patient prognosis, as shown in Fig. 2b.
Fig. 2. Prognosis-related mitochondrial protein gene screening. (a) The upper and lower panels show the convergence of the LASSO Cox regression coefficients and the coefficient profile plot against log(lambda) in the LASSO model, respectively. (b) Hazard ratios and p-values of the fourteen genes from the multivariate Cox regression model.
A gene signature based on the fourteen mitochondrial protein genes was developed from the model’s regression coefficients and used for prognostic prediction in individuals with endometrial cancer.
A risk score was calculated for each patient with endometrial cancer. The median risk score was used as the threshold to classify patients into high- and low-risk groups. Fig. 3a shows the distribution of risk scores and patient survival status. The expression of the fourteen genes in the two groups is shown in Fig. 3b. According to the Kaplan–Meier curves, high-risk patients had considerably lower survival rates (Fig. 3c).
Fig. 3. Construction and validation of the prognostic signature. (a) The distribution of patients’ risk scores and the associated survival status. (b) Heatmap of the expression profiles of the prognostic signature genes. (c) Kaplan-Meier curves of overall survival in the two risk groups.
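The scoring and grouping steps can be sketched as follows, assuming the usual linear form in which each gene’s expression is weighted by its Cox regression coefficient and the weighted values are summed. The coefficients of the fourteen genes are not reported in this text, so the gene names, coefficients, and expression values below are hypothetical placeholders. Assigning patients exactly at the median to the low-risk group is also an assumption.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Sum of coefficient * expression over the signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score."""
    cutoff = median(scores.values())
    # Assumption: a score equal to the median is labeled 'low'.
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical two-gene signature; a real signature would list all 14 genes.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 6.0, "GENE_B": 2.0},
    "P3": {"GENE_A": 4.0, "GENE_B": 0.0},
}
scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
print(scores)
print(split_by_median(scores))
```

A positive coefficient raises the score as expression increases, and a negative coefficient lowers it, so the sign of each coefficient indicates whether a gene behaves as a risk or a protective factor in the model.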
The data were assessed by univariate and multivariate Cox regression analyses. As shown in Fig. 4a, the risk score was a strong independent risk factor (p
Fig. 4. Verification of the independent prognostic value of the signature risk score. (a) Forest plot of the univariate and multivariate Cox regression analyses of the risk score and clinical factors. (b) Time-dependent receiver operating characteristic curves at 1, 3, and 5 years. ROC, receiver operating characteristic; AUC, area under curve.
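The AUC reported for a given time point can be understood as the probability that a patient who died before that time has a higher risk score than a patient who survived beyond it. The sketch below computes this quantity directly at a fixed horizon and simply drops patients censored before the horizon. This is a deliberate simplification: it is not the estimator implemented in ‘survivalROC’, which accounts for censoring, and the function name and data are hypothetical.

```python
def auc_at_horizon(scores, times, events, horizon):
    """Simplified AUC of a risk score for death by `horizon`.

    Cases: patients with an observed death at or before the horizon.
    Controls: patients still under follow-up after the horizon.
    Patients censored before the horizon are ignored (simplification).
    """
    cases = [s for s, t, e in zip(scores, times, events) if e == 1 and t <= horizon]
    controls = [s for s, t in zip(scores, times) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # Count case-control pairs ranked correctly; ties count as half.
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy data: risk scores, follow-up in years, and death indicators.
scores = [0.9, 0.2, 0.3, 0.1, 0.5]
times = [1, 2, 6, 7, 2]
events = [1, 1, 0, 1, 0]
print(auc_at_horizon(scores, times, events, horizon=5))
```

An AUC of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that ranks every early death above every long-term survivor.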
Enrichment analysis of patients with different risk scores was carried out through GSVA to assess their characteristic biological behaviors. The ‘c2.cp.kegg.v7.2.symbols.gmt’ pathway collection was obtained from MSigDB. All pathways were analyzed, and those that were statistically significant are shown in Fig. 5. Each row represents a biological pathway and each column a patient. The patients are ordered from left to right by increasing risk score. Red represents upregulation of a pathway, while blue represents downregulation. The results show that cell proliferation-related pathways were more active in patients with high risk scores (high mitochondrial activity) than in those with low risk scores and in the control group. In contrast, cytokine-cytokine receptor interaction, lipid metabolism, and WNT signaling pathways were upregulated in low-risk patients. These findings need further in vivo and in vitro verification and may open new research directions in the treatment of endometrial cancer.
Fig. 5. Heatmap of biological process-related activation status. Patients with different risk scores were analyzed using GSVA to determine the activation status of biological processes.
A nomogram integrating the risk score with clinical prognostic variables was developed to predict patient survival at 3, 5, and 10 years, with the aim of increasing prognostic accuracy and easing practical application (Fig. 6a). In practice, the points for each factor are read from the patient’s value on the corresponding line, and the sum of the points across all factors gives the patient’s predicted prognosis. The accuracy of the nomogram was then assessed. The calibration plots for three and five years indicated that the nomogram’s predictions were close to those of an ideal model (Fig. 6b). According to the decision curve analysis shown in Fig. 6c, the clinical benefit of the nomogram exceeded that of the clinical characteristics alone. Combining the risk score with clinical factors was therefore more useful for estimating prognosis and could benefit more patients.
Fig. 6. Establishment and validation of the prognostic nomogram. (a) A predictive nomogram for estimating the probability that patients with endometrial cancer survive for 3, 5, and 10 years. (b) Calibration plots of the risk score-based nomogram showing the consistency between observed and predicted 3- and 5-year outcomes. (c) Decision curve analyses of the nomogram for 3- and 5-year risk assessment.
Globally, endometrial cancer is one of the most prevalent malignancies of the female reproductive system. Factors that have been associated with an increased risk in endometrial cancer include the histological type, size, and grade of the tumor as well as the stage of the disease, lymph node metastasis, and myometrial invasion [16]. The disease usually affects postmenopausal women, and the prognosis of patients with late-stage UCEC is very poor, which calls for more in-depth investigation. As with other tumors, the occurrence and development of UCEC involve complex molecular mechanisms [17].
Abnormal metabolism is a common biological characteristic of cancer cells. Because mitochondria are at the heart of cell metabolism, it is important to examine the state of mitochondria in cancer. The Warburg effect, proposed in the twentieth century, is the best-known explanation of changes in mitochondrial energy metabolism in cancer. It states that the majority of cancer cells obtain their energy through glycolysis [18]. As research progressed, the Warburg effect was increasingly challenged. It was discovered that aerobic oxidation was the primary source of energy for cancer cells and that increased glycolysis produced more intermediate metabolites [6]. The colon cancer cell line SW620, for example, showed higher OXPHOS but lower glycolysis [7]. Furthermore, increased glycolysis does not always lead to cancer growth. AIF deletion boosted glycolysis and decreased oxidative phosphorylation in lung cancer cells, but hindered cancer cell growth [19]. In addition to OXPHOS, increased breakdown of amino acids and fatty acids was detected in cancer cells [20]. The increased metabolism of cancer cell mitochondria produces an overabundance of ROS, which promotes normal cell death while enhancing tumor growth [8]. Changes in the dynamics of cancer mitochondria were also detected. Mitochondrial functions are maintained by autophagy, division, and fusion. In cancer, mitochondrial activity increased, and more mitochondria were found per tumor cell [21]. For these reasons, this research focused on the changes in UCEC mitochondria.
This study examined the differences in mitochondrial protein genes and mitochondrial metabolism between UCEC and tumor-adjacent endometrium. As expected, fatty acid metabolism, glycolysis, OXPHOS, and ROS pathways were all highly active in UCEC. Meanwhile, the transcriptional activity of a large number of mitochondrial protein genes was increased in UCEC, and many of these genes were linked to patient prognosis. LASSO Cox regression linked 23 mitochondrial protein genes with the prognosis of individuals with UCEC. Through multivariate Cox regression, a UCEC prognostic prediction tool based on 14 genes was obtained (CKMT1B, CYP27A1, GPX1, GPX4, GRPEL2, HPDL, MALSU1, MRPS5, NDUFC1, OPA3, OXSM, POLRMT, SAMM50, and TOMM40L).
This mitochondrial gene-based scoring tool can be used as an independent prognostic factor in addition to patient age, stage, and grade, as shown in Fig. 4a. Its accuracy in predicting patient survival at 1, 3, and 5 years after surgery is higher than that of other clinical features, as shown in Fig. 4b. This indicates that the tool can further differentiate the survival probability of patients with the same tumor stage and grade, which gives it high clinical application value.
The biological characteristics of UCEC with active mitochondria can be further elucidated through the GSVA pathway analysis shown in Fig. 5. Pathways such as DNA REPLICATION, CELL CYCLE, and BASAL TRANSCRIPTION FACTORS were considerably upregulated in patients with increased mitochondrial scores. This supports the view that the higher the mitochondrial activity of UCEC, the stronger the proliferative capacity of the tumor.
To increase prognostic accuracy and facilitate practical application, a nomogram was developed. The nomogram is a common technique for determining cancer prognosis that integrates patients’ parameters for prognostic prediction by means of statistical approaches. Because it combines several factors, a nomogram is more accurate than any single clinical feature of a patient [15, 22]. The nomogram was assessed by calibration plot analysis and decision curve analysis. The results showed that the nomogram has greater prediction accuracy and can benefit more patients than estimating patient prognosis from a single factor.
This research still has a few limitations: (1) The transcriptome data used in the model were obtained through sequencing, so the accuracy of the expression data should be verified by microarray and quantitative polymerase chain reaction. (2) Because this study entered gene expression data into Cox regression as categorical variables, an appropriate cut-off value needs to be determined. (3) This is a retrospective study with a heterogeneous study population, so the results may be biased. The conclusions of this study and the effectiveness of the developed tool need to be confirmed by future clinical studies.
In this research, a UCEC prognostic prediction tool and nomogram based on mitochondrial localization genes were developed for the first time. This model is more representative than general metabolic models because mitochondria are engaged in most metabolic pathways. Furthermore, the study identified additional genes that have not been explored in UCEC but are linked to patient prognosis. These genes serve as a starting point for more research into the mechanism of UCEC. Notably, the findings of this study can assist clinicians in giving postoperative UCEC patients precise and individualized medical care.
In conclusion, the mitochondrial protein gene-based scoring algorithm proposed in this study is a valuable tool for predicting UCEC patient survival. It may also help guide chemotherapy by evaluating the metabolic status of tumors. However, more clinical trials are needed to corroborate the findings of this research.
The raw data generated in this study are available upon reasonable request from the corresponding author.
WL designed the research study, analyzed the data, and contributed to the conception of the work. HL and DS performed the operation, collected the data, and contributed to the acquisition and interpretation of the data. MT provided valuable research ideas, contributed to the design of the project, and offered guidance throughout the study. All authors contributed to editorial changes in the manuscript. All authors have participated sufficiently in the work and agreed to be accountable for all aspects of the work. All the authors approved the final version of the manuscript.
This research focuses on the re-mining and analysis of public databases. The RNA sequencing data and clinical information used in the study were obtained from TCGA-UCEC. This study does not raise any ethical concerns.
We would like to thank all members of our research team for their enthusiastic engagement, as well as all participants for their excellent work. Thanks to all the peer reviewers for their opinions and suggestions.
This research received no external funding.
The authors declare no conflict of interest.
Publisher’s Note: IMR Press stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.