†These authors contributed equally.
Background: This study aims to identify biomarkers through the analysis
of genomic data, with the goal of understanding the potential immune mechanisms
underpinning the association between sleep deprivation (SD) and the progression
of COVID-19. Methods: Datasets from the Gene Expression Omnibus
(GEO) were analyzed using differential gene expression analysis
and several machine learning methods, including Random Forest,
Support Vector Machine, and Least Absolute Shrinkage and Selection Operator (LASSO) regression. The molecular underpinnings of the
identified biomarkers were further elucidated through Gene Set Enrichment
Analysis (GSEA) and AUCell scoring. Results: We identified 41 shared
differentially expressed genes (DEGs) associated with both
the severity of COVID-19 and SD. Using LASSO and SVM-RFE, nine optimal
feature genes were selected, four of which demonstrated high diagnostic potential
for severe COVID-19. The gene CD160, exhibiting the highest diagnostic value, was
linked to CD8
The COVID-19 pandemic represents an unparalleled global health emergency,
affecting populations worldwide with an alarming surge in infections
and fatalities (https://coronavirus.jhu.edu/map.html). This crisis presents
monumental challenges to healthcare systems, economies, and social structures.
While most individuals experience mild symptoms, approximately 20% develop
severe symptoms [1, 2, 3]. Severe COVID-19 is typically characterized by severe
respiratory distress, multi-organ failure, acute respiratory distress syndrome
(ARDS), and pneumonia. Among these, ARDS is one of the most common and
consequential outcomes in severe cases of COVID-19 infection, leading to damaged
alveoli, fluid accumulation in the lungs, and impaired gas exchange, resulting in
severe respiratory distress and hypoxemia. Severe COVID-19 can also lead to
multi-organ failure, particularly affecting vital organs such as the kidneys and
heart, which can be life-threatening [4]. Although the precise mechanisms
underlying the progression of COVID-19 are yet to be completely comprehended,
traditional risk factors such as older age (
In modern society, short sleep duration and sleep deprivation (SD) have become common trends. With
extended working hours, the quality of sleep has declined, becoming a global
health issue. Extensive evidence suggests that inadequate sleep (less than 6
hours per night) and chronic sleep deprivation are closely associated with
chronic diseases, viral infections, overall health status, and mortality rates
[9, 10, 11, 12, 13, 14, 15]. Research indicates that people with poor sleep quality are
more vulnerable to SARS-CoV-2 infection than those with good
sleep quality [16, 17, 18, 19, 20]. Obstructive sleep apnea (OSA), the predominant
sleep-associated respiratory condition, results in recurrent arousals and ensuing
sleep deficiency. Numerous studies have demonstrated the association between OSA
and adverse outcomes of COVID-19, particularly with ICU admission, mechanical
ventilation, and mortality rates [21, 22, 23]. SD elevates the likelihood of
experiencing severe COVID-19, and may result in endocrine disruption, excessive
activation of inflammatory cytokines, and immune system imbalance [18, 24]. This
process exacerbates the dysregulation of the hypothalamic-pituitary-adrenal (HPA)
axis, subsequently triggering an elevation in cortisol secretion, which in turn
impairs immune function, culminating in a diminished immune response [25]. Within
the immune system, CD8
In the field of bioinformatics, gene microarrays and RNA sequencing (RNA-seq) are two important technologies for studying gene expression, each with its own advantages and disadvantages. Microarray technology has matured over the years and is supported by a wealth of tools and algorithms for data processing and analysis. RNA sequencing, on the other hand, is highly sensitive and can detect all transcripts, including newly discovered transcripts, genes, and non-coding RNA. By combining these two types of data, the quality of the data and gene expression can be more accurately assessed. Genomic data are widely used to pinpoint crucial genes and to distinguish the signaling cascades implicated in the progression of COVID-19, which facilitates a deeper understanding of the cellular and molecular processes at play. Recent bioinformatics research has revealed that genes such as PLK1, CDC6, and KIF2C, along with their associated immune pathways, could serve as therapeutic targets for COVID-19 within the peripheral blood mononuclear cells (PBMCs) of subjects infected with SARS-CoV-2. However, no analysis of gene expression data regarding the interplay between SD and the severity of COVID-19 has been reported to date. It is worth noting that long-term SD can lead to increased levels of inflammatory activity markers and abnormal immune cell counts, which is consistent with observations in populations at risk of subsequently developing viral diseases. Therefore, it is crucial to evaluate differences in immune cell proportions to reveal the potential mechanisms underlying the association between SD and the severity of COVID-19.
Our study leverages publicly available databases to obtain whole-genome data from PBMCs, enabling the identification of co-expressed differentially expressed genes (co-DEGs) in SD and COVID-19 cases. We employ the LASSO and SVM-RFE machine learning techniques to identify biomarkers for the diagnosis of severe COVID-19 in the context of SD. Furthermore, we use the CIBERSORT algorithm and single-cell sequencing analysis to investigate the relationship between these diagnostic biomarkers and immune cell composition. Lastly, we use GSEA for GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotation of the differential analysis results to better understand the potential immunological reactions linking SD and severe COVID-19.
Using “Coronavirus COVID-19” and “Sleep Deprivation” as primary keywords, we searched the Gene Expression Omnibus (GEO) and Human Cell Atlas (HCA) databases for relevant datasets [28, 29]. To ensure the integrity and robustness of our data, we selected only high-throughput datasets featuring over 50 COVID-19 patients. The datasets incorporated into our study are GSE215865, GSE37667, and GSE213313 from the GEO database, along with the EGAD00001007959 dataset from the HCA. The datasets are summarized in Table 1.
Dataset | Type | Size | Platform |
GSE215865 | RNA-seq | 266 | GPL24676 |
GSE213313 | Microarray | 83 | GPL21185 |
EGAD00001007959 | CITE-seq | 228 | GPL24676 |
GSE37667 | Microarray | 18 | GPL570 |
Gene symbols in the GSE37667 and GSE213313 datasets were converted from probes according to the probe annotation files in each dataset. Gene symbols in the GSE215865 dataset were converted using the gene annotation file for GRCh38 (Human). Subsequently, we used the “limma” package in R to normalize the expression matrix, generating a normalized gene expression matrix [30]. The workflow of this study is illustrated in Fig. 1.
Workflow of this study for the identification and subsequent validation of diagnostic biomarkers specific to SD-related severe COVID-19. DEG, Differentially Expressed Genes; KEGG, Kyoto Encyclopedia of Genes and Genomes; GO, Gene Ontology; GSEA, Gene Set Enrichment Analysis; SD, sleep deprivation.
In our research, we used the “limma” package, which is effective for large
datasets and suitable for both microarray and RNA-seq data, to detect DEGs in severe
COVID-19 and SD samples. Recognizing the nuanced gene expression variations in
sleep deprivation, we defined significant DEGs with a p-value
The study seeks to identify crucial diagnostic biomarkers that distinguish between non-severe and severe COVID-19 patients. The “glmnet” and “e1071” packages are used to execute Least Absolute Shrinkage and Selection Operator (LASSO) regression analysis and Support Vector Machine Recursive Feature Elimination (SVM-RFE) analysis. LASSO regression reduces prediction errors through k-fold cross-validation, pushing certain regression coefficients to zero and including only non-zero coefficients in the final model. SVM-RFE is a sequential backward selection algorithm that scores each feature, removes the lowest scoring feature, and retrains the model in each iteration, ultimately selecting the necessary number of features. The biomarkers identified by both algorithms were visualized with Venn diagrams.
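To make the shrinkage step concrete, the following minimal sketch shows how an L1 penalty drives some coefficients to exactly zero, so that only features with non-zero coefficients remain. It is an illustration under stated assumptions, not the pipeline used in this study: it minimizes a least-squares loss by cyclic coordinate descent rather than the penalized logistic regression fitted by “glmnet”, it omits the intercept and the k-fold cross-validation used to choose the penalty, and the function names, toy matrix, and penalty value are all hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic updates.

    X is a list of rows (samples x features). Columns are assumed centred
    and non-constant, and y is assumed centred, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0    # sum of feature j times the partial residual
            scale = 0.0  # sum of squared values of feature j
            for i in range(n):
                fitted_others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fitted_others)
                scale += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (scale / n)
    return beta


# Toy data (hypothetical): feature 0 drives y, feature 1 is unrelated to y.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

In this toy case the unrelated feature receives a coefficient of exactly zero and is dropped, while the informative feature is kept with a shrunken coefficient. SVM-RFE reaches a feature subset by a different route (repeatedly discarding the lowest-ranked feature), and the Venn diagram corresponds to intersecting the two resulting gene lists.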
We conducted Receiver Operating Characteristic (ROC) curve analysis on all datasets using the “pROC” package, which was also used to plot the results, to evaluate the accuracy and diagnostic capability of the biomarkers [33].
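As a reminder of what the area under the ROC curve (AUC) measures, the sketch below computes it from the rank-sum (Mann-Whitney) identity: the AUC equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. This is a plain-Python illustration and not the “pROC” implementation; the function name and the toy scores are assumptions made here. It also assumes that higher scores indicate the positive class, so a marker that is lower in severe cases would need its sign flipped first.

```python
def roc_auc(scores, labels):
    """AUC via the rank-sum identity; tied scores receive their average rank.

    labels must be 1 (positive, e.g. severe) or 0 (negative, e.g. non-severe),
    and both classes must be present.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    n_pos = sum(1 for label in labels if label == 1)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, label in zip(ranks, labels) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Toy example (hypothetical scores): one positive is outranked by one negative.
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(auc)  # 0.75
```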
Cell type scores for each sample in the GSE215865 dataset were sourced from the Mount Sinai COVID-19 Biobank (https://www.synapse.org/). These cell type scores were computed using Transcripts Per Million (TPM) as input, in accordance with the procedures suggested by CIBERSORT, and measurements from all technical replicates were amalgamated when calculating batch control sample TPMs. The reference signature matrix LM22 employed contains comprehensive RNA-seq data from PBMCs.
In R, we employed the CIBERSORT package (version 0.1.0) with 1000 permutations to estimate the relative enrichment of specific immune cell populations in each sample of the GSE213313 and GSE37667 datasets [34]. The LM22 gene signature was used as the reference to estimate the relative abundance of 22 immune cell subtypes in each sample. To compare immune cell proportions between sample groups, we performed a Wilcoxon test on the abundance of each of the 22 immune cell types.
The cell annotation table, which includes quality control metrics and cell type information, was obtained from the original publication. We excluded cells with low-complexity libraries (cells whose transcripts aligned to fewer than 200 genes), cells that were likely dead or apoptotic (more than 15% of transcripts derived from mitochondrial genes), and cells with high-complexity libraries (cells whose transcripts aligned to more than 6500 genes). This filtering ensured the high quality of the selected cells and yielded 68,395 cells for subsequent analysis.
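The three exclusion criteria above can be expressed as a single per-cell predicate. The following is a minimal sketch in plain Python, assuming each cell carries a gene count and a mitochondrial transcript percentage; the `CellQC` record, its field names, and the example cells are hypothetical and do not reflect the column names of the actual annotation table.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    barcode: str
    n_genes: int      # number of genes with at least one aligned transcript
    pct_mito: float   # percentage of transcripts from mitochondrial genes


def passes_qc(cell, min_genes=200, max_genes=6500, max_pct_mito=15.0):
    """Return True if the cell survives all three filters described in the text."""
    if cell.n_genes < min_genes:      # low-complexity library
        return False
    if cell.n_genes > max_genes:      # high-complexity library
        return False
    if cell.pct_mito > max_pct_mito:  # likely dead or apoptotic
        return False
    return True


# Hypothetical cells, one failing each criterion and one passing.
cells = [
    CellQC("cell_a", 150, 3.0),
    CellQC("cell_b", 2400, 4.5),
    CellQC("cell_c", 3100, 22.0),
    CellQC("cell_d", 7200, 5.0),
]
kept = [cell.barcode for cell in cells if passes_qc(cell)]
print(kept)  # ['cell_b']
```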
After eliminating mitochondrial and ribosomal genes that could interfere with
cell clustering analysis, we used the Python library SCANPY to screen for 2000
highly variable genes (HVGs) [35], which were used for subsequent clustering
analysis. Subsequently, we employed scvi-tools (single-cell variational inference
tools) to create a Variational Autoencoder (VAE) model instance targeted at all
CD8
We used this pre-trained scVI model as a starting point to speed up the training of subsequent SCANVI models. This enabled us to carry out more detailed subclustering analysis to identify distinct transcriptional states within major cell types.
Fig. 1 depicts the comprehensive data processing workflow utilized in our study. We employed the voom-limma process to identify DEGs between the non-severe and severe COVID-19 cohorts. Additionally, in the SD dataset, we also performed screening between healthy individuals and SD patients. Volcano plots and heatmaps were used to visually demonstrate the distribution of differences (Fig. 2A,B and Supplementary Table 1). From the GSE215865 dataset of COVID-19 samples, we identified 3313 upregulated DEGs. In the GSE37667 dataset of SD samples, we identified 34 upregulated DEGs (Fig. 2C). Furthermore, we identified 3270 downregulated DEGs from the GSE215865 dataset of COVID-19 samples and 68 downregulated DEGs from the GSE37667 dataset of SD samples (Fig. 2C).
DEGs related to severe COVID-19 and SD. (A) Volcano plot and heatmap of the GSE215865 dataset, showing 6583 DEGs in COVID-19 Peripheral Blood Mononuclear Cell (PBMC) samples, with 3313 up-regulated and 3270 down-regulated genes. (B) Volcano plot and heatmap of the GSE37667 dataset, showing 102 DEGs in SD PBMC samples, comprising 34 up-regulated and 68 down-regulated genes. (C) Venn diagram showing 10 genes up-regulated and 31 genes down-regulated in both datasets, suggesting a molecular link between the two conditions.
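The Venn-diagram step amounts to two set intersections, kept separate by direction of change so that a gene counts as shared only if it moves the same way in both conditions. A minimal sketch follows; the function name and the placeholder gene identifiers are hypothetical and are not results from this study.

```python
def shared_degs(up_a, down_a, up_b, down_b):
    """Return (co-upregulated, co-downregulated) genes across two DEG analyses."""
    co_up = set(up_a) & set(up_b)
    co_down = set(down_a) & set(down_b)
    return co_up, co_down


# Placeholder gene identifiers, for illustration only.
up_covid = {"GENE_A", "GENE_B", "GENE_C"}
down_covid = {"GENE_D", "GENE_E"}
up_sd = {"GENE_B", "GENE_X"}
down_sd = {"GENE_D", "GENE_E", "GENE_Y"}

co_up, co_down = shared_degs(up_covid, down_covid, up_sd, down_sd)
n_shared = len(co_up) + len(co_down)
print(sorted(co_up), sorted(co_down), n_shared)
```

Applied to the real DEG lists, the same two intersections would give the 10 co-upregulated and 31 co-downregulated genes reported in panel (C), for 41 shared DEGs in total.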
Our research identified 41 common DEGs by intersecting sets of upregulated and downregulated genes. To understand the biological roles and characteristics of these DEGs, we conducted GO analysis and KEGG pathway enrichment analysis. The GO analysis indicated that these genes are primarily involved in biological processes like “leukocyte-mediated immune response”, are predominantly localized to the “cytoplasmic vesicle lumen”, and are enriched in the molecular function of “carbohydrate binding” (Fig. 3A). KEGG pathway analysis suggested that these DEGs are associated with pathways such as “NK cell-mediated cytotoxicity” (Fig. 3B). The output from GeneMANIA includes the functions of related core genes and their interactions, all of which are associated with specific aspects of the immune system, particularly lymphocyte-mediated immune responses and cellular cytotoxicity (Fig. 3C). Furthermore, they are all closely related to the CD160 gene.
Functional enrichment analysis of the genes shared between SD and severe COVID-19. (A) Gene Ontology (GO) analysis of the shared targets. (B) Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of the shared targets. (C) GeneMANIA analysis showing the functions and interactions of core genes.
To evaluate the potential of the differentially expressed genes (DEGs) as diagnostic biomarkers distinguishing severe from non-severe COVID-19, we applied two machine learning strategies, LASSO and SVM-RFE, to the GSE215865 dataset. First, we applied the LASSO logistic regression algorithm to the 41 shared DEGs with parameter tuning and cross-validation, which selected 10 COVID-19-related feature genes (Fig. 4A,B). Next, we used the SVM-RFE algorithm to screen the same 41 DEGs, identifying 25 genes associated with COVID-19 (Fig. 4C). Intersecting the results of the two algorithms yielded 9 optimal feature genes (CD160, KLRB1, LSM7, LIPT1, MYADM, QPCT, SIGLEC17P, SLC22A4, and ZNF32) (Fig. 4D). Given the high sensitivity of second-generation sequencing data, and in order to build a more accurate diagnostic model for microarray data, we further reduced the number of feature genes. We selected the four most significantly different feature genes (CD160, QPCT, SIGLEC17P, and SLC22A4) for model construction and performed a differential analysis of the transcription levels of these four genes in the validation set (Fig. 4E and Supplementary Fig. 1).
Machine learning algorithms used for gene identification. (A,B)
Coefficient profile plot of Least Absolute Shrinkage and Selection Operator (LASSO) regression and deviance plot from
cross-validation. (C) Support Vector Machine - Recursive Feature Elimination (SVM-RFE) selects and visualizes biomarkers. (D) Genes
identified by both methods. (E) Differential analysis of DEGs in the GSE213313
validation set. All gene significances marked: *p
Upon the analysis and establishment of models using LASSO regression and SVM-RFE
algorithms, we calculated the risk score for each sample based on the diagnostic
results and logCPM values: Risk Score = [(–0.68974068)
Discriminative and predictive power of the SD-related severe COVID-19 diagnostic biomarker model. (A) Diagnostic performance of the biomarkers in the GSE215865 dataset. (B) The risk score model distinguishes severe COVID-19 in two datasets. (C) Unsupervised clustering analysis confirms the consistency of the model across the two datasets.
Utilizing the GSE215865 dataset and deconvolution of the immune cell subtype
expression matrix, our study investigates the diversity among immune cell
subtypes in COVID-19, offering a broad view of the immune response in this
context. In the severe COVID-19 group, the proportions of monocytes, resting
memory, CD8
Differences in the distribution of immune cells in severe and
non-severe cases in the GSE215865 dataset. (A) Box plot illustrates the
differential analysis of relative abundance of immune cells. (B) Heatmap depicts
the distribution of 22 types of immune cells. (C) Bar chart detailing the
abundance of these 22 types of immune cells is provided. Statistically
significant differences are denoted as follows: *p
The study confirms shared characteristics of immune cell distribution between SD
and severe COVID-19 samples using the GSE213313 dataset. It was observed that in
the severe COVID-19 group, the fractions of CD8
Differences in the distribution of immune cells in the GSE213313
dataset and GSE37667 dataset. (A,C) box plot illustrates the differential
analysis of relative abundance of immune cells. (B,D) heatmap depicts the
distribution of 22 types of immune cells. Statistically significant differences
are denoted as follows: *p
Next, this study found a positive link between CD160 and CD8
The expression level of CD160 is correlated with the enrichment
of immune cells in the GSE215865, GSE213313, and GSE37667
datasets. (A,B) The GSE215865, GSE213313, and GSE37667 datasets
reveal a correlation between CD160 expression levels and CD8
Further analysis aims to elucidate the potential functional mechanisms of CD160
in CD8
Molecular Landscape of CD160 in CD8
To delve deeper into the immune pathways linked with CD160 in COVID-19 samples, we conducted pathway enrichment analysis on the gene set exhibiting high CD160 expression (Supplementary Table 4 and Supplementary Table 5). In the GO biological process results, CD160 was positively correlated with the T cell receptor signaling pathway and with ribosome biogenesis (Fig. 10A). This implies a prospective role of CD160 in the regulation of T cell signal transduction and intracellular protein synthesis. In the cellular component results, CD160 was positively correlated with the T cell receptor complex, cytoplasmic ribosome, and plasma membrane receptor complex, suggesting a potential role of CD160 in the assembly or functional regulation of these cellular structures (Fig. 10B). In the molecular function results, CD160 was positively correlated with MHC protein complex binding and structural constituents of the ribosome, further indicating the potential importance of CD160 in immune responses and protein synthesis (Fig. 10C). Additionally, in the KEGG pathway analysis, CD160 was positively correlated with the IgA production, ribosome, and cell adhesion molecule pathways, suggesting its potential involvement in immune processes and cell interactions related to these pathways (Fig. 10D). These results provide insights into the potential immune pathways of CD160 in COVID-19 samples and offer important clues for further unraveling its biological functions.
CD160 and COVID-19-related immune pathways. (A–C) Gene Set Enrichment Analysis (GSEA) of Biological Process (BP), Cellular Component (CC), and Molecular Function (MF) pathways related to DEGs with high CD160 expression in the Gene Ontology (GO) database. (D) GSEA of pathways associated with DEGs displaying high CD160 expression in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database.
Numerous studies have indicated the impact of viral infections on the ribosome [37]. COVID-19, an illness induced by the RNA virus SARS-CoV-2, depends on the host cell’s ribosome for protein synthesis. The ribosome is a complex composed of multiple subunits that translates mRNA into proteins. During cellular growth and development, ribosome biogenesis plays a crucial role as the biological process responsible for generating ribosomes [38]. Therefore, we investigated the correlation between CD160 and genes involved in the ribosome and ribosome biogenesis pathways obtained from the KEGG database. The pathway map shows a close correlation between CD160 and genes encoding small subunit proteins (such as S20e, S5e, S23e) and large subunit proteins (such as L2, L23Ae, L30e, L7Ae) involved in protein synthesis, functional regulation, and signal transduction in the ribosome (Fig. 11A,B). Furthermore, a second pathway map displays a significant correlation between CD160 and genes involved in the formation of 90S pre-ribosome components (such as CK2A, UTP22, Rrp7), rRNA modification (such as NOP1, SNU13, DKC1, NHP2, GAR1), and splicing-related genes (such as UTP24, Rnt1, EMG1, Bms1, KRE33) during ribosome biogenesis (Fig. 11C,D).
CD160’s role in ribosome-related pathways. (A) Ribosome signaling network from KEGG. (B) Heatmap illustrating the gene expression within the ribosome pathway, stratified by levels of CD160 expression. (C) Ribosome synthesis signaling network from KEGG. (D) Heatmap illustrating the gene expression within the ribosome synthesis pathway, stratified by levels of CD160 expression.
Our study thoroughly examines the role of CD160 in COVID-19 progression, utilizing 266 samples from the GSE215865 dataset. Based on CD160’s median expression, samples were divided into two groups, leading to the identification of 3403 upregulated and 2619 downregulated genes (Fig. 12A,B and Supplementary Table 6). Further GO and KEGG analyses demonstrated a significant association between CD160 expression and T cell receptor signaling pathway regulation, implying CD160’s potential influence on the immune response to COVID-19 and its possible therapeutic value (Fig. 12C,D). The research enriches the understanding of COVID-19’s molecular mechanisms and lays a foundation for future studies.
Single-gene analysis and enrichment results of CD160. (A) Volcano plot illustrates the significantly differentially expressed genes between samples with high and low CD160 expression. (B) Heatmap displays the gene expression conditions between samples with high and low CD160 expression. (C) KEGG enrichment analysis of differentially expressed genes. (D) Gene Ontology (GO) enrichment analysis of differentially expressed genes.
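The grouping used for this single-gene analysis is a median split on CD160 expression. A minimal sketch is given below; the sample identifiers and expression values are hypothetical, and assigning samples that fall exactly on the median to the low group is an assumption made here, since the text does not state how ties were handled.

```python
from statistics import median


def split_by_median(expression):
    """Split samples into (high, low) groups around the median expression value.

    expression maps sample ID -> expression value. Samples equal to the
    median go to the low group (an assumption, not taken from the study).
    """
    cutoff = median(expression.values())
    high = [sample for sample, value in expression.items() if value > cutoff]
    low = [sample for sample, value in expression.items() if value <= cutoff]
    return high, low, cutoff


# Hypothetical expression values for four samples.
cd160_expression = {"s1": 2.0, "s2": 5.0, "s3": 3.0, "s4": 8.0}
high, low, cutoff = split_by_median(cd160_expression)
print(high, low, cutoff)  # ['s2', 's4'] ['s1', 's3'] 4.0
```

The high and low groups then serve as the two conditions in the differential expression comparison shown in panels (A) and (B).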
Recent studies suggest that there might be interconnections between different diseases, making the exploration of these relationships a crucial area for future research [39, 40]. COVID-19, a respiratory illness caused by the SARS-CoV-2 virus, is primarily transmitted through droplets and contact [41]. Symptoms following infection include fever, cough, shortness of breath, and in severe cases, it can lead to pneumonia, respiratory failure, multi-organ damage, and even death [41, 42]. Sleep plays a crucial role in maintaining the dynamic balance of the human immune system, while SD could disrupt the function of immune cells, increasing susceptibility to diseases [19, 28, 43]. Therefore, the identification of biological markers related to SD in COVID-19, and the analysis of their association with immune cell enrichment is of great importance for improving the prognosis of COVID-19.
In this research, utilizing the GSE215865 dataset, we pinpointed 6583 DEGs
between non-severe and severe COVID-19 PBMC samples. Additionally, from the
GSE37667 dataset, we identified 102 DEGs between SD and healthy control PBMC
samples. From these, we pinpointed 41 common DEGs between severe COVID-19 and SD.
Through LASSO and SVM-RFE analysis, we shortlisted CD160, SIGLEC17P, QPCT, and
SLC22A4, and validated their diagnostic potential as biomarkers using ROC
analysis and predictive modeling. Applying the CIBERSORT algorithm, we discovered
a decrease in CD8
We also performed single gene GO and KEGG analysis for CD160, revealing a strong
correlation between DEGs in low and high expression samples of CD160 and T cell
receptor signaling pathways, particularly in the regulation of T cell activation
responses. These samples were categorized based on median cut-off values.
Previous research indicates that cancer, developmental disorders, and viral
infections can affect ribosome production [40, 41, 42, 44]. COVID-19 is an
illness triggered by the SARS-CoV-2 virus, an RNA virus that depends on the
ribosomes of the host cell for its protein production [38, 45, 46]. Thus, we
investigated the relationship between CD160 and genes involved in ribosomal
synthesis in the KEGG database. The results showed a high correlation between
CD160 and ribosome function in protein synthesis and signal transduction. To
further substantiate our hypothesis, we conducted an in-depth analysis of
large-scale single-cell data from COVID-19 patients. Throughout our comprehensive
analysis, we observed a notable overexpression of CD160 in both CD8
Blood cells constitute a diverse array of immune cells, forming the first line of defense against infectious and pathogenic microorganisms. The SD and COVID-19 samples used in this study were derived from peripheral blood. Hence, our objective was to explore the potential of mRNA samples in PBMCs as diagnostic biomarkers for SD-associated severe COVID-19. PBMCs represent an intrinsic circulating cell population, and cytokine storms constitute an inflammatory characteristic mechanism of PBMCs [47]. Rapid deterioration and high mortality risks associated with COVID-19 are primarily linked to cytokine storms [48]. Notably, numerous long non-coding RNAs (lncRNAs) can control cytokine transcription [49, 50, 51]. Recent transcriptomic studies on PBMCs from COVID-19 patients indicate markedly elevated expression levels of lncRNA-NEAT1 and lncRNA-TUG1 in patients with severe COVID-19 [52]. In vivo, lncRNA-NEAT1 participates in the activation and polarization of macrophages and T cells [53, 54], while lncRNA-TUG1 participates in macrophage cell cycle regulation and inflammatory response modulation [55]. These functions could potentially influence disease progression.
Recent longitudinal analysis has shown that in severe COVID-19 cases, there is a
consistent elevation of IFN-
Changes in lifestyle and behavioral patterns in modern society have led to a significant reduction in sleep duration. Reports suggest that short-term SD can trigger endocrine disruption and alterations in the balance of the immune system, resulting in a decreased immune defense and increased susceptibility to pathogen infection [43]. An animal study found that sleep and circadian rhythm disruptions can increase the risk of respiratory infections in mice [59]. Furthermore, clinical studies involving healthcare workers have shown that each additional hour of sleep can lower the susceptibility to SARS-CoV-2 infection by 12%, while those with severe sleep difficulties have an 88% elevated likelihood of contracting SARS-CoV-2 [60]. SD exerts a strong modulatory effect on peripheral inflammation levels of immune responses, rendering the body incapable of effectively combating pathogen attacks, thus increasing the risk of infection and disease [61].
Elevated levels of proinflammatory cytokines TNF-
The increase in circulating neutrophils and the decrease in lymphocytes are also considered markers of severe COVID-19 [50, 73, 74]. This aligns with our findings in SD and COVID-19 samples. SD exerts deleterious effects on the immune system, characterized by immune system dysregulation and changes in the dispersion of immune cells in the peripheral circulation. In SD patients, abnormal activation and release of various immune cells and factors may lead to an overactive and reactive immune system, resulting in a cytokine storm.
In the context of this immune dysregulation, if the individual becomes infected with COVID-19, an excessive release of inflammatory cytokines could lead to a systemic inflammatory response, ultimately increasing the risk of severe adverse events associated with COVID-19.
In our research, the diagnostic biomarkers CD160, QPCT, SIGLEC17P, and SLC22A4
have been identified as part of the gene set associated with SD-related severe
COVID-19. CD160 is a glycosylphosphatidylinositol (GPI)-anchored cell surface
glycoprotein, with an extracellular domain belonging to the immunoglobulin
superfamily (IgSF). It is expressed on multiple immune cell types, including
CD8
SARS-CoV-2 is a pathogen that has led to a global pandemic, making the study of
the relationship between sleep deprivation (SD) and the progression of COVID-19
infection highly significant. Our research has discovered that in severe COVID-19
cases associated with SD, the downregulation of CD160 and SIGLEC17P expression
may alter the distribution of immune cells, leading to dysfunctions in NK cells
and CD8
Our study offers insights into the relationship between sleep deprivation (SD) and COVID-19 and, to our knowledge, is the first investigation of molecular biomarkers in blood samples associated with SD-related severe COVID-19. The research also identified a specific gene, CD160, that correlates with the severity of COVID-19 and is connected to sleep deprivation. This direct association of a particular gene with the disease state is a novel finding that provides a new candidate biomarker for understanding and treating COVID-19.
Despite the valuable insights bioinformatics brings to the study of SD-associated severe COVID-19, we acknowledge certain limitations in our current research. First, the sample size was relatively limited; the SD dataset (GSE37667) enrolled only nine individuals, which may lead to instability in the results, particularly in studies involving complex diseases. Second, owing to heterogeneity between experimental platforms and sequencing techniques, technical variation and batch effects are present, potentially affecting the reliability and reproducibility of the biomarkers. Third, our study may be constrained by the inherent limitations of the algorithms and statistical methods employed, such as overfitting or insufficient predictive power. Fourth, our study is confined to bioinformatics analysis of gene expression and lacks validation in in vivo and in vitro models, as well as support from prospective clinical studies. Therefore, we must place a heightened focus on sound research design and data quality to ensure the reliability and reproducibility of our research outcomes.
In summary, while sleep disorders represent one of the most common comorbidities
during the COVID-19 pandemic, comprehensive research investigating the
immunological connection between the two remains scarce to date. In a pioneering
application of bioinformatics techniques, we developed a risk prediction model
and subsequently confirmed the efficacy of CD160, QPCT, SIGLEC17P, and SLC22A4 as
diagnostic biomarkers for severe COVID-19 in the context of SD. Utilizing the
CIBERSORT method, we identified a positive correlation between CD160 and CD8
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.
Conceptualization, JP and XZ; methodology, JP and XZ; software, JP and XZ; validation, JP, XZ, WZ, and HL; formal analysis, JP and XZ; investigation, JP and XZ; resources, JP and XZ; data curation, EW, WZ, and HL; writing—original draft preparation, JP and XZ; writing—review and editing, EW, WZ, and HL; visualization, EW and WZ; supervision, EW and HL; project administration, EW and HL; funding acquisition, EW and HL; “Supervision” means guiding research and ensuring quality, encompassing providing expert opinions and monitoring progress. “Project management” involves planning and coordinating the project to achieve objectives, including time management, team collaboration, and maintaining data quality and integrity. All authors read and approved the final manuscript. All authors have participated sufficiently in the work and agreed to be accountable for all aspects of the work.
This study was approved by the Medical Ethics Committee of Xiangya Hospital, Central South University (Ref. N.15725). All specimens were processed in compliance with relevant legal and ethical standards.
We acknowledge the technical support from the laboratory staff.
This work was supported by grants from the National Key Research and Development Program of China [Project No. 200YFC2005300]; Natural Science Foundation of Hunan Province [Project No. 2020JJ4900].
The authors declare no conflict of interest.
Publisher’s Note: IMR Press stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.