A predictive model for the risk of sepsis within 30 days of admission in patients with traumatic brain injury in the intensive care unit: a retrospective analysis based on MIMIC-IV database

Purpose Traumatic brain injury (TBI) patients admitted to the intensive care unit (ICU) are at high risk of infection and sepsis. However, few studies have addressed the prediction of secondary sepsis in TBI patients in the ICU. This study aimed to build a prediction model for the risk of secondary sepsis in these patients and to provide useful information for clinical diagnosis and treatment.
Methods Using the Medical Information Mart for Intensive Care IV (MIMIC-IV) database, version 2.0, we identified TBI patients admitted to the ICU and took them as the study cohort. The extracted data included patient demographic information, laboratory indicators, complications, and other clinical data. The study cohort was divided into a training cohort and a validation cohort. In the training cohort, variables were screened by LASSO (least absolute shrinkage and selection operator) regression and stepwise logistic regression to assess the ability of each feature to predict the incidence of sepsis. The screened variables were included in the final logistic regression model. Finally, the decision curve, calibration curve, and receiver operating characteristic (ROC) curve were used to test the performance of the model.
Results A total of 1167 patients were included in the study and were randomly divided into training (N = 817) and validation (N = 350) cohorts at a ratio of 7:3. In the training cohort, the following features were identified as key predictors of secondary sepsis in TBI patients in the ICU and were included in the final model: acute kidney injury (AKI), anemia, invasive ventilation, Glasgow Coma Scale (GCS) score, lactic acid, and blood calcium level. The areas under the ROC curve in the training cohort and the validation cohort were 0.756 and 0.711, respectively. The calibration curve and ROC curve show that the model has favorable predictive accuracy, while the decision curve shows that the model offers a favorable clinical net benefit.
Conclusion We have developed a nomogram model for predicting secondary sepsis in TBI patients admitted to the ICU, which can provide useful predictive information for clinical decision-making.


Background
Traumatic brain injury (TBI) refers to impaired brain function or other pathological changes in the brain [1] caused by external forces, including concussion and traumatic brain herniation. At present, its incidence is the highest among all common nervous system diseases: every year, 50 million to 60 million new TBI cases are reported worldwide, causing a huge public health burden [2]. In 2016, the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis 3.0) defined sepsis as a syndrome of physiological, pathological, and biochemical abnormalities induced by infection and accompanied by acute organ dysfunction [3]. In 2017, the number of sepsis patients was estimated at 48.9 million, and sepsis-related deaths exceeded 11 million, accounting for 19.7% of all deaths worldwide that year [4]. TBI patients in the ICU need more comprehensive treatment, including continuous intracranial pressure monitoring, decompressive craniectomy, early enteral nutrition, assisted ventilation, and fluid therapy to maintain arterial pressure and internal organ perfusion [5]. They are at high risk of infection with drug-resistant bacteria and of secondary sepsis [6]. The high-risk factors for secondary sepsis among TBI patients in the ICU are as follows: (1) hospital-acquired pneumonia (HAP) is the most common complication of long-term bed rest [7]; (2) because of prolonged disturbance of consciousness and neurological deficit, TBI patients in the ICU require long-term nursing and various invasive procedures, such as tracheotomy, mechanically assisted ventilation [8], emergency surgery, nasogastric tube placement [9], urinary catheterization, and deep vein catheterization [10], all of which are high-risk factors for infection; (3) secondary stress ulcer, early epilepsy, and deep vein thrombosis after TBI lead to disease progression and prolong hospitalization [11]. TBI patients admitted to the ICU often remain in a state of disturbed consciousness for a long time and are unable to report their condition promptly. Therefore, the onset of infection and sepsis in these patients is usually insidious, and it is challenging for clinicians to identify secondary sepsis in such patients without delay.
Early identification of sepsis in TBI patients in the ICU is necessary. Clinical prediction models can provide effective information for clinicians to identify high-risk patients, make clinical decisions, and take countermeasures. However, there are few studies on the prediction of sepsis in TBI patients in the ICU. Hence, this study established a model for predicting the occurrence of sepsis in TBI patients in the ICU. The model has good prediction performance and can provide effective prediction information.

Data source
Data on patients diagnosed with TBI and admitted to the ICU were extracted from the MIMIC-IV 2.0 database (https://physionet.org/content/mimiciv/2.0/), and patients with intracranial injury were identified by ICD-9 and ICD-10 codes (ICD-10: S06; ICD-9: 85). To keep the model simple, we chose variables that are readily available in clinical practice. The collected data include patient demographic data (gender, marital status, race, age), complications (acidosis, acute kidney injury, anemia, atrial fibrillation, depression, diabetes, esophageal reflux, heart failure, hyperlipidemia, hypertension, thrombocytopenia, toxic encephalopathy, urinary tract infection), drug treatment information (dopamine, epinephrine, norepinephrine), operative procedure information (invasive ventilation, nasogastric tube, urinary catheter), and laboratory indicators (lactate, basophils, eosinophils, lymphocytes, monocytes, neutrophils, anion gap, bicarbonate, calcium, creatinine, urea nitrogen, international normalized ratio, prothrombin time, activated partial thromboplastin time, hematocrit, hemoglobin, mean corpuscular hemoglobin, mean corpuscular hemoglobin concentration, mean corpuscular volume, platelets, red blood cells, red blood cell distribution width, white blood cells). For patients with multiple admissions, data from the first hospitalization were used. For repeated examinations, data from the first examination within 24 h of admission were used. To prevent reverse causality, information on surgical procedures and medications recorded after a patient developed sepsis was considered invalid and was not included in the analysis. Sepsis was diagnosed according to Sepsis 3.0 [3]. Sepsis events occurring more than 30 days after admission were not included in the analysis. Patients who were not admitted to the ICU, who had sepsis before ICU admission, or who had missing data were excluded. Informed consent was not required for this study because the database was approved by the institutional review boards of MIT and Beth Israel Deaconess Medical Center.

Statistical analysis
The createDataPartition function of the caret package was used to split patients into training and validation cohorts at a ratio of 7:3, so that outcome events were randomly distributed between the two cohorts. To limit over-fitting, most of the data were used to train the model, and the remaining smaller part was held out for validation. Variables were described separately in the training and validation datasets. Categorical variables were described as percentages (%); continuous variables with a non-normal distribution were presented as medians and quartiles, and continuous variables with a normal distribution were expressed as means and standard deviations (mean (SD)). The chi-square test was used to compare categorical variables, and the t-test or a nonparametric test was used to compare continuous variables between two groups. In the training cohort, LASSO regression and stepwise logistic regression based on the Akaike Information Criterion (AIC) were used for feature selection. Statistically significant variables (P < 0.05) were identified as independent risk factors and were included in the final logistic regression model, and a corresponding nomogram was plotted. The area under the ROC curve (AUC) was used to assess the prediction accuracy of the model, the calibration curve was used to assess the agreement between predicted and observed values, and the decision curve was used to analyze the clinical benefit of the model. The tableone package was used for data description, the glmnet package for LASSO regression analysis, the rms package for plotting the nomogram and calibration curve, and the pROC package for plotting the ROC curve. R 4.2.1 (https://www.r-project.org) was used for all statistical analyses. A two-sided P value < 0.05 was considered statistically significant. This study was designed and analyzed with reference to the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis) statement [12].
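The outcome-stratified 7:3 partition described above can be sketched as follows. This is an illustrative Python analogue, not the authors' R code: the helper name `stratified_split`, the seed, and the toy label vector are hypothetical, and caret's createDataPartition may differ in details such as rounding, so the resulting cohort sizes need not match those reported in the paper.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=42):
    """Split indices into training and validation sets, class by class.

    Each outcome class is shuffled and cut separately, so the event rate
    is similar in both cohorts. Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, valid_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train_idx.extend(indices[:cut])
        valid_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(valid_idx)

# Toy outcome vector with the class sizes reported for the study cohort:
# 535 patients with secondary sepsis (1) and 632 without (0).
labels = [1] * 535 + [0] * 632
train_idx, valid_idx = stratified_split(labels)

train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
valid_rate = sum(labels[i] for i in valid_idx) / len(valid_idx)
print(len(train_idx), len(valid_idx), round(train_rate, 3), round(valid_rate, 3))
```

Splitting within each outcome class keeps the proportion of sepsis cases nearly identical in the two cohorts, which a purely random split does not guarantee.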

Characteristics of the study cohort
A total of 5437 TBI patients were identified from the database. After excluding patients with missing data (N = 1681), patients not admitted to the ICU (N = 2576), and patients diagnosed with sepsis before ICU admission (N = 13), 1167 patients (535 with secondary sepsis) were included in the study (Fig. 1): 817 (385 with secondary sepsis) in the training cohort and 350 (150 with secondary sepsis) in the validation cohort. The study cohort was predominantly male (study cohort: 63.3%; training cohort: 64.1%; validation cohort: 61.4%). The median ages of the study cohort, training cohort, and validation cohort were 66, 65, and 66 years, respectively. Table 1 summarizes the demographic and clinical data of the study cohort. The variables in the training cohort and validation cohort were comparable, with no statistically significant differences (P > 0.05).
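As a quick consistency check, the cohort flow reported above can be reproduced with simple arithmetic. The counts below are taken directly from the text; the variable names are illustrative only.

```python
# Counts as reported in the text above.
identified = 5437
excluded = {
    "missing data": 1681,
    "not admitted to ICU": 2576,
    "sepsis before ICU admission": 13,
}
included = identified - sum(excluded.values())

train_n, valid_n = 817, 350
train_sepsis, valid_sepsis = 385, 150
total_sepsis = train_sepsis + valid_sepsis

# Share of secondary sepsis in each cohort, in percent.
sepsis_pct = {
    "study": round(100 * total_sepsis / included, 1),
    "training": round(100 * train_sepsis / train_n, 1),
    "validation": round(100 * valid_sepsis / valid_n, 1),
}
train_share_pct = round(100 * train_n / included, 1)

print(included, total_sepsis, sepsis_pct, train_share_pct)
```

The exclusions leave 1167 patients, the two cohorts sum to that total, and the training cohort holds about 70% of patients, in line with the stated 7:3 ratio.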

Results of feature selection
Feature selection was conducted by using LASSO regression and stepwise logistic regression. Figure 2 and Table 2 show the results of variable screening by LASSO regression; the x-coordinate at the top of Fig. 2 indicates the number of variables (dummy variables). When λ was taken as the minimum value (0.02172893), 13 of the 48 variables (Tables 2, 3) passed the screening and were retained in the model (i.e., had non-zero coefficients).
To keep the model simple, these 13 variables were further screened by stepwise logistic regression based on AIC. The final results showed that AKI, anemia, invasive ventilation, GCS score, lactic acid, and serum calcium level were independent predictors of secondary sepsis in TBI patients in the ICU. In addition, patients with AKI, anemia, moderate or severe disturbance of consciousness (GCS score ≤ 12), or invasive ventilation had a higher risk of sepsis.
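For reference, the two selection criteria named above have standard textbook forms, written here in LaTeX. The notation is introduced for illustration and does not appear in the source: N patients, binary outcome y_i, predictor vector x_i, k estimated parameters, and maximized likelihood \hat{L}. The LASSO objective is shown in the form documented for glmnet with a binary outcome.

```latex
% LASSO-penalized logistic regression (L1 penalty with tuning parameter \lambda)
\min_{\beta_0,\,\beta}\;
  -\frac{1}{N}\sum_{i=1}^{N}
  \Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
        - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
  + \lambda \lVert \beta \rVert_1

% Akaike Information Criterion used for stepwise selection
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

The L1 penalty shrinks some coefficients exactly to zero, which is why only a subset of the candidate variables is retained at a given λ; stepwise selection then adds or removes variables as long as doing so lowers the AIC.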

Construction and validation of nomogram
Based on the screened features, a logistic regression model was constructed, and a nomogram was plotted (Fig. 3). Total points are obtained by adding the scores of each variable in the nomogram, and the value on the "Sepsis Risk" axis that corresponds to the total points is the predicted probability of secondary sepsis in the patient. The ROC curve, calibration curve, and decision curve were plotted to verify the model. ROC curve analysis showed that the AUC was 0.756 in the training cohort and 0.711 in the validation cohort (Fig. 4), indicating that the model had good discrimination ability. In the calibration curve, the y-coordinate indicates the actual incidence in the study cohort, and the x-coordinate indicates the probability estimated by the model. As shown in Fig. 5, the estimated probabilities agree closely with the actual values, suggesting good consistency. In the clinical decision curve, the gray diagonal line represents intervention in all patients, the gray horizontal line represents intervention in no patients, and the red (Fig. 6A) and blue (Fig. 6B) curves indicate the clinical benefit of the nomogram in the training cohort and validation cohort, respectively. As shown in Fig. 6, our model has considerable net benefit in both cohorts.
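The two summary measures used above, the AUC and the net benefit plotted in a decision curve, can be illustrated with a minimal Python sketch. It uses the standard rank-based definition of the AUC and the usual net-benefit formula from decision curve analysis; the function names and the eight predicted risks are hypothetical toy values, not study data or the authors' code.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case has a
    higher score than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def net_benefit(scores, labels, threshold):
    """Net benefit of intervening on patients with predicted risk >= threshold:
    true positives per patient minus weighted false positives per patient."""
    n = len(labels)
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Reference strategy: intervene on every patient (the diagonal line)."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical predicted risks and observed outcomes for eight patients.
risks = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]

toy_auc = auc(risks, outcomes)
toy_nb = net_benefit(risks, outcomes, threshold=0.5)
toy_nb_all = net_benefit_treat_all(outcomes, threshold=0.5)
print(toy_auc, toy_nb, toy_nb_all)
```

A model is useful at a given threshold when its net benefit exceeds both reference strategies: intervening on all patients and intervening on none, whose net benefit is zero by definition.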

Discussion
The subjects of this study were TBI patients admitted to the ICU. Based on these patients' demographic information, laboratory test indicators, and complications, a nomogram for predicting the risk of secondary sepsis in TBI patients was plotted. The results of feature selection showed that AKI, anemia, invasive ventilation, GCS score, lactic acid, and serum calcium level were important predictors of secondary sepsis in TBI patients in the ICU. The AUC of the nomogram model based on these variables was greater than 0.7 in both the training and validation cohorts, indicating favorable prediction accuracy.
Sepsis is a life-threatening organ dysfunction caused by a dysregulated host response to infection [3]. The severity of a patient's condition can be assessed in a standardized manner by using the Sequential Organ Failure Assessment (SOFA), which can reveal the direct relation between sepsis and mortality [13,14]. It is estimated that [...]
Abbreviations: AKI, acute kidney injury; INR, international normalized ratio; PT, prothrombin time; PTT, activated partial thromboplastin time; MCH, mean corpuscular hemoglobin; MCHC, mean corpuscular hemoglobin concentration; MCV, mean corpuscular volume; RBC, red blood cell; RDW, red blood cell distribution width; WBC, white blood cells; GCS, Glasgow Coma Scale; NGT, nasogastric tube
[...] to predict the occurrence of sepsis, and the AUC of this model can reach 0.91 [21]. Alireza Rafiei et al. developed a sepsis prediction model by using a convolutional neural network, which included the onset time of sepsis, with an AUC greater than 0.8 [22]. However, machine learning models have a "black box" effect, because they neither clearly show the prediction process nor quantify the predictive contribution of each index. As such, clinicians may not fully trust these prediction results [23]. Many previous studies have used ICD codes as the diagnostic criteria for sepsis [20], but this practice may produce unreliable results [24,25]. Therefore, in this study we took a specific population (i.e., TBI patients) as subjects and predicted their risk of sepsis onset by using the up-to-date international consensus as the diagnostic criteria. In addition, we constructed a nomogram model based on stepwise logistic regression. As a visualized model, a nomogram can quantify the influence of each predictor on the outcome and offer practical explanations [26]. The nomogram model is simple and applicable and facilitates better and more efficient clinical decisions.
Our results showed that AKI, anemia, invasive ventilation, GCS score, lactic acid, and serum calcium level were significant predictors of secondary sepsis in TBI patients in the ICU. Furthermore, TBI patients with AKI have a higher risk of sepsis. In TBI patients, post-traumatic sympathetic nervous system activation, increased plasma catecholamine levels, elevated systolic blood pressure, low blood volume, cytokine cascade reactions, and osmotic therapy for intracranial hypertension all increase the risk of kidney injury [27,28]. The changes in intrathoracic pressure related to mechanical ventilation disrupt systemic hemodynamics, resulting in biological damage such as decreased glomerular filtration rate, decreased creatinine clearance, and apoptosis of renal epithelial cells [29]. AKI leads to metabolic dysfunction such as electrolyte and acid-base disorders, thus impairing neutrophil function and weakening the patient's ability to clear infection [29]. Animal experiments have confirmed that the pulmonary recruitment of neutrophils from subjects with renal insufficiency is significantly weaker than that of normal neutrophils [30]. The insufficiency [...]
The risk of acute lung injury and infection in TBI patients is increased owing to post-traumatic autoimmune and lung immune damage, neurogenic pulmonary edema, and impaired lung protective mechanisms following disturbance of consciousness [31]. Studies have shown that 20-25% of TBI patients have respiratory failure, which is related to an increased oxygen demand or a reduced ratio of arterial oxygen partial pressure to the fraction of inspired oxygen (PaO2/FiO2 < 300) [26]. Ventilator-associated pneumonia (VAP) is one of the most common complications in TBI patients, with an incidence ranging from 23 to 60%. Tracheal intubation, tracheotomy, and ventilatory support increase the incidence of VAP [32]. The use of antibiotics increases the likelihood of infection with drug-resistant bacteria, thereby resulting in a higher risk of secondary sepsis in TBI patients.
Lactic acid is a commonly used biological marker for the diagnosis and prognosis of sepsis [33] and serves as a sign of tissue hypoxia. In sepsis patients with normal blood pressure, a lactic acid level above 4 mmol/L is independently associated with higher mortality. Patients with moderate hyperlactatemia (2-4 mmol/L), or even with high values within the normal range (1.4-2.3 mmol/L), have a worse prognosis than those with normal lactic acid [34]. An acidic extracellular environment reduces myocardial contractility, cardiac output, blood pressure, and tissue perfusion, thereby leading to arrhythmia and weakening the cardiovascular response to catecholamines [35], while high-dose catecholamines aggravate hyperlactatemia by reducing tissue perfusion or over-stimulating β2-adrenergic receptors. Therefore, tissue perfusion should be restored in the early stage of hyperlactatemia to prevent further progression of the disease.
Owing to hemodilution from intravenous fluid resuscitation and to traumatic bleeding (in the preoperative and perioperative periods), anemia is very common in TBI patients, especially in those with moderate or severe TBI [36]. Anemia also aggravates tissue hypoxia and makes acute bacterial infection, especially Gram-positive bacterial infection, more likely [37]. Higher blood oxygen saturation and hemoglobin levels and a lower lactic acid level can significantly reduce the risk of death in patients with sepsis or septic shock [17].
Moreover, our results show that disturbance of consciousness is one of the risk factors for sepsis, which is consistent with our hypothesis. TBI patients face an increased risk of HAP because of altered mental state, dysphagia, vomiting, and impaired cough reflex and secretion clearance [31]. Lower GCS scores are also associated with the incidence of VAP in TBI patients [32,38], which may be because patients with moderate or severe TBI require open surgery and respiratory support and are susceptible to urinary tract infection. A single-center prospective cohort study including 900 patients found that lower GCS scores and higher APACHE (Acute Physiology and Chronic Health Evaluation) II scores are independent risk factors for secondary sepsis in TBI patients after surgery [39].
Furthermore, the blood calcium level is included in many machine learning models for sepsis prediction [20]. Critical illness itself is correlated with decreased serum total calcium and ionized calcium levels, and hypocalcemia worsens as infection becomes more severe [40], which may be due to the increased sensitivity of parathyroid cells to blood calcium concentration [41]. This indicates a role for blood calcium levels in predicting the risk of sepsis among infected patients.
With an AUC of greater than 0.7, our prediction model demonstrated favorable prediction efficiency and filled the gap in tools for predicting sepsis in TBI patients admitted to the ICU.Our prediction model enables clinicians to identify the risk of secondary sepsis in TBI patients at an early stage and develop targeted treatment plans according to risk factors, thus reducing the incidence of sepsis and improving the prognosis of patients.
However, this study has several limitations. First, our model has not been validated in an external cohort, and we will carry out further research to address this in the future. Second, because of the limited types of variables in the public database, some variables of interest, such as cerebrospinal fluid examination and brain imaging data, were not included in the study.

Conclusion
AKI, anemia, invasive ventilation, GCS score, lactic acid, and serum calcium level are significant predictors of secondary sepsis in TBI patients in the ICU. We have developed a nomogram model for predicting secondary sepsis in TBI patients admitted to the ICU. The model has favorable prediction performance and can provide useful predictive information for clinical decision-making.

Fig. 1 Flowchart of the study

Fig. 3 The predictive nomogram for the incidence of sepsis in patients with traumatic brain injury

Fig. 4 The results of ROC curve analysis in the training set and the validation set

Table 1 Characteristics of the patients

Table 2 The screening results of LASSO regression

Table 3 Multivariate regression model based on LASSO regression and stepwise logistic regression analysis results