Journal of International Oncology ›› 2024, Vol. 51 ›› Issue (5): 267-273. doi: 10.3760/cma.j.cn371439-20230621-00045

• Original Article •

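Construction of a prognostic model for lung infection in esophageal cancer patients receiving chemoradiotherapy, based on the SMOTE algorithm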

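Liu Jing, Liu Qin, Huang Mei

  1. Department of Integrated Chinese and Western Medicine, Nanchong Central Hospital of Sichuan Province, Nanchong 637000, China
  • Received: 2023-06-21 Revised: 2024-02-07 Online: 2024-05-08 Published: 2024-06-26
  • Corresponding author: Liu Jing, Email: ljhjvjzx@163.com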

Prognostic model construction of lung infection in patients with chemoradiotherapy for esophageal cancer based on SMOTE algorithm

Liu Jing, Liu Qin, Huang Mei

  1. Department of Integrated Chinese and Western Medicine, Nanchong Central Hospital of Sichuan Province, Nanchong 637000, China
  • Received: 2023-06-21 Revised: 2024-02-07 Online: 2024-05-08 Published: 2024-06-26
  • Contact: Liu Jing, Email: ljhjvjzx@163.com

Abstract:

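Objective To identify independent risk factors for lung infection in patients with esophageal cancer receiving chemoradiotherapy, and to build an individualized prediction model based on the synthetic minority oversampling technique (SMOTE) algorithm. Methods A total of 197 patients with esophageal cancer who underwent concurrent chemoradiotherapy at Nanchong Central Hospital of Sichuan Province between January 2016 and March 2022 were enrolled. Patients were divided into an infected group (n=23) and an uninfected group (n=174) according to whether lung infection occurred during treatment. Univariate analysis and binary logistic regression were used to screen for independent risk factors for lung infection, and a logistic regression model (P1) was established. The dataset was also augmented with the SMOTE algorithm, a prediction model (P2) was built on the augmented dataset, and receiver operating characteristic (ROC) curves were used to compare the predictive performance of the two models. Results The incidence of lung infection among the 197 patients was 11.7% (23/197). Univariate analysis showed statistically significant differences between the infected and uninfected groups in age (t=3.53, P=0.001), the proportion of patients with a smoking index ≥200 cigarette-years (χ²=7.64, P=0.006), the proportion with radiation lung injury (χ²=5.41, P=0.020), the proportion with diabetes mellitus (χ²=6.71, P=0.010), the proportion with chronic obstructive pulmonary disease (χ²=3.92, P=0.048), and the ratio of forced expiratory volume in one second to forced vital capacity (FEV1/FVC) (t=3.93, P<0.001). Multivariate logistic regression showed that increasing age (OR=1.09, 95%CI: 1.02-1.16, P=0.008), decreased FEV1/FVC (OR=0.92, 95%CI: 0.87-0.98, P=0.005), diabetes mellitus (OR=3.29, 95%CI: 1.22-8.91, P=0.019), smoking index ≥200 cigarette-years (OR=4.02, 95%CI: 1.42-11.41, P=0.009), and radiation lung injury (OR=4.75, 95%CI: 1.26-17.85, P=0.021) were independent risk factors for lung infection during concurrent chemoradiotherapy in patients with esophageal cancer. The probability prediction model was logit(P1)=-2.760+0.084×age-0.081×FEV1/FVC+1.191×diabetes+1.392×smoking index+1.558×radiation lung injury. The prediction model on the augmented dataset was logit(P2)=-1.544-0.127×age-0.115×FEV1/FVC+1.599×diabetes+1.434×smoking index+1.748×radiation lung injury. ROC curve analysis showed that the sensitivities of the P1 and P2 models were 0.826 and 0.897, the specificities were 0.747 and 0.793, and the Youden indices were 0.573 and 0.690, respectively. The area under the curve of the P2 model was 0.903 (95%CI: 0.872-0.934), significantly higher than the 0.843 (95%CI: 0.763-0.923) of the P1 model (Z=13.23, P=0.002). Conclusion Increasing age, decreased FEV1/FVC, smoking index ≥200 cigarette-years, diabetes mellitus, and radiation lung injury are closely associated with lung infection during concurrent chemoradiotherapy in patients with esophageal cancer. The individualized prediction model built with the SMOTE algorithm markedly improves the prediction of lung infection in these patients.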
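The abstract does not describe how SMOTE was configured, so the following is only a minimal sketch of the general idea: each synthetic minority-class sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority-class neighbours. The function name `smote`, the parameters `k` and `seed`, and the toy (age, FEV1/FVC) values are illustrative assumptions, not the authors' implementation or data.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the basic SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Made-up (age, FEV1/FVC) pairs for illustration only; NOT study data.
toy_minority = [[68.0, 70.0], [72.0, 65.0], [75.0, 62.0], [66.0, 73.0]]
new_points = smote(toy_minority, n_synthetic=6, k=2)
```

Plain interpolation of this kind yields fractional values for binary predictors such as diabetes status; variants such as SMOTE-NC exist for mixed continuous and categorical data. Which variant, if any, was used in the study is not stated in the abstract.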

Key words: Esophageal neoplasms, Pneumonia, Risk factors

Abstract:

Objective To identify independent risk factors for lung infection in patients with esophageal cancer treated with chemoradiotherapy, and to establish an individualized early warning model based on the synthetic minority oversampling technique (SMOTE) algorithm. Methods A total of 197 patients with esophageal cancer who received concurrent chemoradiotherapy at Nanchong Central Hospital of Sichuan Province from January 2016 to March 2022 were enrolled as study subjects. Patients were divided into an infected group (n=23) and an uninfected group (n=174) according to whether they developed lung infection during treatment. Clinical data were collected for both groups. Independent risk factors for lung infection were screened using univariate analysis and binary logistic regression, and a logistic regression model (P1) was established. An early warning model (P2) was constructed on the dataset augmented with the SMOTE algorithm, and the predictive performance of the two models was compared using receiver operating characteristic (ROC) curves. Results The incidence of lung infection among the 197 patients was 11.7% (23/197). Univariate analysis showed statistically significant differences between the infected and uninfected groups in age (t=3.53, P=0.001), the proportion of patients with a smoking index ≥200 cigarette-years (χ²=7.64, P=0.006), the proportion with radiation lung injury (χ²=5.41, P=0.020), the proportion with comorbid diabetes mellitus (χ²=6.71, P=0.010), the proportion with chronic obstructive pulmonary disease (χ²=3.92, P=0.048), and the ratio of forced expiratory volume in one second to forced vital capacity (FEV1/FVC) (t=3.93, P<0.001). Multivariate logistic regression showed that increasing age (OR=1.09, 95%CI: 1.02-1.16, P=0.008), decreased FEV1/FVC (OR=0.92, 95%CI: 0.87-0.98, P=0.005), diabetes mellitus (OR=3.29, 95%CI: 1.22-8.91, P=0.019), smoking index ≥200 cigarette-years (OR=4.02, 95%CI: 1.42-11.41, P=0.009), and radiation lung injury (OR=4.75, 95%CI: 1.26-17.85, P=0.021) were independent risk factors for lung infection during concurrent chemoradiotherapy in patients with esophageal cancer. The probability prediction model was logit(P1)=-2.760+0.084×age-0.081×FEV1/FVC+1.191×diabetes+1.392×smoking index+1.558×radiation lung injury. The early warning model was logit(P2)=-1.544-0.127×age-0.115×FEV1/FVC+1.599×diabetes+1.434×smoking index+1.748×radiation lung injury. ROC curve analysis showed that the sensitivities of the P1 and P2 models were 0.826 and 0.897, the specificities were 0.747 and 0.793, and the Youden indices were 0.573 and 0.690, respectively. The area under the curve of the P2 model was 0.903 (95%CI: 0.872-0.934), significantly higher than the 0.843 (95%CI: 0.763-0.923) of the P1 model (Z=13.23, P=0.002). Conclusion Increasing age, decreased FEV1/FVC, smoking index ≥200 cigarette-years, diabetes mellitus, and radiation lung injury are strongly associated with lung infection during concurrent chemoradiotherapy in patients with esophageal cancer. The individualized early warning model established with the SMOTE algorithm significantly improves the prediction of lung infection in these patients.
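As a reading aid, the reported logit(P1) equation can be turned into a predicted probability with the inverse logit, and the reported odds ratios and Youden indices can be reproduced from the coefficients, sensitivities, and specificities. The sketch below assumes that age is in years, that FEV1/FVC is expressed as a percentage, and that the three comorbidity and exposure terms are coded 0/1. None of these codings is stated in the abstract, and the function and variable names are illustrative.

```python
import math

# Intercept and coefficients of logit(P1), copied from the abstract.
P1_INTERCEPT = -2.760
P1_COEF = {
    "age": 0.084,            # assumed unit: years
    "fev1_fvc": -0.081,      # assumed unit: percent
    "diabetes": 1.191,       # assumed coding: 1 = present, 0 = absent
    "smoking": 1.392,        # assumed coding: 1 = index >= 200 cigarette-years
    "lung_injury": 1.558,    # assumed coding: 1 = radiation lung injury
}


def predict_p1(age, fev1_fvc, diabetes, smoking, lung_injury):
    """Inverse-logit of the reported P1 linear predictor."""
    values = {"age": age, "fev1_fvc": fev1_fvc, "diabetes": diabetes,
              "smoking": smoking, "lung_injury": lung_injury}
    z = P1_INTERCEPT + sum(P1_COEF[name] * values[name] for name in P1_COEF)
    return 1.0 / (1.0 + math.exp(-z))


def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1."""
    return round(sensitivity + specificity - 1.0, 3)


# exp(coefficient) gives the odds ratio for each predictor; these values
# agree with the ORs reported for the multivariate analysis.
ODDS_RATIOS = {name: round(math.exp(b), 2) for name, b in P1_COEF.items()}

# Hypothetical patient for illustration only; not a case from the study.
risk_without_diabetes = predict_p1(65, 70, 0, 0, 0)
risk_with_diabetes = predict_p1(65, 70, 1, 0, 0)
```

Under these coding assumptions, the exponentiated coefficients match the published odds ratios, and sensitivity plus specificity minus one reproduces the reported Youden indices of 0.573 (P1) and 0.690 (P2).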

Key words: Esophageal neoplasms, Pneumonia, Risk factors