Journal of International Oncology, 2026, Vol. 53, Issue 4: 213-218. doi: 10.3760/cma.j.cn371439-20250626-00035

• Original Article •

Construction of a prediction model for postoperative acute respiratory failure risk in patients with esophageal cancer based on SMOTE algorithm

Xie Wenjuan, Zhu Yuan, Xu Jing

  1. Department of Thoracic Surgery, First Affiliated Hospital with Nanjing Medical University (Jiangsu Province Hospital), Nanjing 210003, China
  • Received: 2025-06-26 Online: 2026-04-08 Published: 2026-04-01
  • Contact: Zhu Yuan, Email: zyxwk3235@163.com
  • Supported by:
    Clinical Research Fund of Hengrui Pharmaceuticals, Tumor Individualized Medicine Collaborative Innovation Center Co-constructed by the Ministry and Province of Nanjing Medical University (NJMI-CCICO〔2024〕4)

Abstract:

Objective To construct a prediction model for the risk of postoperative acute respiratory failure (ARF) in patients with esophageal cancer based on the synthetic minority oversampling technique (SMOTE) algorithm. Methods The clinical data of 450 patients who underwent surgery for esophageal cancer at Jiangsu Province Hospital from March 2023 to March 2025 were retrospectively analyzed. According to whether ARF occurred after surgery, the patients were divided into an ARF group (45 cases) and a non-ARF group (405 cases). The clinical data of the two groups were compared, and multivariate logistic regression analysis was used to identify factors influencing postoperative ARF in patients with esophageal cancer. A logistic regression prediction model was constructed, and a second prediction model for postoperative ARF was constructed based on the SMOTE algorithm. The explanatory power of the model was assessed using Cox-Snell R², and its predictive efficacy was evaluated using the receiver operating characteristic (ROC) curve. Results There were statistically significant differences in smoking history, surgery duration, anastomotic leakage, postoperative sputum obstruction, postoperative pneumonia, and postoperative mechanical ventilation duration between the ARF group and the non-ARF group (all P<0.05). Multivariate analysis showed that smoking history (OR=3.57, 95%CI: 1.60-7.97, P=0.002), surgery duration >4 h (OR=2.89, 95%CI: 1.49-5.98, P=0.002), anastomotic leakage (OR=3.09, 95%CI: 1.04-9.17, P=0.042), postoperative sputum obstruction (OR=2.69, 95%CI: 1.34-5.41, P=0.005), postoperative pneumonia (OR=2.61, 95%CI: 1.24-5.50, P=0.011), and postoperative mechanical ventilation duration >48 h (OR=4.26, 95%CI: 1.68-10.80, P=0.002) were all independent risk factors for postoperative ARF in patients with esophageal cancer. The prediction model based on the SMOTE algorithm was logit(P)=-4.74+2.90×smoking history+2.52×surgery duration+1.69×anastomotic leakage+1.51×postoperative sputum obstruction+1.49×postoperative pneumonia+1.88×postoperative mechanical ventilation duration. The Cox-Snell R² was 0.537 (χ²=118.34, df=6, P<0.001), indicating that the prediction model based on the SMOTE algorithm had good explanatory power. ROC curve analysis showed that the area under the curve of the prediction model based on the SMOTE algorithm for predicting postoperative ARF was 0.835 (95%CI: 0.803-0.866), which was higher than that of the logistic regression prediction model (area under the curve 0.783, 95%CI: 0.712-0.854; Z=2.35, P=0.019). Conclusions The risk prediction model for ARF after esophageal cancer surgery based on the SMOTE algorithm has good predictive efficacy and can be used to identify patients at high risk of ARF after esophageal cancer surgery.
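As an illustration of how the reported equation turns a patient's risk factors into a probability, the following minimal sketch applies the inverse logit to the intercept and coefficients given in the abstract. The 0/1 coding of each factor (1 = present, 0 = absent) and all variable names are assumptions made for this example; the abstract does not state how the predictors were coded.

```python
import math

# Intercept and coefficients as reported in the abstract for the SMOTE-based model.
INTERCEPT = -4.74
COEFFICIENTS = {
    "smoking_history": 2.90,
    "surgery_duration_gt_4h": 2.52,
    "anastomotic_leakage": 1.69,
    "sputum_obstruction": 1.51,
    "postoperative_pneumonia": 1.49,
    "mechanical_ventilation_gt_48h": 1.88,
}


def predicted_arf_probability(patient):
    """Return P from logit(P) = intercept + sum(coefficient * indicator).

    `patient` maps factor names to 0 or 1 (assumed coding); missing
    factors are treated as absent.
    """
    logit = INTERCEPT + sum(
        coef * patient.get(name, 0) for name, coef in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical patients, for illustration only.
no_factors = predicted_arf_probability({})
smoker_only = predicted_arf_probability({"smoking_history": 1})
all_factors = predicted_arf_probability({name: 1 for name in COEFFICIENTS})
```

Because the coefficients were estimated on SMOTE-balanced data, the resulting values are probably best read as a relative risk ranking rather than as the absolute incidence expected in the original cohort, where 45 of 450 patients developed ARF.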

Key words: Esophageal neoplasms, Respiratory insufficiency, Proportional hazards models, SMOTE algorithm
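For readers unfamiliar with SMOTE, the core idea can be sketched in a few lines: each synthetic minority sample is placed at a random point on the line segment between a real minority sample and one of its nearest minority-class neighbours. The sketch below is a simplified, standard-library illustration of that idea and is not the authors' implementation; the function name, the parameter `k`, and the toy data are hypothetical. Plain SMOTE interpolates continuous values, so how the study handled its binary predictors is not something the abstract specifies.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic samples by interpolating between minority samples.

    minority: list of equal-length numeric feature vectors (minority class only).
    k: number of nearest minority neighbours considered for each base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [s for j, s in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda s: math.dist(base, s))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority-class data (hypothetical values, two continuous features).
toy_minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
toy_synthetic = smote(toy_minority, n_synthetic=6, k=2)
```

With the class sizes reported in the abstract (45 ARF versus 405 non-ARF), fully balancing the two groups would require 360 synthetic ARF samples; whether the study targeted a 1:1 ratio is an assumption here, since the abstract does not give the resampling ratio.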