Journal of International Oncology ›› 2023, Vol. 50 ›› Issue (4): 220-226. doi: 10.3760/cma.j.cn371439-20221214-00043

• Original Articles •

Construction of machine learning models for predicting the risk of postoperative distant metastasis recurrence in serous ovarian cancer

Yang Lirong, Wang Yufeng

  1. Department of Geriatric Oncology, Yunnan Cancer Hospital, Third Affiliated Hospital of Kunming Medical University, Kunming 650100, China
  • Received:2022-12-14 Revised:2023-03-13 Online:2023-04-08 Published:2023-06-12
  • Contact: Wang Yufeng, Email: 13577037585@163.com

Abstract:

Objective To develop a machine learning model to predict the risk of postoperative distant metastasis recurrence in serous ovarian cancer (SOC) based on routine clinical data.

Methods The study included 687 patients with recurrent SOC who underwent surgery at Yunnan Cancer Hospital from January 2010 to December 2020. According to recurrence status, the patients were divided into a distant metastasis group (n=105) and a non-distant metastasis group (n=582). Logistic regression was used to screen the variables related to distant metastasis of SOC. Based on the selected variables, five machine learning methods, namely K-nearest neighbor (KNN), logistic regression (LR), random forest (RF), support vector machine (SVM) and extreme gradient boosting (XGBoost), were used to develop postoperative distant metastasis risk prediction models for SOC. Internal validation was performed with 10-fold cross-validation, and model performance was evaluated using the receiver operating characteristic curve.

Results There were statistically significant differences between the distant metastasis group and the non-distant metastasis group in International Federation of Gynecology and Obstetrics (FIGO) stage (Z=-3.81, P<0.001), perioperative chemotherapy cycle (t=-5.11, P<0.001), lymph node metastasis (χ²=5.98, P=0.014), peritoneal effusion cytology (Z=-2.22, P=0.026), and neoadjuvant chemotherapy (χ²=5.29, P=0.021). Multivariate regression analysis showed that FIGO stage (OR=1.54, 95%CI: 1.07-2.22, P=0.019) and perioperative chemotherapy cycle (OR=1.22, 95%CI: 0.09-0.36, P<0.001) were independent influencing factors for postoperative distant metastasis recurrence in SOC. Peritoneal effusion cytology (OR=1.20, 95%CI: 0.71-1.89, P=0.180) was not an independent influencing factor for distant metastasis of SOC, but it was ultimately included in the construction of the model because its inclusion improved the area under the curve (AUC). Among the five machine learning models constructed from these three variables, the KNN-based model performed best in identifying distant metastasis of SOC, with an AUC of 0.750, sensitivity of 0.591, specificity of 0.786, and accuracy of 85.0%. The LR model had an AUC of 0.679, sensitivity of 0.545, specificity of 0.765, and accuracy of 84.3%. The SVM model had an AUC of 0.634, sensitivity of 0.240, specificity of 0.968, and accuracy of 84.7%. The RF model had an AUC of 0.575, sensitivity of 0.905, specificity of 0.245, and accuracy of 84.7%. The XGBoost model had an AUC of 0.704, sensitivity of 0.567, specificity of 0.745, and accuracy of 84.9%.

Conclusion FIGO stage and perioperative chemotherapy cycle are independent influencing factors for postoperative distant metastasis recurrence in SOC. The KNN model based on FIGO stage, perioperative chemotherapy cycle and peritoneal effusion cytology has high discrimination and accuracy in predicting postoperative distant metastasis recurrence of SOC.
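The validation procedure named in the abstract (a KNN classifier scored by AUC under 10-fold cross-validation) can be illustrated with a minimal sketch. The abstract does not state which software or hyperparameters the authors used, so everything below is an assumption: the function names (`auc`, `knn_score`, `stratified_folds`, `cross_val_auc`, `make_synthetic`) are hypothetical, `k=5` is an arbitrary choice, features are left unscaled for brevity, and the data are randomly generated placeholders that only mimic the cohort size (687 patients, 105 with distant metastasis) and the three predictors. The AUC it prints says nothing about the study's results.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def knn_score(train_X, train_y, x, k=5):
    """Risk score: fraction of positives among the k nearest training points."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row, x))
    nearest = sorted(zip(train_X, train_y), key=lambda t: sq_dist(t[0]))[:k]
    return sum(label for _, label in nearest) / k


def stratified_folds(y, n_folds=10, seed=0):
    """Split indices into folds, keeping the class ratio roughly equal per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for label in (0, 1):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % n_folds].append(i)
    return folds


def cross_val_auc(X, y, k=5, n_folds=10, seed=0):
    """Score every sample with a model that never saw it, then compute one pooled AUC."""
    scores = [0.0] * len(y)
    for fold in stratified_folds(y, n_folds, seed):
        held_out = set(fold)
        train_X = [X[i] for i in range(len(y)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        for i in fold:
            scores[i] = knn_score(train_X, train_y, X[i], k)
    return auc(y, scores)


def make_synthetic(n=687, n_pos=105, seed=0):
    """SYNTHETIC placeholder data; the distributions are arbitrary assumptions."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i < n_pos else 0
        stage = min(4, max(1, round(rng.gauss(2.5 + 0.5 * label, 1.0))))  # FIGO stage 1-4
        cycles = max(0, round(rng.gauss(6 + 2 * label, 2.0)))  # chemotherapy cycles
        cytology = int(rng.random() < 0.4 + 0.1 * label)  # 1 = positive cytology
        X.append([stage, cycles, cytology])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    print(f"10-fold cross-validated AUC on synthetic data: {cross_val_auc(X, y):.3f}")
```

Stratifying the folds is a design choice in this sketch rather than something the abstract specifies. With only about 15% of patients in the distant metastasis group, it ensures every fold contains positive cases.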

Key words: Ovarian neoplasms, Recurrence, Neoplasm metastasis, Machine learning, Risk factors