EPV097/#140 Application of a machine learning algorithm to identify predictors of recurrence and recurrence free survival in high grade endometrial cancer
  1. S Piedimonte1,
  2. T Feigenberg2,
  3. B Cormier3,
  4. J Kwon4,
  5. W Gotlieb5,
  6. M Plante6,
  7. S Lau5,
  8. L Helpman7,
  9. MC Renaud6,
  10. T May8 and
  11. D Vicus9
  1. 1University of Toronto, Gynecologic Oncology, Toronto, Canada
  2. 2Trillium Health Partners, Gynecologic Oncology, Mississauga, Canada
  3. 3Centre hospitalier de l’Université de Montréal, Gynecologic Oncology, Montreal, Canada
  4. 4Vancouver General Hospital, Gynecologic Oncology, Vancouver, Canada
  5. 5McGill University, Jewish General Hospital, Gynecology Oncology, Montreal, Canada
  6. 6Hotel Dieu de Quebec, Gynecology Oncology, Quebec, Canada
  7. 7Juravinski Cancer Center, Gynecologic Oncology, Hamilton, Canada
  8. 8Princess Margaret Cancer Centre/University Health Network/Sinai Health System, Gynecologic Oncology, Toronto, Canada
  9. 9Sunnybrook Health Sciences Centre, Gynecologic Oncology, Toronto, Canada


Objectives To train several machine learning algorithms to predict recurrence and recurrence-free survival (RFS) in high-grade endometrial cancer (HGEC).

Methods Data were collected retrospectively from 1237 patients across 8 Canadian centers and divided arbitrarily into training (50%), validation (25%) and test (25%) sets. Four models were trained to predict recurrence: a random forest, boosted trees, and 2 neural networks. Receiver operating characteristic (ROC) curves were used to assess model performance, and the best model was selected as the one with the highest area under the curve (AUC) in the test set. For time to recurrence, we trained a random forest model and a Lasso model and compared them with a Cox proportional hazards model. Concordance was reported using the c-statistic.
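The split and the AUC criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the abstract does not say how the split was performed, so a seeded random shuffle is assumed, and the function names `split_indices` and `roc_auc` as well as the toy labels and scores are hypothetical.

```python
import random


def split_indices(n, fractions=(0.5, 0.25, 0.25), seed=0):
    """Shuffle range(n) and cut it into train/validation/test index lists.

    Assumption: a seeded random shuffle; the abstract only says the
    division was made "arbitrarily".
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * fractions[0])
    n_val = int(n * fractions[1])
    # The test set takes whatever remains after rounding down the first two.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Cohort size taken from the abstract.
train, val, test = split_indices(1237)
print(len(train), len(val), len(test))  # 618 309 310

# Toy example (made-up values): 1 = recurrence, 0 = no recurrence.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Model selection would then amount to computing `roc_auc` for each candidate on the held-out set and keeping the highest value.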

Results Among the 4 models tested, the bootstrap random forest had the highest AUC in the test set and was the best model for predicting recurrence in HGEC; its AUCs were 85.2%, 74.1% and 71.8% in the training, validation and test sets, respectively. The top 5 predictors were stage, uterus height, specimen weight, adjuvant chemotherapy and pre-operative histology. When stratified by stage, the AUC in the test set increased to 77% for Stage III and 80% for Stage IV. For time to recurrence, there was no difference between the Lasso and Cox proportional hazards models (test set c-index 71%), while the random forest had a c-index of 60.5%.
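For readers unfamiliar with the c-index, the following Python sketch shows one common way to compute it. It assumes Harrell's definition, which the abstract does not specify; the function name `c_index` and the toy follow-up data are hypothetical, and pairs with tied times are simply skipped.

```python
def c_index(times, events, risks):
    """Harrell's concordance index (assumed definition).

    A pair (i, j) is comparable when subject i has the shorter follow-up
    time and experienced the event (events[i] == 1). The pair is concordant
    when the model assigns subject i the higher risk score; tied risk
    scores count as 0.5. Pairs with tied times are ignored in this sketch.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made-up): months to recurrence or censoring,
# event flag (1 = recurrence observed, 0 = censored), predicted risk.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.1, 0.5]
print(c_index(times, events, risks))  # 0.75
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported 71% and 60.5% sit between the two.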

Conclusions A bootstrap random forest model best predicted recurrence in HGEC; prediction improved further in Stage III and IV patients. Machine learning survival models performed similarly to Cox proportional hazards but could be conducted with greater efficiency.
