Objectives Response assessment using Response Evaluation Criteria In Solid Tumors (RECIST) relies on reproducible unidimensional tumor measurements. This study assessed intraobserver and interobserver variability of target lesion selection and measurement according to RECIST version 1.1 in patients with ovarian cancer.
Methods Eight international radiologists independently viewed 47 images demonstrating malignant lesions in patients with ovarian cancer, and selected and measured lesions according to RECIST V.1.1. Thirteen images were viewed twice. Interobserver variability of selection and measurement was calculated for all images. Intraobserver variability of selection and measurement was calculated for the images viewed twice. Lesions were classified according to their anatomical site as pulmonary, hepatic, pelvic mass, peritoneal, lymph nodal, or other. Lesion selection variability was assessed by calculating the reproducibility rate. Lesion measurement variability was assessed with the intraclass correlation coefficient.
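As an illustration of the measurement analysis described above, the following is a minimal sketch of an intraclass correlation coefficient computed from a lesions-by-readers table. The abstract does not state which form of the coefficient was used; this sketch assumes the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)). The function name `icc_2_1` and the measurement values are hypothetical and are not taken from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per lesion and one column per reader.
    Assumption: every reader measured every lesion (no missing cells).
    """
    n = len(ratings)       # number of lesions
    k = len(ratings[0])    # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between lesions
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between readers
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical longest-diameter measurements in mm (4 lesions, 3 readers).
example_mm = [
    [22.0, 24.0, 21.0],
    [35.0, 33.0, 36.0],
    [12.0, 15.0, 11.0],
    [48.0, 47.0, 50.0],
]
print(round(icc_2_1(example_mm), 3))
```

Values near 1 indicate that most of the variation in the table comes from true differences between lesions rather than from differences between readers or residual error.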
Results From 47 images, 82 distinct lesions were identified. For lesion selection, the interobserver and intraobserver reproducibility rates were high, at 0.91 and 0.93, respectively. Interobserver selection reproducibility was highest (reproducibility rate 1) for pelvic mass and other lesions. Intraobserver selection reproducibility was highest (reproducibility rate 1) for pelvic mass, hepatic, nodal, and other lesions. Selection reproducibility was lowest for peritoneal lesions (interobserver reproducibility rate 0.76 and intraobserver reproducibility rate 0.69). For lesion measurement, the overall interobserver and intraobserver intraclass correlation coefficients showed very good concordance, at 0.84 and 0.94, respectively. The interobserver intraclass correlation coefficient showed very good concordance for hepatic, pulmonary, peritoneal, and other lesions (range 0.84 to 0.97), but only moderate concordance for lymph node lesions (0.58). The intraobserver intraclass correlation coefficient showed very good concordance for all lesions, ranging from 0.82 to 0.99. Overall, 85% of the total measurement variability resulted from interobserver measurement differences.
Conclusions Our study showed that although selection and measurement concordance were high, there was significant interobserver and intraobserver variability, most of which was interobserver. Compared with other lesions, peritoneal lesions had the lowest selection reproducibility, and lymph node lesions had the lowest measurement concordance. These factors need to be considered to improve response assessment, especially as progression-free survival remains the most common endpoint in phase III trials.
- Ovarian Cancer
Data availability statement
No data are available. The data are included in the analysis in the study.
Presented at This study was presented virtually at ANZGOG in 2021 and in part at ASCO 2018.
Contributors TC, SL, MKW, LW, and AMO contributed to the design of the trial. DM, LB, HM, MO, AK, AH, and CM contributed to the study assessments. H-WS, MK, and LW contributed to the statistical analysis. All authors contributed to and reviewed the manuscript and approved it for submission. MW acts as guarantor for the study.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial, or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.