Methods and reliability of radiographic vertebral fracture detection in older men: the Osteoporotic Fractures in Men Study.
Peggy M Cawthon, Jane Haslam, Robin Fullman, Katherine W Peters, Dennis Black, Kristine E Ensrud, Steven R Cummings, Eric S Orwoll, Elizabeth Barrett-Connor, Lynn Marshall, Peter Steiger, John T Schousboe, Osteoporotic Fractures in Men (MrOS) Research Group
Author Information
Peggy M Cawthon: California Pacific Medical Center Research Institute, USA. Electronic address: pcawthon@sfcc-cpmc.net.
Jane Haslam: Optasia Medical, UK.
Robin Fullman: California Pacific Medical Center Research Institute, USA.
Katherine W Peters: California Pacific Medical Center Research Institute, USA.
Dennis Black: University of California, San Francisco, USA.
Kristine E Ensrud: University of Minnesota, USA; Minneapolis VA Health System, USA.
Steven R Cummings: California Pacific Medical Center Research Institute, USA.
Eric S Orwoll: Oregon Health and Science University, USA.
Elizabeth Barrett-Connor: University of California, San Diego, USA.
Lynn Marshall: Oregon Health and Science University, USA.
Peter Steiger: Optasia Medical, UK.
John T Schousboe: Park Nicollet Institute for Research and Education, Division of Health Policy and Management, University of Minnesota, USA.
We describe the methods and reliability of radiographic vertebral fracture assessment in MrOS, a cohort of community-dwelling men aged ≥65 years. Lateral spine radiographs were obtained at Visit 1 (2000-2002) and 4.6 years later (Visit 2). Using a workflow tool (SpineAnalyzer™, Optasia Medical), a physician reader completed semi-quantitative (SQ) scoring. Prior to SQ scoring, technicians performed "triage" to reduce physician reader workload, whereby clearly normal spine images were eliminated from SQ scoring with all levels assumed to be SQ=0 (no fracture, "triage negative"); spine images with any possible fracture or abnormality were passed to the physician reader as "triage positive" images. Using a quality assurance sample of images (n=20 participants; 8 with baseline only and 12 with baseline and follow-up images) read multiple times, we calculated intra-reader kappa statistics and percent agreement for SQ scores. A subset of 494 participants' images was read regardless of triage classification to calculate the sensitivity and specificity of triage. Technically adequate images were available for 5958 of 5994 participants at Visit 1, and 4399 of 4423 participants at Visit 2. Triage identified 3215 (53.9%) participants with radiographs that required further evaluation by the physician reader. For prevalent fractures at Visit 1 (SQ≥1), intra-reader kappa statistics ranged from 0.79 to 0.92; percent agreement ranged from 96.9% to 98.9%; sensitivity of triage was 96.8% and specificity of triage was 46.3%. In conclusion, SQ scoring had excellent intra-rater reliability in our study. The triage process reduced expert reader workload without hindering the ability to identify vertebral fractures.
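The reliability and triage measures named in the abstract (intra-reader kappa, percent agreement, sensitivity, specificity) can be illustrated with a minimal Python sketch. It is not the study's analysis code. It assumes SQ scores have been dichotomized to fracture (SQ≥1) versus no fracture, it uses the standard unweighted Cohen's kappa (the abstract does not say which kappa variant was used), and all function names and example values are hypothetical.

```python
from collections import Counter


def percent_agreement(read_a, read_b):
    """Percentage of items given the same label in two reads."""
    if len(read_a) != len(read_b) or not read_a:
        raise ValueError("reads must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(read_a, read_b))
    return 100.0 * matches / len(read_a)


def cohen_kappa(read_a, read_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    if len(read_a) != len(read_b) or not read_a:
        raise ValueError("reads must be non-empty and of equal length")
    n = len(read_a)
    p_observed = sum(a == b for a, b in zip(read_a, read_b)) / n
    counts_a, counts_b = Counter(read_a), Counter(read_b)
    labels = set(counts_a) | set(counts_b)
    # Chance agreement from the marginal label frequencies of each read.
    p_expected = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both reads used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


def sensitivity_specificity(triage_positive, fracture_present):
    """Sensitivity and specificity of triage against the physician read."""
    pairs = list(zip(triage_positive, fracture_present))
    tp = sum(t and f for t, f in pairs)
    fn = sum((not t) and f for t, f in pairs)
    tn = sum((not t) and (not f) for t, f in pairs)
    fp = sum(t and (not f) for t, f in pairs)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one fracture and one non-fracture")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: 1 = fracture (SQ>=1), 0 = no fracture (SQ=0).
first_read = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
second_read = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(percent_agreement(first_read, second_read))
print(cohen_kappa(first_read, second_read))

# Hypothetical triage results versus physician-read fracture status.
triage = [True, True, True, False, False, True, True, False]
fracture = [True, True, False, True, False, True, False, False]
print(sensitivity_specificity(triage, fracture))
```

In this framing, high sensitivity means few fractures are lost among triage-negative images, while low specificity means many fracture-free images are still passed to the physician reader.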