Background: Vertebral fracture assessment (VFA) images are acquired in dual-energy (DE) or single-energy (SE) scan modes. Convolutional neural networks (CNNs) have achieved high accuracy in automatically identifying vertebral compression fractures on VFA images acquired in DE mode with GE Healthcare scanners. Because DE and SE images differ, it is uncertain whether CNNs trained on one scan mode will generalize to the other.
Purpose: To evaluate the ability of CNNs to generalize between GE DE and GE SE VFA scan modes.
Methods: A total of 12,742 GE VFA images from the Manitoba Bone Mineral Density Program, obtained between 2010 and 2017, were exported in both DE and SE modes. Imaging specialists classified each VFA as fracture present or absent using the modified algorithm-based qualitative (mABQ) method. VFA scans were randomly divided into independent training (60%), validation (10%), and test (30%) sets. Three CNN models were constructed by training separately on DE only, SE only, and a composite dataset comprising both SE and DE VFAs. Each trained model was then evaluated separately on the DE and SE test sets.
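The partitioning and the three training sets described in Methods can be sketched roughly as follows. This is a minimal illustration, not the study's actual pipeline: the function names, the fixed seed, the rounding rule, and the choice to split at the scan level (so that the DE and SE exports of one scan always fall in the same partition) are all assumptions made for the example.

```python
import random

def split_scans(scan_ids, seed=0, train_frac=0.6, val_frac=0.1):
    """Shuffle scan IDs and cut them into train/validation/test partitions.

    Hypothetical helper: the source only states a random 60/10/30 split.
    """
    ids = list(scan_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(train_frac * len(ids))
    n_val = round(val_frac * len(ids))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # remainder, about 30%
    return train, val, test

def build_dataset(scan_ids, modes):
    """Pair each scan ID with every requested scan mode ("DE" and/or "SE")."""
    return [(scan_id, mode) for scan_id in scan_ids for mode in modes]

# Placeholder integer IDs stand in for the 12,742 scans.
train_ids, val_ids, test_ids = split_scans(range(12742))

# Three training sets, one per model (assumed construction).
de_train = build_dataset(train_ids, ["DE"])
se_train = build_dataset(train_ids, ["SE"])
composite_train = build_dataset(train_ids, ["DE", "SE"])

# Two test sets; every trained model is evaluated on both.
de_test = build_dataset(test_ids, ["DE"])
se_test = build_dataset(test_ids, ["SE"])
```

Splitting on scan IDs before expanding into modes keeps the test partitions independent of the composite training set, which would not be guaranteed if DE and SE images were shuffled as separate items.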
Results: Good performance was seen for CNNs trained and evaluated on the same scan mode. DE scans used for both training and evaluation (DE/DE) achieved 87.9% sensitivity, 87.4% specificity, and an area under the receiver operating characteristic curve (AUC) of 0.94. SE scans used for both training and evaluation (SE/SE) achieved 78.6% sensitivity, 90.6% specificity, and an AUC of 0.92. Conversely, CNNs performed poorly when evaluated on a scan mode that differed from their training set (AUC = 0.58). However, a composite CNN trained simultaneously on both SE and DE VFAs gave performance on DE scans comparable to DE/DE (82.4% sensitivity, 94.3% specificity, AUC = 0.95) and improved performance on SE scans over SE/SE (82.2% sensitivity, 92.3% specificity, AUC = 0.94). Positive predictive value was higher with the composite CNN than with models trained solely on DE (74.5% vs. 58.7%) or SE VFAs (68.6% vs. 62.9%).
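For reference, the metrics reported in Results follow the standard definitions sketched below. The counts and scores in the example are made up for illustration and are not taken from the study; the function names are hypothetical, and the pairwise AUC is one common way to compute the statistic, not necessarily the one the authors used.

```python
def sensitivity(tp, fn):
    """Fraction of fracture-present scans that the model flags."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of fracture-absent scans that the model clears."""
    return tn / (tn + fp)

def positive_predictive_value(tp, fp):
    """Fraction of flagged scans that truly contain a fracture."""
    return tp / (tp + fp)

def auc(pos_scores, neg_scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (simple O(n*m) pairwise form)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Illustrative confusion-matrix counts (not study data).
tp, fn, tn, fp = 80, 20, 850, 50
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
ppv = positive_predictive_value(tp, fp)

# Illustrative model scores for fracture-present and fracture-absent scans.
example_auc = auc([0.9, 0.8], [0.1, 0.85])
```

Because positive predictive value depends on the number of false positives, it is sensitive to specificity when fractures are uncommon in the test set, which is consistent with the composite model's higher specificity accompanying its higher positive predictive value.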
Conclusion: CNNs for vertebral fracture identification are highly sensitive to scan mode. Training on a composite dataset of both GE DE and GE SE VFAs allows CNNs to generalize to both scan modes and may facilitate the development of manufacturer-independent machine learning models for vertebral fracture detection.