Bone stress injury (BSI) risk in runners is multifactorial and not well understood. Unsupervised machine learning approaches may help elucidate risk factors for BSI by identifying groups of similar runners within a population that differ in BSI incidence. Here, a hierarchical clustering approach is used to identify groups of collegiate cross-country runners based on two-dimensional frontal plane pelvis and proximal femur geometry, which was extracted from dual-energy X-ray absorptiometry scans and dimensionally reduced by principal component analysis. Seven distinct groups were identified using the cluster tree, with the initial split being highly related to female-male differences. Visual inspection revealed clear differences between groups in pelvis and proximal femur geometry, and groups were found to differ in lower body BSI incidence during the subsequent academic year (Rand index = 0.53; adjusted Rand index = 0.07). Linear models showed between-cluster differences in visually identified geometric measures. Geometric measures were aggregated into a pelvis shape factor based on trends with BSI incidence, and the resulting shape factor differed significantly between clusters (p < 0.001). Lower shape factor values, corresponding to lower pelvis height and ischial span and to greater iliac span and trochanteric span, appeared to be related to increased BSI incidence. This trend was dominated by the effect observed across clusters of male runners, indicating that geometric effects may be more relevant to BSI risk in males, or that other factors masked the relationship in females. More broadly, this work outlines a methodological approach for distilling complex geometric differences into simple metrics that relate to injury risk.
Keywords:
Stress fracture; Machine learning; Dual-energy X-ray absorptiometry; Principal component analysis
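
The abstract reports both a Rand index and an adjusted Rand index for the agreement between cluster membership and BSI status. The sketch below is an illustration only, not the authors' code: it shows how the two quantities can be computed from pair counts, and why a moderate Rand index can coexist with an adjusted value near zero once chance agreement is subtracted. The function name `rand_indices` and the eight-runner labelings are hypothetical and are not data from the study.

```python
from collections import Counter
from math import comb


def rand_indices(labels_a, labels_b):
    """Return (Rand index, adjusted Rand index) for two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two items are required")
    total_pairs = comb(n, 2)

    # Number of item pairs placed together in both labelings, in A only, in B only.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    # Agreements are pairs together in both labelings plus pairs apart in both.
    agreements = total_pairs + 2 * same_both - same_a - same_b
    rand = agreements / total_pairs

    # Chance correction: compare same_both with its expectation under random labeling.
    expected = same_a * same_b / total_pairs
    max_index = (same_a + same_b) / 2
    if max_index == expected:
        adjusted = 1.0  # degenerate case, e.g. both labelings are a single group
    else:
        adjusted = (same_both - expected) / (max_index - expected)
    return rand, adjusted


# Hypothetical example: three clusters versus a binary injury label (1 = BSI).
clusters = [0, 0, 0, 1, 1, 1, 2, 2]
injured = [0, 0, 1, 0, 1, 1, 0, 0]
ri, ari = rand_indices(clusters, injured)
print(f"Rand index = {ri:.2f}, adjusted Rand index = {ari:.2f}")
```

In this toy case half of all runner pairs agree between the two labelings, yet the adjusted index is slightly below zero, because that level of agreement is about what random assignment would produce for groups of these sizes.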