Surgical education is changing, with a push to include simulated environments for training and assessment. A critical element of this transition is linking performance in a lab-based setting with performance in the actual operating room. Currently there is no widely accepted, reliable tool for measuring surgical skill in the operating room. Ubiquitous video and imaging technology provides unique opportunities to develop metrics to meet this need. Hip fracture surgery is a promising area in which to develop these measures because hip fractures are common, the surgery is used as a milestone for residents, and it demands technical proficiency.
Resident surgeons wore a head-mounted video camera while performing open reduction and internal fixation of hip fractures using a dynamic hip screw or telescoping screw plate. The wire navigation portion of the video was analyzed. Data collected from the video included the duration of wire navigation, the number of fluoroscopic images acquired, and the degree of intervention by the surgeon’s supervisor. To determine the reliability of these measurements, four independent raters performed them for two cases. Ten raters independently measured the tip-apex distance (TAD), which reflects the accuracy of the surgical placement of the wire, on seven cases. These metrics for 15 cases were then compared with experience metrics, including point in residency and number of previous cases performed. A composite performance score was computed by summing the average standardized values of the four performance metrics. Expert surgeon opinion, in the form of Objective Structured Assessment of Technical Skills (OSATS) scores from two traumatologists, was also compared with these metrics.
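One plausible reading of the composite score can be sketched in a few lines. This is an illustration under stated assumptions, not the study's actual computation: each metric is assumed to be standardized as a z-score across cases (using the sample standard deviation) and the four z-scores are then summed per case. The function names, metric names, and numeric values below are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a list of values to zero mean and unit sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def composite_scores(metrics):
    """Sum the standardized metrics per case.

    metrics: dict mapping a metric name to a list with one value per case.
    Returns one composite score per case.
    """
    standardized = [z_scores(vals) for vals in metrics.values()]
    # zip(*...) regroups the per-metric lists into per-case tuples.
    return [sum(case) for case in zip(*standardized)]


# Hypothetical values for three cases (NOT data from the study).
example = {
    "duration_s": [300.0, 450.0, 600.0],
    "fluoro_images": [20.0, 35.0, 50.0],
    "supervisor_intervention": [1.0, 2.0, 3.0],
    "tad_mm": [15.0, 20.0, 25.0],
}
scores = composite_scores(example)
```

Under this assumed convention, lower values on every metric (shorter time, fewer images, less intervention, smaller TAD) yield a lower composite score, which would be consistent with a negative correlation between the score and experience.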
The inter-rater reliability analysis produced a Cronbach’s alpha of 0.99 for all video-based measures and 0.97 for the combined TAD measurements. There was a significant correlation between surgical experience and both procedure duration and tip-apex distance. The composite performance metric correlated significantly with both weeks into residency (-0.55, p=0.03) and cases logged (-0.66, p=0.01). The OSATS score was significantly correlated only with surgery duration and number of fluoroscopic images.
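For readers unfamiliar with the reliability statistic, the standard Cronbach's alpha formula can be sketched as follows, treating each rater as an "item" and each measured case as a subject. How the study arranged its data for this analysis is not stated, so the layout here is an assumption, and the rater values are hypothetical.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Compute Cronbach's alpha.

    ratings: list of per-rater lists, each holding one measurement per case.
    alpha = k / (k - 1) * (1 - sum of per-rater variances / variance of per-case totals)
    """
    k = len(ratings)
    # Variance of each rater's measurements across cases, summed over raters.
    item_variance_sum = sum(variance(r) for r in ratings)
    # Variance of the per-case totals (sum over raters for each case).
    totals = [sum(vals) for vals in zip(*ratings)]
    total_variance = variance(totals)
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical measurements from two raters on four cases (NOT data from the study).
rater_a = [10.0, 20.0, 30.0, 40.0]
rater_b = [12.0, 19.0, 33.0, 41.0]
alpha = cronbach_alpha([rater_a, rater_b])
```

Values close to 1 indicate that the raters order and scale the cases almost identically; a constant offset between raters does not lower alpha, which is one reason it is sometimes reported alongside agreement-based statistics.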
Several of the video-based metrics and the TAD measurement were consistent across raters and are useful for performance assessment. The wire navigation performance metrics of time and TAD were shown to differentiate levels of surgical experience. A composite score incorporating multiple performance metrics also correlated strongly with surgical experience. The methods presented have the potential to provide truly objective assessment of resident technical performance in the operating room, a critical step toward competency-based education.