Background and Objective: Markerless vision-based Pose Estimation (PE) is a promising avenue towards scalable data collection in rehabilitation. Deploying this technology will require self-contained systems able to process data efficiently and accurately. The aims of this work are to 1) determine how depth data affects the performance (accuracy and speed) of lightweight monocular Red-Green-Blue (RGB) PE, in order to inform sensor selection, and 2) validate PE models using data from individuals with physical impairments.
Methods: Versions of 2D and 3D PE models with and without depth data were compared using public datasets and a custom dataset including individuals post-stroke.
Results: An early fusion architecture provided the best joint localization accuracy while achieving similar frame rates to its RGB counterpart. Motor impairment did not have a significant effect on the accuracy of any model.
Conclusion: Including depth data improves the accuracy-efficiency tradeoff. PE accuracy is not affected by the presence of physical impairments.