With the rapid development of deep learning techniques, the application of Convolutional Neural Network (CNN) has benefited the task of target object recognition. Several state-of-the-art object detectors have achieved excellent performance on the precision for object recognition. When it comes to applying the detection results for the real world application of collaborative robots, the reliability and robustness of the target object detection stage is essential to support efficient task allocation. In this work, collaborative robots task allocation is based on the assumption that each individual robotic agent possesses specialized capabilities to be matched with detected targets representing tasks to be performed in the surrounding environment which impose specific requirements. The goal is to reach a specialized labor distribution among the individual robots based on best matching their specialized capabilities with the corresponding requirements imposed by the tasks. In order to further improve task recognition with convolutional neural networks in the context of robotic task allocation, this thesis proposes an innovative approach for progressively refining the target detection process by taking advantage of the fact that additional images can be collected by mobile cameras installed on robotic vehicles. The proposed methodology combines a CNN-based object detection module with a refinement module. For the detection module, a twostage object detector, Mask RCNN, for which some adaptations on region proposal generation are introduced, and a one-stage object detector, YOLO, are experimentally investigated in the context considered. The generated recognition scores serve as input for the refinement module. In the latter, the current detection result is considered as the a priori evidence to enhance the next detection for the same target with the goal to iteratively improve the target recognition scores. Both the Bayesian method and the Dempster-Shafer theory are experimentally investigated to achieve the data fusion process involved in the refinement process. The experimental validation is conducted on indoor search-and-rescue (SAR) scenarios and the results presented in this work demonstrate the feasibility and reliability of the proposed progressive refinement framework, especially when the combination of adapted Mask RCNN and D-S theory data fusion is exploited.