A common path planning method in robotics is the artificial potential field method. The artificial potential field is the superposition of an attractive potential field generated by the target and repulsive potential fields generated by the obstacles. The total force on a robot moving in the artificial potential field is the sum of the attractive force from the attractive field and the repulsive forces from the repulsive fields, and it points along the negative gradient of the artificial potential field; the robot moves in the direction of this total force. If the artificial potential field has a unique minimum at the target, the robot reaches the target without hitting obstacles. However, if the artificial potential field has multiple minima, the robot may arrive at a location with locally minimum potential. The total force on the robot is zero at such a local minimum, and in its neighbourhood the force drives the robot towards it. The robot therefore becomes stationary and cannot reach the target.
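The gradient-descent behaviour described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the gains `K_ATT` and `K_REP`, the influence radius `RHO0`, the force clipping, and the step size are all assumed values chosen for the example.

```python
import numpy as np

K_ATT, K_REP, RHO0 = 1.0, 1.0, 1.0   # illustrative gains (assumptions)
F_MAX = 5.0                          # clip the force so one step stays bounded

def attractive_force(pos, goal):
    # Negative gradient of U_att = 0.5 * K_ATT * ||pos - goal||^2.
    return -K_ATT * (pos - goal)

def repulsive_force(pos, obstacles):
    # Negative gradient of the classic repulsive potential; each obstacle
    # only acts within its influence radius RHO0.
    total = np.zeros(2)
    for obs in obstacles:
        diff = pos - obs
        rho = np.linalg.norm(diff)
        if 0.0 < rho <= RHO0:
            total += K_REP * (1.0 / rho - 1.0 / RHO0) / rho**3 * diff
    return total

def step(pos, goal, obstacles, lr=0.05):
    # Move along the total force, i.e. down the gradient of the total field.
    force = attractive_force(pos, goal) + repulsive_force(pos, obstacles)
    norm = np.linalg.norm(force)
    if norm > F_MAX:
        force *= F_MAX / norm
    return pos + lr * force

# Usage: descend towards the goal while skirting one obstacle.
pos, goal = np.array([0.0, 0.0]), np.array([5.0, 5.0])
obstacles = [np.array([2.5, 1.5])]
for _ in range(500):
    pos = step(pos, goal, obstacles)
```

With a single obstacle placed off the direct path, the descent converges to the goal; placing obstacles so that attractive and repulsive forces cancel away from the goal reproduces the local-minimum stall discussed above.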
A deep Q-learning network has been proposed to overcome the local-minima problem of robot path planning based on artificial potential fields. This project investigates the impact of combining a deep Q-learning network with an artificial potential field, as proposed in [1], to achieve path planning for a robot formation. Specifically, it uses a deep Q-learning network to guide the robot formation to the target in an artificial potential field created by an environment with multiple targets and obstacles. The deep Q-learning network is a form of deep reinforcement learning, that is, it combines reinforcement learning and deep learning. As in [2], a black-hole potential field is also added to the artificial potential field. Simulation results show that the deep Q-learning network achieves good results in a fixed artificial potential field: out of 40 tests, the robot reaches the target without hitting any obstacle in 37. However, the deep Q-learning network does not improve path planning performance across different artificial potential fields: during training in four different fields, the robot cannot achieve collision-free path planning in two of them, and out of four tests it achieves collision-free path planning in three. Similarly, the robot formation successfully finds the target in the fixed artificial potential field, succeeding in all eight tests.
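The role of the deep Q-learning network in this setup is to map a state to Q-values over a discrete action set and act greedily (or epsilon-greedily during training). The sketch below is purely illustrative and is not the network from [1]: the state features (position plus the total APF force), the two-layer network size, the eight discrete headings, and the random untrained weights are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 8                                   # 8 compass headings (assumption)
HEADINGS = [2 * np.pi * k / N_ACTIONS for k in range(N_ACTIONS)]

# Tiny two-layer Q-network; weights are random, i.e. untrained.
W1 = rng.normal(scale=0.1, size=(4, 16))        # state: (x, y, Fx, Fy)
W2 = rng.normal(scale=0.1, size=(16, N_ACTIONS))

def q_values(state):
    # ReLU hidden layer followed by a linear output head.
    h = np.maximum(0.0, state @ W1)
    return h @ W2

def select_action(state, epsilon=0.1):
    # Epsilon-greedy: explore with probability epsilon, else argmax Q.
    if rng.random() < epsilon:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(state)))

# Usage: pick a heading for a robot at the origin feeling a force (1, 1).
state = np.array([0.0, 0.0, 1.0, 1.0])
action = select_action(state, epsilon=0.0)
heading = HEADINGS[action]
```

In training, the Q-values would be regressed towards the usual bootstrapped targets from a replay buffer, with the APF (including the black-hole term) shaping the reward; those pieces are omitted here to keep the sketch minimal.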