Visually impaired people have great difficulty perceiving and avoiding obstacles at a distance. To address this problem, a unified framework for multiple-target detection, recognition, and fusion is proposed, built on a sensor fusion system comprising a low-power millimeter wave (MMW) radar and an RGB-Depth (RGB-D) sensor. In this paper, Mask R-CNN and the Single Shot MultiBox Detector (SSD) network are used to detect and recognize objects in the color images. The obstacles’ depth information is obtained from the depth images using the MeanShift algorithm. The positions and velocities of multiple targets are measured by the MMW radar, which operates on the frequency-modulated continuous wave (FMCW) principle. A particle filter fuses the detection results from the color images, depth images, and radar data, yielding more accurate state estimates and richer information than any single sensor alone. The experimental results show that data fusion enriches the detection results and extends the effective detection range beyond that of the RGB-D sensor alone. Moreover, the fused results remain accurate and stable across diverse ranges and illumination conditions. As a wearable system, the sensor fusion system is versatile, portable, and cost-effective.
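
To make the particle-filter fusion step concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a single target with state (range, radial velocity), a constant-velocity motion model, and independent Gaussian measurement noise. The function names, noise parameters, and the convention that the depth measurement is `None` when the target lies beyond the RGB-D sensor's range are all illustrative assumptions.

```python
import math
import random


def gaussian_likelihood(residual, sigma):
    """Unnormalized Gaussian likelihood of a measurement residual."""
    return math.exp(-0.5 * (residual / sigma) ** 2)


def particle_filter_step(particles, dt, radar=None, depth=None, rng=random,
                         sigma_radar_r=0.15, sigma_radar_v=0.10,
                         sigma_depth=0.05,
                         process_sigma_r=0.02, process_sigma_v=0.05):
    """Run one predict / update / resample cycle.

    particles: list of (range_m, velocity_mps) hypotheses
    radar:     (range_m, velocity_mps) from the radar, or None
    depth:     range_m from the depth image, or None if out of range
    All sigma values are assumed, illustrative noise levels.
    """
    # Predict: constant-velocity model with additive process noise.
    predicted = []
    for r, v in particles:
        v_new = v + rng.gauss(0.0, process_sigma_v)
        r_new = r + v_new * dt + rng.gauss(0.0, process_sigma_r)
        predicted.append((r_new, v_new))

    # Update: multiply the likelihoods of whichever sensors reported.
    weights = []
    for r, v in predicted:
        w = 1.0
        if radar is not None:
            w *= gaussian_likelihood(radar[0] - r, sigma_radar_r)
            w *= gaussian_likelihood(radar[1] - v, sigma_radar_v)
        if depth is not None:
            w *= gaussian_likelihood(depth - r, sigma_depth)
        weights.append(w)

    # If every weight underflowed to zero, fall back to uniform weights.
    if sum(weights) <= 0.0:
        weights = [1.0] * len(predicted)

    # Resample: draw particles with probability proportional to weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Return the mean (range, velocity) over the particle set."""
    n = len(particles)
    mean_r = sum(p[0] for p in particles) / n
    mean_v = sum(p[1] for p in particles) / n
    return mean_r, mean_v
```

In this sketch, a sensor that produces no measurement simply contributes no likelihood factor, so the same update handles both the short-range case (depth and radar available) and the long-range case (radar only).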

