Pérez De San Roman P, Benois-Pineau J, Domenger J-P, Cattaert D, Paclet F, de Rugy A (2017) Saliency Driven Object Recognition in Egocentric Videos with Deep CNN: toward application in assistance to Neuroprostheses. Computer Vision and Image Understanding. doi: 10.1016/j.cviu.2017.03.001.

The problem of object recognition in natural scenes has recently been successfully addressed with Deep Convolutional Neural Networks (Deep CNNs), giving a significant breakthrough in recognition scores. The computational efficiency of Deep CNNs, as a function of their depth, allows for their use in real-time applications. One of the key issues here is to reduce the number of windows selected from images to be submitted to a Deep CNN. This is usually solved by preliminary segmentation and selection of specific windows having outstanding “objectiveness” or other indicators of the possible location of objects. In this paper we propose a Deep CNN approach and a general framework for recognition of objects in a real-time scenario and from an egocentric perspective. Here the window of interest is built on the basis of a visual attention map computed over gaze fixations measured by a glasses-worn eye-tracker. The application of this set-up is an interactive, user-friendly environment for upper-limb amputees. Vision has to help the subject control the worn neuro-prosthesis when few muscles remain and EMG control becomes inefficient. The recognition results on a specifically recorded corpus of 151 videos with simple geometrical objects show a mean Average Precision (mAP) of 64.6% and a computational time at the generalization step lower than the duration of a visual fixation on the object of interest.
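The idea of building a window of interest from gaze fixations can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the attention map is computed, so the sketch assumes one isotropic Gaussian per fixation, a fixed threshold, and a bounding box around the thresholded region. The function names `saliency_map` and `window_of_interest`, the `sigma` and `threshold` parameters, and the fixation coordinates are all hypothetical.

```python
import math


def saliency_map(fixations, width, height, sigma):
    """Sum one isotropic Gaussian per gaze fixation, then scale the peak to 1.

    Assumption: the Gaussian model and sigma are illustrative choices,
    not taken from the paper.
    """
    grid = [[0.0] * width for _ in range(height)]
    for fx, fy in fixations:
        for y in range(height):
            for x in range(width):
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                grid[y][x] += math.exp(-d2 / (2.0 * sigma ** 2))
    peak = max(max(row) for row in grid)
    if peak > 0:
        grid = [[v / peak for v in row] for row in grid]
    return grid


def window_of_interest(saliency, threshold=0.5):
    """Return the inclusive bounding box (x_min, y_min, x_max, y_max) of all
    pixels whose saliency is at or above the threshold, or None if there are none.

    Assumption: a fixed threshold is a hypothetical selection rule.
    """
    xs, ys = [], []
    for y, row in enumerate(saliency):
        for x, value in enumerate(row):
            if value >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical gaze fixations (x, y) in pixel coordinates on a 64x48 frame.
fixations = [(30, 20), (32, 22), (31, 19)]
smap = saliency_map(fixations, width=64, height=48, sigma=4.0)
box = window_of_interest(smap, threshold=0.5)
print(box)
```

In a pipeline of the kind the abstract describes, only the image crop inside `box` would be submitted to the Deep CNN, instead of many candidate windows per frame.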