Enabling My Robot To Play Pictionary
Aug 16, 2016
Most approaches to hand-drawn sketch recognition either ignore the sequential aspect of freehand sketching or exploit it in an ad hoc way. Researchers from the Indian Institute of Science have recently proposed a recurrent neural network architecture for sketch-object recognition. Their method exploits the long-term sequential and structural regularities in stroke data. The framework is inherently online and suited to on-the-fly recognition of objects while they are being drawn.
The proliferation of pen and tablet devices today makes it possible to capture the entire sketching process rather than only the finished drawing. The method recognizes sketches across more than 150 object categories. The authors show that the choice of deep sketch features and the choice of recurrent network architecture both play a crucial role in achieving good recognition performance.
Overview of the sketch-recognition framework. Deep features (x1, x2, …, xt, …) from each cumulative stroke image form the input to the Gated Recurrent Unit (cyan), which produces the prediction sequence (y1, y2, …, yt, …). The per-time-step loss lt is computed with respect to the ground truth (green arrow) by the loss function (yellow box); each loss is weighted by a corresponding wt (grey circles, sized in proportion to the weight) and backpropagated (purple arrows) for the corresponding time-step t.
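The weighted per-time-step loss described in the overview can be illustrated with a minimal sketch. It assumes a cross-entropy loss at each step and weights wt that grow linearly with t and sum to 1; neither choice is confirmed by this summary, and the function names `linear_weights` and `weighted_sequence_loss` are hypothetical.

```python
import math


def softmax(scores):
    # Subtract the maximum score for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_weights(num_steps):
    # Assumption: w_t is proportional to t and normalised to sum to 1,
    # so later (more complete) sketches count more.
    norm = num_steps * (num_steps + 1) / 2
    return [t / norm for t in range(1, num_steps + 1)]


def weighted_sequence_loss(step_scores, label, weights=None):
    """step_scores: one list of class scores per time-step (y_1 .. y_T).
    label: index of the ground-truth class, shared by all time-steps."""
    if weights is None:
        weights = linear_weights(len(step_scores))
    # l_t: cross-entropy of the prediction at step t against the ground truth.
    per_step = [-math.log(softmax(scores)[label]) for scores in step_scores]
    # Total loss: sum over t of w_t * l_t.
    total = sum(w * l for w, l in zip(weights, per_step))
    return total, per_step
```

With weights like these, a wrong prediction after the first stroke is penalised less than a wrong prediction on the nearly finished sketch, which matches the intuition that early strokes are ambiguous.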
The design accounts for the inherently sequential and cumulative nature of the human sketching process in a natural way, and it exploits the long-term sequential and structural regularities in stroke data through deep features. These traits enable the system to achieve state-of-the-art recognition results on a large database of freehand object sketches.
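To make the cumulative aspect concrete, the following sketch builds one image per stroke, each containing all strokes drawn so far. It is an illustration under simplifying assumptions, not the authors' preprocessing: strokes are lists of (x, y) points in [0, 1), only the sampled points are marked (no line interpolation), and the helper name `cumulative_stroke_images` and the tiny grid size are hypothetical.

```python
def cumulative_stroke_images(strokes, size=8):
    """Return one binary image per stroke; image t holds strokes 1..t.

    strokes: list of strokes, each a list of (x, y) points in [0, 1).
    size: side length of the square grid (assumed, kept tiny for clarity).
    """
    canvas = [[0] * size for _ in range(size)]
    images = []
    for stroke in strokes:
        for x, y in stroke:
            # Map normalised coordinates to grid cells, clamping at the edge.
            col = min(int(x * size), size - 1)
            row = min(int(y * size), size - 1)
            canvas[row][col] = 1
        # Snapshot the canvas so later strokes do not alter earlier images.
        images.append([line[:] for line in canvas])
    return images
```

In a full pipeline, each of these images would be passed through a feature extractor to obtain the inputs x1, x2, …, xt, so a prediction is available after every stroke.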
- Sarvadevabhatla, Ravi Kiran et al. (2016): Enabling My Robot To Play Pictionary: Recurrent Neural Networks For Sketch Recognition, at: arxiv.org