Sunday, November 28, 2010

#Reading 8: A Lightweight Multistroke Recognizer for User Interface Prototypes

Summary

This paper introduces $N which is an extension of $1 sketch recognizer. The objective of the paper is to introduce a lightweight and quickly deployable multi stroke recognizer for interface creators.
The method build on top of $1 method. In $1, the candidate stroke and the templates strokes’ points are all normalized equidistantly. Next, they are rotated, scaled and translated to provide invariance. The rotation matching uses Golden Section Search to find the optimum angle in the proximity of an initial estimate. Finally the distance between a candidate and a template is the sum of their point-wise distances.
In $N the scaling preprocessing decides whether to scale uniformly or non-uniformly each gesture based its aspect ratio and a threshold. The rotation invariance is also bounded to a user-defined angle range.
A problem with multi stroke gestures is different possible stroke orders. $N generates all possible variations of drawing a template which is all order permutations and for each permutation, each single stroke can be drawn forward or backward which results in 2nxn! different cases for a multi stroke with n strokes. As a speed optimization, the candidate gesture is not compared with all the generated gestures, but with the ones having almost the same starting angle (the angle between the gesture’s centroid and its starting point). Another optimization was to only compare templates having the same number of components as that of the candidate gesture. The two optimizations resulted in about 90% increase in speed.
A group of middle and high school students were studied and the accuracy of the system reached 96.6% on algebra symbols with 15 training examples. The accuracy using the speed optimizations was 95.7% with 90% increase in speed.

Discussion

The number of possible ways of drawing a multi stroke gesture grows very rapidly with the number of its sub strokes. However the authors argue that it might not always result in combinatorial explosion. First, because gestures having a many strokes are prone to be forgotten and are not used.Second, comparison between a candidate gesture and each instance is relatively fast. Third, employed speed optimizations.
Nevertheless, the relative high number of generated gestures will cause slowdowns when the number of templates and stroke components per template is high. The performance is however acceptable for prototyping where ease of implementation and accuracy is important for smaller scale scenarios.

No comments:

Post a Comment