Thursday, September 9, 2010

Reading #2: Specifying Gestures by Example (Rubine)

Summary

This paper introduces GRANDMA, a toolkit for creating gesture-based UIs. The toolkit lets a user add gesture commands to different entities in the UI. For instance, a user can add a delete gesture to the main UI windows and define the action(s) to execute when it is recognized. Each gesture is defined by a set of examples.
The system learns to distinguish between gestures by training a linear classifier on a set of features extracted from the stroke. Two constraints apply. First, gestures are assumed to be single-stroke. Second, for performance reasons, the features are restricted to those that can be computed incrementally in constant time per point. The paper introduces a set of 13 features, such as total angular movement, maximum pen speed, and the size of the bounding box. A linear classifier is then constructed from the examples provided. When the classification is ambiguous, either because the estimated probability of the recognized class is low or because the gesture lies far from the class centroid, the system rejects the gesture and asks the user to retry.
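To make the "constant time per point" constraint and the rejection step concrete, here is a minimal Python sketch. It is not the paper's implementation: it tracks only four of the 13 features, and the class name `IncrementalFeatures`, the function `classify`, the hand-set weights it expects, and the 0.95 threshold are all illustrative assumptions. Training the weights from examples and the distance-from-centroid check are left out.

```python
import math


class IncrementalFeatures:
    """Running stroke features, each updated in O(1) work per new point.

    Hypothetical helper for illustration; covers only a few features.
    """

    def __init__(self):
        self.prev = None         # last point seen, as (x, y, t)
        self.prev_delta = None   # last non-zero displacement (dx, dy)
        self.min_x = self.min_y = math.inf
        self.max_x = self.max_y = -math.inf
        self.path_length = 0.0
        self.total_angle = 0.0   # signed sum of turning angles, in radians
        self.max_speed_sq = 0.0  # squared speed avoids a square root per point

    def add_point(self, x, y, t):
        # Bounding box update.
        self.min_x, self.max_x = min(self.min_x, x), max(self.max_x, x)
        self.min_y, self.max_y = min(self.min_y, y), max(self.max_y, y)
        if self.prev is not None:
            px, py, pt = self.prev
            dx, dy, dt = x - px, y - py, t - pt
            if dx != 0 or dy != 0:
                self.path_length += math.hypot(dx, dy)
                if dt > 0:
                    speed_sq = (dx * dx + dy * dy) / (dt * dt)
                    self.max_speed_sq = max(self.max_speed_sq, speed_sq)
                if self.prev_delta is not None:
                    pdx, pdy = self.prev_delta
                    # Turning angle between consecutive segments
                    # (cross product, dot product).
                    self.total_angle += math.atan2(
                        pdx * dy - pdy * dx, pdx * dx + pdy * dy
                    )
                self.prev_delta = (dx, dy)
        self.prev = (x, y, t)

    def vector(self):
        if self.prev is None:
            return [0.0, 0.0, 0.0, 0.0]
        diagonal = math.hypot(self.max_x - self.min_x, self.max_y - self.min_y)
        return [diagonal, self.path_length, self.total_angle, self.max_speed_sq]


def classify(features, weights, biases, min_prob=0.95):
    """Score each class linearly; return (label or None, estimated probability).

    `weights` maps class name -> list of per-feature weights and `biases`
    maps class name -> constant term. Both are assumed to be already trained.
    """
    scores = {
        c: biases[c] + sum(w * f for w, f in zip(weights[c], features))
        for c in weights
    }
    best = max(scores, key=scores.get)
    # Softmax-style estimate that the winning class is the right one.
    # Every exponent is <= 0, so this cannot overflow.
    prob = 1.0 / sum(math.exp(scores[c] - scores[best]) for c in scores)
    return (best if prob >= min_prob else None), prob
```

Because every feature is a running sum, minimum, or maximum, the feature vector is available at any moment during the stroke without revisiting earlier points. A `None` label corresponds to the case where the system would reject the gesture.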
The precision of the system when classifying about 15 gestures, each with at least 15 examples, is around 98%. For 30 gestures, each trained with 40 examples, the precision is about 97%. Finally, the author proposes some ideas for handling multi-finger gestures.

Discussion

The proposed system shows promising performance despite a rather simple recognition method. It is also possible to improve the system by adding new features without impairing performance. However, the author reports only the precision of the system, which does not take false negatives into account. Another area for improvement is that most of the selected features are not rotation- or scale-invariant, which might cause difficulty in some domains. On the whole, the system is simple and performs well.

1 comment:

  1. You mean combining the rotation and resampling functions used in $1 with Rubine's linear recognizer? I think the author should focus on how each feature influences the overall recognition accuracy, rather than selecting features just because they feel important.
