Summary
In this work, the objective is to distinguish text from shapes. The authors gather a set of features from related previous work, add their own features, and include features made available by newer hardware, such as pressure. They prepared a set of 9 diagrams and asked 26 people to draw them. For each sketch, 46 features were calculated. Next, a decision tree was built by recursively partitioning the data into classes using the Gini index. They finally arrived at a set of key features that efficiently distinguishes between the two classes: time to next stroke, speed till next stroke, distance from last stroke, distance to next stroke, bounding box width, perimeter to area, amount of ink inside, and total angle.
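To make the Gini-based partitioning concrete, here is a minimal sketch of how one split of such a tree could be chosen on a single feature. It is only an illustration, not the authors' implementation: the function names `gini` and `best_split`, and the toy values for time to next stroke, are assumptions made up for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Find the threshold on one feature that minimizes weighted Gini impurity.

    Returns (threshold, weighted_impurity). The threshold is None when no
    split improves on the impurity of the unsplit data.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        threshold = (lo + hi) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: seconds until the next stroke, with made-up labels.
time_to_next_stroke = [0.10, 0.15, 0.20, 0.90, 1.10, 1.30]
labels = ["text", "text", "text", "shape", "shape", "shape"]

threshold, impurity = best_split(time_to_next_stroke, labels)
```

A full tree would apply `best_split` to every feature, keep the feature with the lowest impurity, and then recurse on the two resulting partitions until the nodes are pure or too small to split.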
They compared their classifier with two others, namely Microsoft Divider and InkKit. The average misclassification rate for text and shapes in their system was reported to be about 9.8%, which is significantly better than the average misclassification rate of the other two systems.
Discussion
As noted by the authors, half of the key features relate to inter-stroke gaps, which seem to be a deciding factor here. It would also have been better to compare the decision tree method used in their system with other classification methods such as neural networks, nearest neighbor, etc.
I think another issue with relying on gaps is that each user may space strokes differently. I wonder whether they scaled the sizes based on how big the actual text/shapes were. This would have some impact on recognition.
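As a rough illustration of the kind of scaling I have in mind, the sketch below divides the gap between two strokes by the average bounding-box diagonal of those strokes, so the same drawing at two different sizes gives the same value. This is my own assumption about how one might normalize, not something taken from the paper; the helper names `diagonal` and `normalized_gap` and the point-list stroke format are hypothetical.

```python
import math


def diagonal(points):
    """Length of the bounding-box diagonal of a stroke given as (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return math.hypot(max(xs) - min(xs), max(ys) - min(ys))


def normalized_gap(prev_stroke, next_stroke):
    """Gap from the end of one stroke to the start of the next, scaled by size.

    The raw gap is divided by the mean bounding-box diagonal of the two
    strokes. If both strokes are single points (scale 0), the raw gap is
    returned unchanged.
    """
    (ex, ey), (sx, sy) = prev_stroke[-1], next_stroke[0]
    gap = math.hypot(sx - ex, sy - ey)
    scale = (diagonal(prev_stroke) + diagonal(next_stroke)) / 2
    return gap / scale if scale > 0 else gap


# Hypothetical strokes: the same pair drawn small and drawn ten times larger.
small = ([(0, 0), (3, 4)], [(6, 8), (9, 12)])
large = ([(0, 0), (30, 40)], [(60, 80), (90, 120)])

small_gap = normalized_gap(*small)
large_gap = normalized_gap(*large)
```

With a normalization like this, a user who writes large and a user who writes small would produce comparable gap features, at the cost of making the feature depend on how reliably stroke size can be measured.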