Wednesday, October 22, 2014

Combining Corners from Multiple Segmenters (Paper Report)











Bibliography:
Wolin, Aaron, Martin Field, and Tracy Hammond. "Combining corners from multiple segmenters." Proceedings of the Eighth Eurographics Symposium on Sketch-Based Interfaces and Modeling. ACM, 2011.

Link: 

Summary:

In this paper the authors implement corner subset selection using corners detected by five algorithms from different papers (ShortStraw, Douglas-Peucker, PaleoSketch, Sezgin, and Kim). For each stroke, their system gathers all the corners detected by the algorithms and combines them into a master set, removing duplicates.

They cull this set using sequential floating backward selection (SFBS), a greedy algorithm. From the master set they iteratively remove points, measuring the error difference between the current and previous sets of points. The point removed is the one that makes the least difference in the error metric. Points are "floating," so they can be added back in. Error is measured as the squared distance from the original stroke to the current proposed polyline. To get the preferred set they used the ratio of the difference of errors and determined when this ratio changed the most, i.e. passed a certain threshold.
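
The culling step can be sketched roughly as follows. This is a simplified, plain backward-elimination version (it omits the "floating" re-add step of true SFBS), and the `ratio_threshold` value and the segment-distance error metric are my assumptions, not taken from the paper:

```python
import math

def seg_dist2(p, a, b):
    # Squared distance from point p to the segment a-b.
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return (px - ax) ** 2 + (py - ay) ** 2
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return (px - (ax + t * dx)) ** 2 + (py - (ay + t * dy)) ** 2

def fit_error(stroke, corner_idx):
    # Sum of squared distances from every stroke point to the polyline
    # defined by the retained corner indices.
    err = 0.0
    for i, j in zip(corner_idx, corner_idx[1:]):
        for p in stroke[i:j + 1]:
            err += seg_dist2(p, stroke[i], stroke[j])
    return err

def cull_corners(stroke, corners, ratio_threshold=10.0):
    # Greedy backward elimination: repeatedly drop the interior corner whose
    # removal increases the fit error least; stop when the error ratio
    # between successive fits jumps past the threshold.
    kept = sorted(corners)
    prev_err = fit_error(stroke, kept)
    while len(kept) > 2:
        best = min(kept[1:-1],
                   key=lambda c: fit_error(stroke, [k for k in kept if k != c]))
        trial = [k for k in kept if k != best]
        err = fit_error(stroke, trial)
        if prev_err > 0 and err / prev_err > ratio_threshold:
            break  # removing this corner hurts too much: it was a real corner
        kept, prev_err = trial, max(err, 1e-12)
    return kept
```

On an L-shaped stroke seeded with spurious corners, this keeps only the endpoints and the true corner.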

The threshold was determined from a training set where the number of corners was known. They ran the algorithm and produced collections of error-difference ratios for strokes where the number of points overdetermined the corners (false positives) and where it underdetermined the corners (false negatives). They used the MAD, or median absolute deviation, of each collection to find the dividing point for the threshold, in between the two.
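
A minimal sketch of the MAD-based threshold selection. The exact placement rule (halfway between the MAD-bounded edges of the two collections) is my assumption; the paper only says the threshold lies between them:

```python
def mad(values):
    # Median and median absolute deviation: a robust spread estimate.
    vs = sorted(values)
    med = vs[len(vs) // 2]
    devs = sorted(abs(v - med) for v in values)
    return med, devs[len(devs) // 2]

def pick_threshold(fp_ratios, fn_ratios):
    # Place the threshold between the two clusters: just above the
    # over-segmented (false-positive) ratios and just below the
    # under-segmented (false-negative) ratios.
    fp_med, fp_mad = mad(fp_ratios)
    fn_med, fn_mad = mad(fn_ratios)
    return ((fp_med + fp_mad) + (fn_med - fn_mad)) / 2.0
```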

Comments:

I thought this was a very interesting approach to corner finding, combining several existing corner-finding algorithms, and it was notable that the all-or-nothing accuracy was higher than that of any single algorithm alone.

Research Ideas:


Monday, October 20, 2014

Sketch Based Interfaces: Early Processing for Sketch Understanding (Paper Report)



Bibliography:


Sezgin, Tevfik Metin, Thomas Stahovich, and Randall Davis. "Sketch based interfaces: early processing for sketch understanding." ACM SIGGRAPH 2006 Courses. ACM, 2006.


Link:


http://dl.acm.org/citation.cfm?id=1185783


Summary:


In this paper the authors present a system designed to be a natural sketch interface for mechanical-engineering drawings. It was designed to recognize what is drawn rather than how it is drawn.

The system follows three steps to create its drawings: approximation, beautification, and basic recognition of the stroke.

To approximate strokes it tries to detect the vertices of the shape, after some preprocessing of the stroke. The direction, curvature, and speed graphs of the stroke are computed and used to determine which points are vertices. Vertices are usually found at peaks of high curvature. It has also been observed that users slow down at corners, which shows up as valleys in the speed graph. To factor out noise in the stroke, peaks and valleys are only considered if they are above/below a certain threshold. These were empirically found to be the mean of the curvature graph and 90% of the mean of the speed graph. The mean is used so that the threshold depends on the graph data instead of being a global value. The threshold splits each graph into regions, and the extrema are chosen from the satisfying regions.
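
A rough sketch of the candidate-vertex detection described above, assuming timestamped points. For simplicity it skips the direction unwrapping and smoothing the paper would need; the index conventions (a curvature value at interior point i+1, a speed value per segment) are my own:

```python
import math

def vertex_candidates(points, times):
    # points: list of (x, y); times: per-point timestamps.
    # Direction of each segment, curvature as change in direction,
    # speed as segment length over elapsed time.
    n = len(points)
    direction = [math.atan2(points[i + 1][1] - points[i][1],
                            points[i + 1][0] - points[i][0]) for i in range(n - 1)]
    # Note: no angle unwrapping here, so directions crossing +/-pi will jump.
    curvature = [abs(direction[i + 1] - direction[i]) for i in range(n - 2)]
    speed = [math.dist(points[i + 1], points[i]) / max(times[i + 1] - times[i], 1e-9)
             for i in range(n - 1)]
    curv_thresh = sum(curvature) / len(curvature)     # mean of curvature graph
    speed_thresh = 0.9 * (sum(speed) / len(speed))    # 90% of mean speed
    curv_peaks = {i + 1 for i, c in enumerate(curvature) if c > curv_thresh}
    speed_valleys = {i for i, s in enumerate(speed) if s < speed_thresh}
    return curv_peaks, speed_valleys
```

On a right-angle stroke drawn with a slowdown at the corner, the corner point shows up in both the curvature peaks and the speed valleys.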

Taking only curvature into consideration is usually insufficient, so both speed and curvature are taken into account. The certainty that a point is a vertex is calculated from the scaled magnitude of curvature in a neighborhood, for points found in the curvature graph, and from the ratio of speeds, for the speed graph. The initial hybrid fit is the intersection of the curvature peaks and the speed valleys. Subsequent hybrid fits are made by adding the next best speed and curvature candidates. The error of a fit is found by computing the orthogonal squared distance from the stroke to the fit. Points are added until an error threshold is reached, and the best fit with the fewest vertices is chosen.

Curves are found by calculating the ratio of the arc length between vertices to the Euclidean distance between them; segments with a high enough ratio are deemed curves. Arcs are approximated with Bezier curves, which are in turn approximated by linear segments, subdivided until the squared distance error is below some threshold.
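
The curve test is simple enough to sketch directly; the cutoff value here is a made-up illustration, not the paper's threshold:

```python
import math

def is_curve(points, ratio_threshold=1.2):
    # Compare the stroke's arc length to the straight-line (chord) distance
    # between its endpoints; a high ratio suggests a curved segment.
    arc = sum(math.dist(points[i], points[i + 1]) for i in range(len(points) - 1))
    chord = math.dist(points[0], points[-1])
    return chord > 0 and arc / chord > ratio_threshold
```

A sampled semicircle has arc/chord near pi/2 and is flagged as a curve, while a straight polyline has a ratio of 1 and is not.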

Beautification is done by rotating line segments around their midpoints when they are 'supposed to be linear,' to match nearby segments. This is done by comparing segments to others within a window.

Finally, basic recognition is done by testing the distance to the bounding box for squares and squared-distance error measures for fitted ellipses.

A high level recognizer was implemented to recognize domain specific shapes. Also over-tracing was addressed.

Comments: 


It seems as though how something is drawn, e.g. slowing down before taking corners, can still aid in recognition. Also, comparing arc length and Euclidean distance for finding corners is very similar to ShortStraw's approach.

Research Ideas:


I wonder what other behavioral features can be used to recognize sketches.


Monday, October 13, 2014

Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes (Paper Report)



Bibliography:

Wobbrock, Jacob O., Andrew D. Wilson, and Yang Li. "Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes."Proceedings of the 20th annual ACM symposium on User interface software and technology. ACM, 2007.

Link:

http://dl.acm.org/citation.cfm?id=1294238

Summary:

This paper introduces the $1 recognizer, which is "easy, cheap, and usable almost anywhere in about 100 lines of code." It aims to be an accessible recognizer that novice programmers can use in UI prototypes. They compare their recognizer against the Dynamic Time Warping (DTW) recognizer and the Rubine classifier. The $1 recognizer performs just as well as DTW and better than Rubine.

To compare a candidate gesture C to a template gesture Ti, they use the distance between each pair of corresponding sample points in C and Ti and create a score based on the minimum path-distance. Before scoring, the candidate and template gestures are normalized.

The $1 recognizer uses a four-step process. First, they resample the point path with respect to stroke length so that each sample is the same distance apart. Second, they rotate the gesture so that the indicative angle, the angle of the line joining the centroid to the first point of the gesture, is zero degrees from the horizontal. Third, they non-uniformly scale the gesture to a reference square and translate it so that its centroid lies at (0, 0); this ensures that differences between candidate and template points are due only to rotation and not aspect ratio. Finally, they rotate the candidate gesture until they find the best score, the global minimum of path-distance. Instead of searching the entire angular space, they use the Golden Section Search (GSS) strategy, which minimizes the cost of comparisons between dissimilar gestures.
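
The normalization pipeline (steps one through three) plus the basic path-distance score can be sketched as below. This follows the paper's published pseudocode in spirit but is my own simplification: it omits the final GSS rotation search, and the sample count and square size are conventional defaults, not requirements:

```python
import math

N = 64        # number of resampled points
SIZE = 250.0  # side of the reference square

def path_length(pts):
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

def resample(pts, n=N):
    # Step 1: resample to n equidistant points along the stroke path.
    interval = path_length(pts) / (n - 1)
    pts, out, d, i = list(pts), [pts[0]], 0.0, 1
    while i < len(pts):
        seg = math.dist(pts[i - 1], pts[i])
        if d + seg >= interval and seg > 0:
            t = (interval - d) / seg
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)  # continue measuring from the new point
            d = 0.0
        else:
            d += seg
        i += 1
    while len(out) < n:       # guard against floating-point shortfall
        out.append(pts[-1])
    return out[:n]

def centroid(pts):
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))

def rotate_to_zero(pts):
    # Step 2: rotate so the indicative angle (centroid -> first point) is 0.
    cx, cy = centroid(pts)
    theta = math.atan2(pts[0][1] - cy, pts[0][0] - cx)
    c, s = math.cos(-theta), math.sin(-theta)
    return [((x - cx) * c - (y - cy) * s + cx,
             (x - cx) * s + (y - cy) * c + cy) for x, y in pts]

def scale_and_translate(pts, size=SIZE):
    # Step 3: non-uniform scale to a reference square, centroid to origin.
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    w, h = (max(xs) - min(xs)) or 1.0, (max(ys) - min(ys)) or 1.0
    pts = [(x * size / w, y * size / h) for x, y in pts]
    cx, cy = centroid(pts)
    return [(x - cx, y - cy) for x, y in pts]

def normalize(pts):
    return scale_and_translate(rotate_to_zero(resample(pts)))

def path_distance(a, b):
    # Average point-to-point distance between two normalized gestures.
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)
```

After normalization, a gesture and a rotated, scaled copy of it score a near-zero path distance.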

Since it is scale, rotation, and position invariant the $1 recognizer has limits on recognizing gestures that depend on scale, rotation, and position. The invariance can be removed on a per gesture basis if required.

Comments:

I thought it was pretty amazing. This reminds me of the SIFT paper in computer vision for tracking features, which is also scale-, rotation-, and translation-invariant. Instead of point distances, SIFT compares image gradients in a normalized, oriented window.

Research Ideas:

I wonder what other computer vision feature tracking algorithms would be useful in gesture recognition.




Wednesday, October 8, 2014

PaleoSketch: Accurate Primitive Sketch Recognition and Beautification (Paper Report)







Bibliography:
Paulson, Brandon, and Tracy Hammond. "Paleosketch: accurate primitive sketch recognition and beautification." Proceedings of the 13th international conference on Intelligent user interfaces. ACM, 2008.


Link:
http://dl.acm.org/citation.cfm?id=1378775

Summary:

This paper presents a system that improves the recognition of primitive shapes and of complex shapes built from those primitives. The shapes recognized are lines, polylines, circles, ellipses, arcs, curves, spirals, and helixes, and recognition accuracy was 98.56%. It introduces two new features and a ranking system for shape recognition. The two new features are NDDE (Normalized Distance between Direction Extremes) and DCR (Direction Change Ratio). NDDE measures the stroke length between the points with the smallest and greatest direction values, normalized by total stroke length, and DCR is the ratio of the maximum change in direction to the average change in direction.
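
The two features can be sketched as follows, treating direction as the angle of travel along the stroke. This is my own simplified reading (no smoothing, no handling of the angle wrap at +/-pi, and no trimming of stroke tails, which the paper's implementation would need):

```python
import math

def direction_values(points):
    # Direction (angle of travel) of each segment along the stroke.
    return [math.atan2(b[1] - a[1], b[0] - a[0])
            for a, b in zip(points, points[1:])]

def ndde(points):
    # Normalized Distance between Direction Extremes: stroke length between
    # the segments with min and max direction, over total stroke length.
    dirs = direction_values(points)
    seg = [math.dist(a, b) for a, b in zip(points, points[1:])]
    total = sum(seg)
    i, j = sorted((dirs.index(min(dirs)), dirs.index(max(dirs))))
    return sum(seg[i:j + 1]) / total if total else 0.0

def dcr(points):
    # Direction Change Ratio: max direction change over mean direction change.
    dirs = direction_values(points)
    changes = [abs(b - a) for a, b in zip(dirs, dirs[1:])]
    mean = sum(changes) / len(changes)
    return max(changes) / mean if mean else 0.0
```

Intuitively, a smooth arc keeps its direction extremes at opposite ends of the stroke (NDDE near 1), while a polyline concentrates its direction change at a few corners (high DCR).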


Comments:



Research Ideas:

Monday, October 6, 2014

What!?! No Rubine Features?: Using Geometric-based Features to Produce Normalized Confidence Values for Sketch Recognition (Paper Report)



Bibliography:

Paulson, Brandon, et al. "What!?! no Rubine features?: using geometric-based features to produce normalized confidence values for sketch recognition." HCC Workshop: Sketch Tools for Diagramming. 2008.

Link: 

http://srl.tamu.edu/srlng/research/paper/23?from=/srlng/research/

Summary:


In this paper the authors create and test a system that uses both gesture-based and geometric-based features for sketch recognition of complex shapes. Gesture recognition depends on how the user draws, whereas geometric-based recognition depends on what the user draws. Their aim was to create a user-independent recognition system. The system was tested against the PaleoSketch recognizer, and the differences between the two were shown to be statistically insignificant. The classifier they used was a statistical classifier. The total-angle feature from gesture recognition was the only significant gestural feature chosen for the optimal subset of features.

Comments:

It is interesting that these two methods of classification can be combined and used so effectively, drawing on the strengths of both. I didn't understand the significance of the way the user groups were split and their effect on the validation, nor how the subset of optimal features was found.

Research Ideas:

I wonder why there were so few gestural features included in the subset of optimal features.

Wednesday, October 1, 2014

Sketch Recognition Design Considerations and Improvements to Mechanix


10 Principles of Things a Sketch System Should Do:

  1. Be reliable.
  2. Have smooth feedback between where my pen is and what is drawn on screen.
  3. Give feedback on whether a gesture is or is not accepted.
  4. Have clear instructions.
  5. Have simple gestures.
  6. Adapt to different users.
  7. Gestures should try to be related to what they do if possible.
  8. Not too many gestures.
  9. Should be able to undo easily.
  10. Utilize the strengths of sketching.

10 Principles of Things a Sketch System Should Not Do:

  1. Should not lag.
  2. Have complicated gestures.
  3. Have too many gestures.
  4. Not have enough gestures to do what you want to do.
  5. Accept incorrect gestures / input.
  6. Try to use sketching for something that it would be terrible for.
  7. Have convoluted instructions.
  8. Make too many assumptions about the user.
  9. Reject correct gestures / input.
  10. Have too many gestures that don't relate well to what they do.

Five suggestions for improvement to Mechanix

  1. Improve client-server lag.
  2. Have more robust shape recognition. Shape creation was too dependent on node order and shapes I thought were correct were not recognized.
  3. Able to sketch the labels / letters.
  4. Accept gestures for labeling instead of rollout menu.
  5. Have a mask for selection or input, e.g. "I'm drawing / erasing only trusses, forces, or labels."

Visual Similarity of Pen Gestures (Paper Report)



Bibliography:

A. Chris Long, Jr., James A. Landay, Lawrence A. Rowe, and Joseph Michiels. 2000. Visual similarity of pen gestures. In Proceedings of the SIGCHI conference on Human Factors in Computing Systems (CHI '00). ACM, New York, NY, USA, 360-367. DOI=10.1145/332040.332458 http://doi.acm.org/10.1145/332040.332458

Link:

http://dl.acm.org/citation.cfm?id=332458

Summary:

This paper intended to find out why some gestures are perceived as similar while others are not, in order to facilitate gesture design. Similar gestures may be easier to learn and remember. The gestures evaluated were single-stroke and iconic, similar to those of the Apple Newton MessagePad. Psychology research on the similarity of geometric shapes revealed that similarity can vary linearly or logarithmically with measurements such as area, width, height, and tilt, but the similarity metrics used not only differed from person to person, they could also differ from shape to shape.

Multi-dimensional scaling (MDS) looks at a data set and tries to reduce its dimensionality so that distances reflect the similarity and dissimilarity of the objects. When using MDS, the number of dimensions, the distance metric, and the meaning of the axes are some of the issues that must be decided.

Two experiments were run with two different sets of gestures and two different sets of subjects. The first set of gestures was designed to be as different as possible from each other. Animated displays of gestures were used instead of having the subjects draw them, in order to include more participants and gestures, even though drawing them would have given more context in usage. Using regression analysis on the geometric features, they were able to create a model to predict the similarity of gestures. The second experiment used three sets of similar gestures and a fourth set containing gestures from the previous experiment. The second experiment's model was not as successful at predicting similarity.

Comments:

I thought this paper could be very helpful in gesture design and designing experiments for effective gesture design. The fact that many of the features of Rubine's paper were used could signify their importance in gesture classification. It was surprising that the second experiment did not do as well as the first.

Ideas for Research:

I was thinking of extending this experiment into 3D gestures in VR environments.