Monday, November 24, 2014

Fitts' law as a research and design tool in human-computer interaction (Paper Review)








Bibliography:

MacKenzie, I. Scott. "Fitts' law as a research and design tool in human-computer interaction." Human-Computer Interaction 7.1 (1992): 91-139.


Link: 


Summary:

This paper surveys how Fitts' law has been used in human-computer interaction and analyzes the results of subsequent research applying the law or modified forms of it.
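As a concrete illustration, the Shannon formulation of the law that MacKenzie advocates predicts movement time as MT = a + b * log2(A/W + 1). A minimal sketch, with made-up regression coefficients a and b:

```python
import math

def fitts_mt(a, b, amplitude, width):
    """Predict movement time with the Shannon formulation:
    MT = a + b * log2(A/W + 1), where log2(A/W + 1) is the
    index of difficulty (ID) in bits."""
    ID = math.log2(amplitude / width + 1)
    return a + b * ID

# Illustrative (made-up) coefficients: a = 50 ms, b = 150 ms/bit.
# A target twice as far but twice as wide has the same ID,
# and therefore the same predicted movement time.
mt1 = fitts_mt(50, 150, amplitude=160, width=20)
mt2 = fitts_mt(50, 150, amplitude=320, width=40)
```

In a real study, a and b would come from a linear regression of measured movement times against ID.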


Comments:

I'm taking Embodied Interaction, and it seems like this model might fit in with that paradigm.


Research Ideas:

I would be interested to see whether 3D hand gestures could be analyzed using this model.


Wednesday, November 19, 2014

Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams (Paper Report)



Bibliography:


Bhat, Akshay, and Tracy Hammond. "Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams." IJCAI. Vol. 9. 2009.

Link:


http://www.aaai.org/ocs/index.php/IJCAI/IJCAI-09/paper/download/592/906

Summary:


Entropy, in the information-theoretic sense, is used to distinguish text from shapes: text is assumed to be more random, i.e. to have greater entropy, than shapes. To encode the entropy of a stroke, the authors use an alphabet of seven letters describing the angle a stroke point forms with its two temporal neighbors (one letter is reserved for endpoints). Some preprocessing is done by normalizing the stroke and resampling it so that points are equidistant from each other. Strokes are also grouped together according to thresholds in the spatial and temporal dimensions, since strokes that are close together in time and space should belong to the same class (text or shape). Finally, the probability of each 'letter' is used to calculate the Shannon entropy, H = -Σ p_i log2(p_i).
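As a sketch of the entropy step, assuming a stroke group has already been encoded as a string over the angle alphabet (the letters and strings below are hypothetical):

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy H = -sum(p_i * log2(p_i)) over the observed
    symbol frequencies; higher values suggest more text-like strokes."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A repetitive (shape-like) string vs. a varied (text-like) one,
# using a hypothetical 7-letter angle alphabet 'A'..'G'.
shape_like = "AAAAAAAABBBBBBBB"
text_like  = "ABCDEFGABCGFEDBA"
```

The classifier would then compare the resulting entropy value against a learned threshold.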


To calculate the confidence, the authors use:


where b is the entropy value at which the confidence of a TEXT classification is 0.5.

They trained on COA (course of action) diagrams and tested classification on mechanics drawings, with favorable results.

Comments:


I think this approach is very clever. I'm interested in how entropy differs between users from different locales.

Research Ideas:


Find out in what other domains entropy might be relevant for classification.

Wednesday, October 22, 2014

Combining Corners from Multiple Segmenters (Paper Report)











Bibliography:
Wolin, Aaron, Martin Field, and Tracy Hammond. "Combining corners from multiple segmenters." Proceedings of the Eighth Eurographics Symposium on Sketch-Based Interfaces and Modeling. ACM, 2011.

Link: 

Summary:

In this paper the authors implement corner subset selection over corners detected by five algorithms from different papers (ShortStraw, Douglas-Peucker, Paleo, Sezgin, and Kim). For each stroke, their system collects all the corners detected by the algorithms and combines them into a master set, removing duplicates.

They cull this set using sequential floating backward selection, a greedy algorithm. From the master set they iteratively remove points, measuring the error difference between the current and previous sets of points; the point removed is the one that makes the least difference in the error metric. Points are 'floating,' so they can be added back in. Error is measured as the squared vertical distance from the original stroke to the current polyline approximation. To pick the preferred set they track the ratio of successive error differences and detect when this ratio changes the most, i.e. passes a certain threshold.
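A rough sketch of the backward pass, using point-to-segment distance as a stand-in for the paper's vertical-distance error and omitting the floating re-add step:

```python
def seg_dist_sq(p, a, b):
    """Squared distance from point p to segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == dy == 0:
        return (px - ax) ** 2 + (py - ay) ** 2
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    cx, cy = ax + t * dx, ay + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2

def polyline_error(stroke, corner_idx):
    """Sum of squared distances from every stroke point to the polyline
    through the corner points (a stand-in for vertical distance)."""
    return sum(min(seg_dist_sq(p, stroke[corner_idx[i]], stroke[corner_idx[i + 1]])
                   for i in range(len(corner_idx) - 1))
               for p in stroke)

def cull_corners(stroke, corners, keep=2):
    """Greedy backward pass: repeatedly drop the interior corner whose
    removal raises the fit error the least, recording the error after
    each removal so the error-difference ratios can be inspected."""
    corners = sorted(corners)
    history = [(list(corners), polyline_error(stroke, corners))]
    while len(corners) > keep:
        best = min(corners[1:-1],
                   key=lambda c: polyline_error(stroke, [x for x in corners if x != c]))
        corners.remove(best)
        history.append((list(corners), polyline_error(stroke, corners)))
    return history

# An L-shaped stroke: the true corner is at index 2; index 1 is spurious.
stroke = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
hist = cull_corners(stroke, [0, 1, 2, 4])
```

The spurious corner is culled first (its removal leaves the error unchanged), while removing the true corner causes a large error jump, which is exactly the signal the threshold on error-difference ratios is meant to catch.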

The threshold was determined from a training set where the number of corners was known. They ran the algorithm and produced collections of error-difference ratios for the cases where the number of points overdetermined the corners (false positives) and where it underdetermined them (false negatives). They used the MAD (median absolute deviation) of each collection to place the threshold between the two.

Comments:

I thought this was a very interesting approach to corner finding, combining other corner-finding algorithms, and it was notable that its all-or-nothing accuracy was higher than that of any single algorithm alone.

Research Ideas:


Monday, October 20, 2014

Sketch Based Interfaces: Early Processing for Sketch Understanding (Paper Report)



Bibliography:


Sezgin, Tevfik Metin, Thomas Stahovich, and Randall Davis. "Sketch based interfaces: early processing for sketch understanding." ACM SIGGRAPH 2006 Courses. ACM, 2006.


Link:


http://dl.acm.org/citation.cfm?id=1185783


Summary:


In this paper the authors present a system designed to be a natural sketch interface for mechanical engineering drawings. It was designed to recognize what is drawn rather than how it is drawn.

The system follows three steps: approximation, beautification, and basic recognition of the stroke.

To approximate strokes the system tries to detect the vertices of the shape, after some preprocessing of the stroke. The direction, curvature, and speed graphs of the stroke are computed and used to determine which points are vertices. Vertices are usually found at peaks of high curvature. It has also been observed that users slow down at corners, which shows up as valleys in the speed graph. To factor out noise, peaks and valleys are only considered if they pass a threshold: empirically, the mean of the curvature graph and 90% of the mean of the speed graph. Using the mean makes the threshold depend on the graph data rather than being a global value. The threshold splits the graph into regions, and the extrema are chosen from the satisfying regions.
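A minimal sketch of the thresholding step, on hypothetical per-point data (the real system additionally picks one extremum per region rather than every point past the threshold):

```python
def candidate_vertices(curvature, speed):
    """Pick candidate corner indices: curvature above its mean, and
    speed below 90% of its mean, mirroring the thresholds described."""
    c_thresh = sum(curvature) / len(curvature)
    s_thresh = 0.9 * (sum(speed) / len(speed))
    peaks = sorted(i for i, c in enumerate(curvature) if c > c_thresh)
    valleys = sorted(i for i, s in enumerate(speed) if s < s_thresh)
    return peaks, valleys

# Hypothetical per-point data for a stroke with a corner at index 3:
curvature = [0.1, 0.1, 0.2, 2.0, 0.2, 0.1, 0.1]
speed     = [1.0, 1.0, 0.8, 0.2, 0.8, 1.0, 1.0]
peaks, valleys = candidate_vertices(curvature, speed)
```

Points that appear in both lists are the strongest vertex candidates, which is what the hybrid fit described next starts from.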

Taking only curvature into consideration is usually insufficient, so both speed and curvature are taken into account. The certainty that a point is a vertex is calculated from the scaled magnitude of curvature in a neighborhood, for points found in the curvature graph, and from the speed ratio in the speed graph. The initial hybrid fit is the intersection of the points that are peaks in the curvature graph and valleys in the speed graph. Subsequent hybrid fits are made by adding the next-best speed and curvature candidates. The error of a fit is measured as the orthogonal squared distance from the stroke to the fit. Points are added until an error threshold is reached, and the best fit with the fewest vertices is chosen.

Curves are found by calculating the ratio of the arc length between vertices to the Euclidean distance between them; segments with a high enough ratio are deemed curves. Arcs are approximated with Bezier curves, which are in turn approximated by linear segments, subdivided until the squared distance error falls below some threshold.
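The arc-length test can be sketched as follows; the 1.2 cutoff here is an assumed, illustrative value, not the paper's:

```python
import math

def is_curve(points, threshold=1.2):
    """Classify the stroke segment between two vertices as a curve when
    the arc length / Euclidean chord length ratio exceeds a threshold
    (1.2 is an assumed cutoff for illustration)."""
    arc = sum(math.dist(points[i], points[i + 1]) for i in range(len(points) - 1))
    chord = math.dist(points[0], points[-1])
    return arc / chord > threshold

straight = [(0, 0), (1, 0), (2, 0), (3, 0)]   # arc == chord, ratio 1
bowed = [(0, 0), (1, 1.5), (2, 1.5), (3, 0)]  # arc well above chord
```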

Beautification is done by rotating line segments around their midpoints when they are 'supposed to be linear,' to match nearby segments. This is done by comparing segments to others within a window.

Finally, basic recognition is done by testing the distance to the bounding box for squares and squared-distance error measures for fitted ellipses.

A high-level recognizer was implemented to recognize domain-specific shapes. Over-tracing was also addressed.

Comments: 


It seems as though how something is drawn, e.g. slowing down before taking corners, can still aid in recognition. Also, comparing arc length and Euclidean distance to find corners is very similar to ShortStraw's approach.

Research Ideas:


I wonder what other behavioral features can be used to recognize sketches.


Monday, October 13, 2014

Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes (Paper Report)



Bibliography:

Wobbrock, Jacob O., Andrew D. Wilson, and Yang Li. "Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes."Proceedings of the 20th annual ACM symposium on User interface software and technology. ACM, 2007.

Link:

http://dl.acm.org/citation.cfm?id=1294238

Summary:

This paper introduces the $1 recognizer, which is "easy, cheap, and usable almost anywhere in about 100 lines of code." It aims to give novice programmers an accessible recognizer to use in UI prototypes. The authors compare their recognizer against a Dynamic Time Warping (DTW) recognizer and the Rubine classifier; the $1 recognizer performs just as well as DTW and better than Rubine.

To compare a candidate gesture C to a template gesture Ti, they use the distance between each pair of corresponding sample points in C and Ti and create a score based on the minimum path-distance. Before scoring, the candidate and template gestures are normalized.

The $1 recognizer uses a four-step process. First, the point path is resampled and normalized with respect to stroke length so that samples are equidistant. Second, the gesture is rotated so that the indicative angle, the angle of the line joining the centroid to the first point of the gesture, is zero degrees from the horizontal. Third, the gesture is non-uniformly scaled to a reference square and translated so that its centroid lies at (0, 0); this ensures that differences between candidate and template points are due only to rotation, not aspect ratio. Finally, the candidate gesture is rotated until the best score, the global minimum of path-distance, is found. Instead of searching the entire angular space, they use a Golden Section Search (GSS) strategy, which minimizes the cost of searches between dissimilar gestures.
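Steps 2 and 3 can be sketched as below; resampling (step 1) is assumed already done, and the reference-square size is just a constant:

```python
import math

def centroid(pts):
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def rotate_to_zero(pts):
    """Step 2: rotate about the centroid so the indicative angle
    (centroid to first point) is zero degrees from the horizontal."""
    cx, cy = centroid(pts)
    theta = math.atan2(pts[0][1] - cy, pts[0][0] - cx)
    c, s = math.cos(-theta), math.sin(-theta)
    return [((x - cx) * c - (y - cy) * s + cx,
             (x - cx) * s + (y - cy) * c + cy) for x, y in pts]

def scale_to_square(pts, size=250.0):
    """Step 3a: non-uniformly scale the bounding box to a reference
    square (the size is an arbitrary constant)."""
    xs, ys = [x for x, _ in pts], [y for _, y in pts]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    return [(x * size / w, y * size / h) for x, y in pts]

def translate_to_origin(pts):
    """Step 3b: move the centroid to (0, 0)."""
    cx, cy = centroid(pts)
    return [(x - cx, y - cy) for x, y in pts]

# A small right-angle gesture (already resampled in this sketch):
gesture = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
norm = translate_to_origin(scale_to_square(rotate_to_zero(gesture)))
```

After this normalization, two gestures can be scored by summing point-to-point distances while GSS searches over a final rotation.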

Since it is scale-, rotation-, and position-invariant, the $1 recognizer has limits on recognizing gestures that depend on scale, rotation, or position. The invariance can be removed on a per-gesture basis if required.

Comments:

I thought it was pretty amazing. This reminds me of the SIFT paper in computer vision, whose feature tracking is also scale-, rotation-, and translation-invariant. Instead of point distances, it compares image gradients in a normalized, oriented window.

Research Ideas:

I wonder what other computer vision feature tracking algorithms would be useful in gesture recognition.




Wednesday, October 8, 2014

PaleoSketch: Accurate Primitive Sketch Recognition and Beautification (Paper Report)







Bibliography:
Paulson, Brandon, and Tracy Hammond. "Paleosketch: accurate primitive sketch recognition and beautification." Proceedings of the 13th international conference on Intelligent user interfaces. ACM, 2008.


Link:
http://dl.acm.org/citation.cfm?id=1378775

Summary:

This paper presents a system that improves the recognition of primitive shapes and of complex shapes built from those primitives. The shapes recognized are lines, polylines, circles, ellipses, arcs, curves, spirals, and helixes, with recognition accuracy of 98.56%. It introduces two new features and a ranking system for shape recognition. The two new features are NDDE (Normalized Distance between Direction Extremes) and DCR (Direction Change Ratio). NDDE measures the stroke length between the points with the smallest and greatest direction values (the change in y over the change in x), normalized by the total stroke length, and DCR is the ratio of the maximum change in direction to the average change in direction.
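The two features can be sketched under assumed definitions (direction taken as the angle of travel between successive resampled points):

```python
import math

def directions(pts):
    """Direction (angle of travel) of each segment of the stroke."""
    return [math.atan2(y2 - y1, x2 - x1)
            for (x1, y1), (x2, y2) in zip(pts, pts[1:])]

def ndde(pts):
    """NDDE: stroke length between the points of highest and lowest
    direction, normalized by total stroke length (near 1 for arcs,
    lower when the extremes sit mid-stroke, as in polylines)."""
    d = directions(pts)
    seg = [math.dist(a, b) for a, b in zip(pts, pts[1:])]
    i, j = sorted((d.index(max(d)), d.index(min(d))))
    return sum(seg[i:j + 1]) / sum(seg)

def dcr(pts):
    """DCR: maximum change in direction over the average change
    (high for polylines with sharp corners, low for smooth curves)."""
    d = directions(pts)
    changes = [abs(b - a) for a, b in zip(d, d[1:])]
    return max(changes) / (sum(changes) / len(changes))

# A smooth quarter-circle arc vs. a polyline with one sharp corner:
arc = [(math.cos(i * math.pi / 8), math.sin(i * math.pi / 8)) for i in range(5)]
corner = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
```

On the arc, direction changes are uniform, so DCR stays near 1; the sharp corner concentrates all the direction change in one place and drives DCR up.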


Comments:



Research Ideas: