Connectionists: New *sequence learning* data set available: MNIST digits as stroke sequences

Edwin de Jong jong.de.edwin at gmail.com
Wed Sep 14 16:16:33 EDT 2016


Dear colleagues,

A new data set for the study of sequence learning algorithms is available
as of today. The data set consists of pen stroke sequences that represent
handwritten digits, and was created based on the MNIST handwritten digit
data set.

MNIST stroke sequence data set:
https://github.com/edwin-de-jong/mnist-digits-stroke-sequence-data/wiki/MNIST-digits-stroke-sequence-data

The code project that was used to create the data set is available as well:
https://github.com/edwin-de-jong/mnist-digits-as-stroke-sequences/wiki/MNIST-digits-as-stroke-sequences-(code)

The 70,000 MNIST digit images were thresholded and thinned, yielding
skeletons of the images. Hypothetical pen stroke sequences were then
inferred from these skeletons using a traveling salesman problem (TSP)
algorithm. The resulting data set provides a sizeable and diverse test
bed, and can serve as a benchmark for evaluating and comparing sequence
learning algorithms.
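
To give a rough feel for the idea, here is a minimal sketch in Python. It
is not the code from the project linked above. It assumes a tiny grayscale
image whose stroke is already one pixel wide, so the thinning step is
skipped, and it swaps in a simple nearest-neighbour heuristic in place of
whichever TSP algorithm was actually used. The function names, the
threshold level of 128, and the offset encoding are all hypothetical.

```python
def threshold(image, level=128):
    """Return (row, col) coordinates of pixels at or above `level`.

    The level of 128 is an assumption made for this sketch.
    """
    return [(r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value >= level]


def squared_distance(p, q):
    """Squared Euclidean distance between two (row, col) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def greedy_tour(points):
    """Order points by repeatedly moving to the nearest unvisited point.

    This nearest-neighbour heuristic is a stand-in for a real TSP solver;
    it starts at the first point in the list and does not guarantee the
    shortest path.
    """
    if not points:
        return []
    remaining = list(points)
    tour = [remaining.pop(0)]
    while remaining:
        last = tour[-1]
        nearest = min(remaining, key=lambda p: squared_distance(last, p))
        remaining.remove(nearest)
        tour.append(nearest)
    return tour


def to_offsets(tour):
    """Encode a tour as (d_row, d_col) steps between consecutive points."""
    return [(r2 - r1, c2 - c1) for (r1, c1), (r2, c2) in zip(tour, tour[1:])]


# A 4x4 toy "image" containing a one-pixel-wide, 7-like stroke.
image = [
    [0,   0,   0,   0],
    [0, 255, 255, 255],
    [0,   0,   0, 255],
    [0,   0,   0, 255],
]

pixels = threshold(image)
tour = greedy_tour(pixels)
offsets = to_offsets(tour)
print(tour)
print(offsets)
```

On real digits the skeleton can branch or consist of several disconnected
pieces, so a practical version would also need to decide where the pen is
lifted; this sketch ignores that entirely.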

Further details can be found at the links above; please feel free to
contact me with any questions or suggestions.

Best regards,

Dr. Edwin D. de Jong