Connectionists: Recent benchmark records through LSTM RNNs

Juergen Schmidhuber juergen at idsia.ch
Thu Oct 9 05:31:56 EDT 2014


Some recent (2014) benchmark records achieved with the help of 
Long Short-Term Memory (LSTM) Recurrent Neural Networks (RNNs),
often at big IT companies:

1. Large vocabulary speech recognition (Sak et al., Google, Interspeech 2014)
2. English to French translation (Sutskever et al., Google, NIPS 2014)
3. Text-to-speech synthesis (Fan et al., Microsoft, Interspeech 2014)
4. Prosody contour prediction (Fernandez et al., IBM, Interspeech 2014)
5. Language identification (Gonzalez-Dominguez et al., Google, Interspeech 2014)
6. Medium vocabulary speech recognition (Geiger et al., Interspeech 2014)
7. Audio onset detection (Marchi et al., ICASSP 2014)
8. Social signal classification (Brueckner & Schuller, ICASSP 2014)
9. Arabic handwriting recognition (Bluche et al., DAS 2014)

Some earlier benchmark records of 2013:
TIMIT phoneme recognition (Graves et al., ICASSP 2013)
Optical character recognition (Breuel et al., ICDAR 2013)

Precise references and a summary of previous work on LSTM RNNs can be found in Sec. 5.13 of:

Deep Learning in Neural Networks: An Overview (88 pages, 888 references)
PDF, LaTeX source, and complete public BibTeX file (888 kB) at:
http://www.idsia.ch/~juergen/deep-learning-overview.html

Original papers on LSTM and its various topologies and learning algorithms since 1995:
http://www.idsia.ch/~juergen/rnn.html

Juergen Schmidhuber

