Title | Authors | Venue | Cited by | Year |
Pointer sentinel mixture models | S Merity, C Xiong, J Bradbury, R Socher | arXiv preprint arXiv:1609.07843, 2016 | 2269 | 2016 |
Regularizing and optimizing LSTM language models | S Merity, NS Keskar, R Socher | arXiv preprint arXiv:1708.02182, 2017 | 1333 | 2017 |
Dynamic memory networks for visual and textual question answering | C Xiong, S Merity, R Socher | International conference on machine learning, 2397-2406, 2016 | 904 | 2016 |
Quasi-recurrent neural networks | J Bradbury, S Merity, C Xiong, R Socher | arXiv preprint arXiv:1611.01576, 2016 | 635 | 2016 |
An analysis of neural language modeling at multiple scales | S Merity, NS Keskar, R Socher | arXiv preprint arXiv:1803.08240, 2018 | 191 | 2018 |
Dynamic memory network | R Socher, A Kumar, O Irsoy, M Iyyer, C Xiong, S Merity, R Paulus | US Patent 11,113,598, 2021 | 153 | 2021 |
Dynamic Memory Network | R Socher, A Kumar, O Irsoy, M Iyyer, C Xiong, S Merity, R Paulus | US Patent App. 15/170,884, 2016 | 150 | 2016 |
Pointer sentinel mixture architecture | SJ Merity, C Xiong, J Bradbury, R Socher | US Patent 10,565,493, 2020 | 119 | 2020 |
Quasi-recurrent neural network | J Bradbury, SJ Merity, C Xiong, R Socher | US Patent App. 15/420,710, 2018 | 119 | 2018 |
Quasi-recurrent neural network based encoder-decoder model | J Bradbury, SJ Merity, C Xiong, R Socher | US Patent 11,080,595, 2021 | 111 | 2021 |
Domain specific language for generation of recurrent neural network architectures | SJ Merity, R Socher, J Bradbury, C Xiong | US Patent 12,014,257, 2024 | 106 | 2024 |
Single headed attention RNN: Stop thinking with your head | S Merity | arXiv preprint arXiv:1911.11423, 2019 | 81 | 2019 |
Accurate argumentative zoning with maximum entropy models | S Merity, T Murphy, JR Curran | Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly …, 2009 | 54 | 2009 |
Revisiting activation regularization for language RNNs | S Merity, B McCann, R Socher | arXiv preprint arXiv:1708.01009, 2017 | 52 | 2017 |
A flexible approach to automated RNN architecture generation | M Schrimpf, S Merity, J Bradbury, R Socher | arXiv preprint arXiv:1712.07316, 2017 | 23 | 2017 |
Scalable language modeling: WikiText-103 on a single GPU in 12 hours | S Merity, NS Keskar, J Bradbury, R Socher | Proceedings of the SYSML 18, 2018 | 9 | 2018 |
The NUGGET Non-Linear Piecewise Activation | S Merity | SIGBOVIK 2018, 57, 2018 | 2 | 2018 |
Frontier Pruning for Shift-Reduce CCG Parsing | S Merity, JR Curran | Proceedings of the Australasian Language Technology Association Workshop …, 2011 | 2 | 2011 |
Integrated Tagging and Pruning via Shift-Reduce CCG Parsing | S Merity | School of Information Technologies The University of Sydney, Australia Nov 7, 2011 | 2 | 2011 |
Pointer sentinel mixture architecture | SJ Merity, C Xiong, J Bradbury, R Socher | US Patent 11,580,359, 2023 | | 2023 |