Generating Wikipedia by summarizing long sequences
PJ Liu, M Saleh, E Pot, B Goodrich, R Sepassi, L Kaiser, N Shazeer
arXiv preprint arXiv:1801.10198, 2018
Year: 2018. Cited by: 524.

Tensor2Tensor for neural machine translation
A Vaswani, S Bengio, E Brevdo, F Chollet, AN Gomez, S Gouws, L Jones, ...
arXiv preprint arXiv:1803.07416, 2018
Year: 2018. Cited by: 472.

Model-based reinforcement learning for Atari
L Kaiser, M Babaeizadeh, P Milos, B Osinski, RH Campbell, ...
arXiv preprint arXiv:1903.00374, 2019
Year: 2019. Cited by: 456.

Mesh-TensorFlow: Deep learning for supercomputers
N Shazeer, Y Cheng, N Parmar, D Tran, A Vaswani, P Koanantakool, ...
Advances in Neural Information Processing Systems 31, 2018
Year: 2018. Cited by: 212.

PaLM: Scaling language modeling with Pathways
A Chowdhery, S Narang, J Devlin, M Bosma, G Mishra, A Roberts, ...
arXiv preprint arXiv:2204.02311, 2022
Year: 2022. Cited by: 23.

Pathways: Asynchronous distributed dataflow for ML
P Barham, A Chowdhery, J Dean, S Ghemawat, S Hand, D Hurt, M Isard, ...
Proceedings of Machine Learning and Systems 4, 2022
Year: 2022. Cited by: 5.

Attention-based decoder-only sequence transduction neural networks
NM Shazeer, LM Kaiser, E Pot, M Saleh, BD Goodrich, PJ Liu, R Sepassi
US Patent App. 16/759,690, 2020
Year: 2020. Cited by: 1.

Scaling Up Models and Data with t5x and seqio
A Roberts, HW Chung, A Levskaya, G Mishra, J Bradbury, D Andor, ...
arXiv preprint arXiv:2203.17189, 2022
Year: 2022.

Intelligent Investing
RS Sepassi
Harvard University, Cambridge, Massachusetts, 2010
Year: 2010.