Yury Zemlyanskiy
Title
Cited by
Year
GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
J Ainslie, J Lee-Thorp, M de Jong, Y Zemlyanskiy, F Lebrón, S Sanghai
arXiv preprint arXiv:2305.13245, 2023
Cited by 73, 2023
Self-attentive, multi-context one-class classification for unsupervised anomaly detection on text
L Ruff, Y Zemlyanskiy, R Vandermeulen, T Schnake, M Kloft
ACL 2019, 2019
Cited by 67, 2019
CoLT5: Faster long-range transformers with conditional computation
J Ainslie, T Lei, M de Jong, S Ontañón, S Brahma, Y Zemlyanskiy, ...
arXiv preprint arXiv:2303.09752, 2023
Cited by 35, 2023
Extracting translation pairs from social network content
M Eck, Y Zemlyanskiy, J Zhang, A Waibel
IWSLT 2014, 2014
Cited by 35*, 2014
Mention Memory: incorporating textual knowledge into Transformers through entity mention attention
M de Jong, Y Zemlyanskiy, N FitzGerald, F Sha, W Cohen
ICLR 2022, 2022
Cited by 27, 2022
Aiming to Know You Better Perhaps Makes Me a More Engaging Dialogue Partner
Y Zemlyanskiy, F Sha
CoNLL 2018, 2018
Cited by 22, 2018
FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference
M de Jong, Y Zemlyanskiy, J Ainslie, N FitzGerald, S Sanghai, F Sha, ...
arXiv preprint arXiv:2212.08153, 2022
Cited by 17, 2022
ReadTwice: Reading Very Large Documents with Memories
Y Zemlyanskiy, J Ainslie, M de Jong, P Pham, I Eckstein, F Sha
NAACL-HLT 2021, 2021
Cited by 13, 2021
Generate-and-Retrieve: use your predictions to improve retrieval for semantic parsing
Y Zemlyanskiy, M de Jong, J Ainslie, P Pasupat, P Shaw, L Qiu, ...
COLING 2022, 2022
Cited by 9, 2022
DOCENT: Learning Self-Supervised Entity Representations from Large Document Collections
Y Zemlyanskiy, S Gandhe, R He, B Kanagal, A Ravula, J Gottweis, F Sha, ...
EACL 2021, 2021
Cited by 8, 2021
Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute
M de Jong, Y Zemlyanskiy, N FitzGerald, J Ainslie, S Sanghai, F Sha, ...
ICML 2023, 2023
Cited by 5, 2023
GLIMMER: generalized late-interaction memory reranker
M de Jong, Y Zemlyanskiy, N FitzGerald, S Sanghai, WW Cohen, J Ainslie
arXiv preprint arXiv:2306.10231, 2023
Cited by 2, 2023
Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
L Vilnis, Y Zemlyanskiy, P Murray, AT Passos, S Sanghai
ICML 2023, 2023
Cited by 2, 2023
MEMORY-VQ: Compression for Tractable Internet-Scale Memory
Y Zemlyanskiy, M de Jong, L Vilnis, S Ontañón, WW Cohen, S Sanghai, ...
arXiv preprint arXiv:2308.14903, 2023
2023