Daniel Simig
Cohere
Verified email at cohere.com
Title
Cited by
Year
OPT: Open pre-trained transformer language models
S Zhang, S Roller, N Goyal, M Artetxe, M Chen, S Chen, C Dewan, ...
arXiv preprint arXiv:2205.01068, 2022
Cited by 1883, 2022
Few-shot learning with multilingual language models
XV Lin, T Mihaylov, M Artetxe, T Wang, S Chen, D Simig, M Ott, N Goyal, ...
arXiv preprint arXiv:2112.10668, 2021
Cited by 151*, 2021
OPT-IML: Scaling language model instruction meta learning through the lens of generalization
S Iyer, XV Lin, R Pasunuru, T Mihaylov, D Simig, P Yu, K Shuster, T Wang, ...
arXiv preprint arXiv:2212.12017, 2022
Cited by 68, 2022
SemDeDup: Data-efficient learning at web-scale through semantic deduplication
A Abbas, K Tirumala, D Simig, S Ganguli, AS Morcos
arXiv preprint arXiv:2303.09540, 2023
Cited by 67, 2023
MEGABYTE: Predicting million-byte sequences with multiscale transformers
L Yu, D Simig, C Flaherty, A Aghajanyan, L Zettlemoyer, M Lewis
Advances in Neural Information Processing Systems 36, 2024
Cited by 54, 2024
D4: Improving LLM pretraining via document de-duplication and diversification
K Tirumala, D Simig, A Aghajanyan, A Morcos
Advances in Neural Information Processing Systems 36, 2024
Cited by 35, 2024
Understanding in-context learning via supportive pretraining data
X Han, D Simig, T Mihaylov, Y Tsvetkov, A Celikyilmaz, T Wang
arXiv preprint arXiv:2306.15091, 2023
Cited by 23, 2023
Open vocabulary extreme classification using generative models
D Simig, F Petroni, P Yanki, K Popat, C Du, S Riedel, M Yazdani
arXiv preprint arXiv:2205.05812, 2022
Cited by 13, 2022
Text characterization toolkit (TCT)
D Simig, T Wang, V Dankers, P Henderson, K Batsuren, D Hupkes, ...
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the …, 2022
Cited by 7*, 2022
MEGABYTE: modeling million-byte sequences with multiscale transformers
L Yu, D Simig, C Flaherty, A Aghajanyan, L Zettlemoyer, M Lewis
Proceedings of the 37th International Conference on Neural Information …, 2023
2023
Evaluating end-to-end entity linking on domain-specific knowledge bases: Learning about ancient technologies from museum collections
S Cadavid-Sanchez, K Kacem, RAM Frade, J Boehm, T Chaney, ...
arXiv preprint arXiv:2305.14588, 2023
2023
Turning Flows into Trees: Graph Analytics for Aerodynamic Flows
D Simig, P Kelly
2016
Natural Language to Neural Programs
D Simig
Articles 1–13