FlashAttention: Fast and memory-efficient exact attention with IO-awareness T Dao, D Fu, S Ermon, A Rudra, C Ré Advances in Neural Information Processing Systems 35, 16344-16359, 2022 | 1233 | 2022 |
Hungry hungry hippos: Towards language modeling with state space models DY Fu, T Dao, KK Saab, AW Thomas, A Rudra, C Ré The Eleventh International Conference on Learning Representations, 2023 | 299 | 2023 |
FlexGen: High-throughput generative inference of large language models with a single GPU Y Sheng, L Zheng, B Yuan, Z Li, M Ryabinin, B Chen, P Liang, C Ré, ... International Conference on Machine Learning, 31094-31116, 2023 | 226 | 2023 |
Hyena hierarchy: Towards larger convolutional language models M Poli, S Massaroli, E Nguyen, DY Fu, T Dao, S Baccus, Y Bengio, ... International Conference on Machine Learning, 28043-28078, 2023 | 212 | 2023 |
Fast and three-rious: Speeding up weak supervision with triplet methods D Fu, M Chen, F Sala, S Hooper, K Fatahalian, C Ré International Conference on Machine Learning, 3280-3291, 2020 | 129 | 2020 |
Rekall: Specifying video events using compositions of spatiotemporal labels DY Fu, W Crichton, J Hong, X Yao, H Zhang, A Truong, A Narayan, ... arXiv preprint arXiv:1910.02993, 2019 | 60 | 2019 |
Simple hardware-efficient long convolutions for sequence modeling DY Fu, EL Epstein, E Nguyen, AW Thomas, M Zhang, T Dao, A Rudra, ... International Conference on Machine Learning, 10373-10391, 2023 | 47 | 2023 |
Perfectly balanced: Improving transfer and robustness of supervised contrastive learning M Chen, DY Fu, A Narayan, M Zhang, Z Song, K Fatahalian, C Ré International Conference on Machine Learning, 3090-3122, 2022 | 46 | 2022 |
Multi-resolution weak supervision for sequential data P Varma, F Sala, S Sagawa, J Fries, D Fu, S Khattar, A Ramamoorthy, ... Advances in Neural Information Processing Systems 32, 2019 | 40 | 2019 |
Monarch Mixer: A simple sub-quadratic GEMM-based architecture D Fu, S Arora, J Grogan, I Johnson, ES Eyuboglu, A Thomas, B Spector, ... Advances in Neural Information Processing Systems 36, 2024 | 35 | 2024 |
Shoring up the foundations: Fusing model embeddings and weak supervision MF Chen, DY Fu, D Adila, M Zhang, F Sala, K Fatahalian, C Ré Uncertainty in Artificial Intelligence, 357-367, 2022 | 30* | 2022 |
Analysis of faces in a decade of US cable TV news J Hong, W Crichton, H Zhang, DY Fu, J Ritchie, J Barenholtz, B Hannel, ... KDD'21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery …, 2021 | 24* | 2021 |
Laughing hyena distillery: Extracting compact recurrences from convolutions S Massaroli, M Poli, D Fu, H Kumbong, R Parnichkun, D Romero, ... Advances in Neural Information Processing Systems 36, 2024 | 17 | 2024 |
FlashFFTConv: Efficient convolutions for long sequences with tensor cores DY Fu, H Kumbong, E Nguyen, C Ré arXiv preprint arXiv:2311.05908, 2023 | 10 | 2023 |
TABi: Type-aware bi-encoders for open-domain entity retrieval M Leszczynski, DY Fu, MF Chen, C Ré arXiv preprint arXiv:2204.08173, 2022 | 10 | 2022 |
FlashAttention: Fast and memory-efficient exact attention with IO-awareness (2022) T Dao, DY Fu, S Ermon, A Rudra, C Ré arXiv preprint arXiv:2205.14135, 2022 | 9 | 2022 |
Benchmarking and building long-context retrieval models with LoCo and M2-BERT J Saad-Falcon, DY Fu, S Arora, N Guha, C Ré arXiv preprint arXiv:2402.07440, 2024 | 8 | 2024 |
Hydragen: High-Throughput LLM Inference with Shared Prefixes J Juravsky, B Brown, R Ehrlich, DY Fu, C Ré, A Mirhoseini arXiv preprint arXiv:2402.05099, 2024 | 8 | 2024 |
Orexinergic neurotransmission in temperature responses to methamphetamine and stress: mathematical modeling as a data assimilation approach A Behrouzvaziri, D Fu, P Tan, Y Yoo, MV Zaretskaia, DE Rusyniak, ... PLoS One 10 (5), e0126719, 2015 | 8 | 2015 |
Automatic parallelization of sequential programs P Kraft, A Waterland, DY Fu, A Gollamudi, S Szulanski, M Seltzer arXiv preprint arXiv:1809.07684, 2018 | 5 | 2018 |