Jared Casper
Research Scientist, NVIDIA
Deep Speech 2: End-to-end speech recognition in English and Mandarin
D Amodei, S Ananthanarayanan, R Anubhai, J Bai, E Battenberg, C Case, ...
International conference on machine learning, 173-182, 2016
Deep Speech: Scaling up end-to-end speech recognition
A Hannun, C Case, J Casper, B Catanzaro, G Diamos, E Elsen, ...
arXiv preprint arXiv:1412.5567, 2014
Megatron-LM: Training multi-billion parameter language models using model parallelism
M Shoeybi, M Patwary, R Puri, P LeGresley, J Casper, B Catanzaro
arXiv preprint arXiv:1909.08053, 2019
An effective hybrid transactional memory system with strong isolation guarantees
CC Minh, M Trautmann, JW Chung, A McDonald, N Bronson, J Casper, ...
Proceedings of the 34th annual international symposium on Computer …, 2007
A practical concurrent binary search tree
NG Bronson, J Casper, H Chafi, K Olukotun
ACM Sigplan Notices 45 (5), 257-268, 2010
The vector-thread architecture
R Krashinsky, C Batten, M Hampton, S Gerding, B Pharris, J Casper, ...
ACM SIGARCH Computer Architecture News 32 (2), 52, 2004
Hardware acceleration of database operations
J Casper, K Olukotun
Proceedings of the 2014 ACM/SIGDA international symposium on Field …, 2014
A scalable, non-blocking approach to transactional memory
H Chafi, J Casper, BD Carlstrom, A McDonald, CC Minh, W Baek, ...
2007 IEEE 13th International Symposium on High Performance Computer …, 2007
Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model
S Smith, M Patwary, B Norick, P LeGresley, S Rajbhandari, J Casper, ...
arXiv preprint arXiv:2201.11990, 2022
Efficient large-scale language model training on GPU clusters using Megatron-LM
D Narayanan, M Shoeybi, J Casper, P LeGresley, M Patwary, ...
Proceedings of the International Conference for High Performance Computing …, 2021
EigenBench: A simple exploration tool for orthogonal TM characteristics
S Hong, T Oguntebi, J Casper, N Bronson, C Kozyrakis, K Olukotun
IEEE International Symposium on Workload Characterization (IISWC'10), 1-11, 2010
A practical FPGA-based framework for novel CMP research
S Wee, J Casper, N Njoroge, Y Teslyar, D Ge, C Kozyrakis, K Olukotun
Proceedings of the 2007 ACM/SIGDA 15th international symposium on Field …, 2007
ATLAS: A chip-multiprocessor with transactional memory support
N Njoroge, J Casper, S Wee, Y Teslyar, D Ge, C Kozyrakis, K Olukotun
2007 Design, Automation & Test in Europe Conference & Exhibition, 1-6, 2007
BLOOM: A 176B-parameter open-access multilingual language model
TL Scao, A Fan, C Akiki, E Pavlick, S Ilić, D Hesslow, R Castagné, ...
arXiv preprint arXiv:2211.05100, 2022
Transactional predication: high-performance concurrent sets and maps for STM
NG Bronson, J Casper, H Chafi, K Olukotun
Proceedings of the 29th ACM SIGACT-SIGOPS symposium on Principles of …, 2010
Systems and methods for speech transcription
A Hannun, C Case, J Casper, B Catanzaro, G Diamos, E Elsen, ...
US Patent 10,540,957, 2020
Hardware acceleration of transactional memory on commodity systems
J Casper, T Oguntebi, S Hong, NG Bronson, C Kozyrakis, K Olukotun
ACM SIGPLAN Notices 46 (3), 27-38, 2011
FARM: A prototyping environment for tightly-coupled, heterogeneous architectures
T Oguntebi, S Hong, J Casper, N Bronson, C Kozyrakis, K Olukotun
2010 18th IEEE Annual International Symposium on Field-Programmable Custom …, 2010
Building and using the ATLAS transactional memory system
N Njoroge, S Wee, J Casper, J Burdick, Y Teslyar, C Kozyrakis, ...
Proc. 2nd Workshop on Architecture Research using FPGA Platforms, Austin …, 2006
Reducing activation recomputation in large transformer models
V Korthikanti, J Casper, S Lym, L McAfee, M Andersch, M Shoeybi, ...
arXiv preprint arXiv:2205.05198, 2022