Scaling distributed machine learning with the parameter server M Li, DG Andersen, JW Park, AJ Smola, A Ahmed, V Josifovski, J Long, ... 11th USENIX Symposium on operating systems design and implementation (OSDI …, 2014 | 2141 | 2014 |
Efficient, high-quality image contour detection B Catanzaro, BY Su, N Sundaram, Y Lee, M Murphy, K Keutzer 2009 IEEE 12th International Conference on Computer Vision, 2381-2388, 2009 | 182 | 2009 |
clSpMV: A cross-platform OpenCL SpMV framework on GPUs BY Su, K Keutzer Proceedings of the 26th ACM international conference on Supercomputing, 353-364, 2012 | 173 | 2012 |
Deep learning training in Facebook data centers: Design of scale-up and scale-out systems M Naumov, J Kim, D Mudigere, S Sridharan, X Wang, W Zhao, S Yilmaz, ... arXiv preprint arXiv:2003.09518, 2020 | 83 | 2020 |
Routability-driven analytical placement by net overlapping removal for large-scale mixed-size designs ZW Jiang, BY Su, YW Chang Proceedings of the 45th annual Design Automation Conference, 167-172, 2008 | 51 | 2008 |
Ubiquitous parallel computing from Berkeley, Illinois, and Stanford B Catanzaro, A Fox, K Keutzer, D Patterson, BY Su, M Snir, K Olukotun, ... IEEE Micro 30 (2), 41-55, 2010 | 49 | 2010 |
Parallelizing CAD: A timely research agenda for EDA B Catanzaro, K Keutzer, BY Su Proceedings of the 45th annual Design Automation Conference, 12-17, 2008 | 49 | 2008 |
Robust large-scale machine learning in the cloud S Rendle, D Fetterly, EJ Shekita, B Su Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge …, 2016 | 28 | 2016 |
Fault tolerant distributed key-value storage AJ Smola, A Ahmed, EJ Shekita, BY Su, M Li US Patent 9,569,517, 2017 | 22 | 2017 |
Understanding and improving failure tolerant training for deep learning recommendation with partial recovery K Maeng, S Bharuka, I Gao, M Jeffrey, V Saraph, BY Su, C Trippel, J Yang, ... Proceedings of Machine Learning and Systems 3, 637-651, 2021 | 20 | 2021 |
An exact jumper insertion algorithm for antenna effect avoidance/fixing BY Su, YW Chang Proceedings of the 42nd annual Design Automation Conference, 325-328, 2005 | 20 | 2005 |
Barrier synchronization pattern RK Karmani, N Chen, BY Su, A Shali, R Johnson Workshop on Parallel Programming Patterns (ParaPLOP), 2009 | 15 | 2009 |
An optimal jumper insertion algorithm for antenna avoidance/fixing on general routing trees with obstacles BY Su, YW Chang, J Hu Proceedings of the 2006 international symposium on Physical design, 56-63, 2006 | 15 | 2006 |
Collective communication patterns N Chen, RK Karmani, A Shali, BY Su, R Johnson Workshop on Parallel Programming Patterns (ParaPLOP), 2009 | 10 | 2009 |
Considerations when evaluating microprocessor platforms M Anderson, B Catanzaro, J Chong, E Gonina, K Keutzer, CY Lai, ... 3rd USENIX Workshop on Hot Topics in Parallelism (HotPar 11), 2011 | 9 | 2011 |
Parallel BFS graph traversal on images using structured grid BY Su, TG Brutch, K Keutzer 2010 IEEE International Conference on Image Processing, 4489-4492, 2010 | 9 | 2010 |
An optimal jumper-insertion algorithm for antenna avoidance/fixing BY Su, YW Chang IEEE Transactions on Computer-Aided Design of Integrated Circuits and …, 2007 | 9 | 2007 |
Hierarchical training: Scaling deep recommendation models on large CPU clusters Y Huang, X Wei, X Wang, J Yang, BY Su, S Bharuka, D Choudhary, ... Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data …, 2021 | 8 | 2021 |
Robust large-scale machine learning in the cloud S Rendle, DC Fetterly, EJ Shekita, BY Su US Patent 10,482,392, 2019 | 8 | 2019 |
Parallel application library for object recognition BY Su University of California, Berkeley, 2012 | 8 | 2012 |