Jing Pu
Title
Cited by
Year
EIE: Efficient inference engine on compressed deep neural network
S Han, X Liu, H Mao, J Pu, A Pedram, MA Horowitz, WJ Dally
ACM SIGARCH Computer Architecture News 44 (3), 243-254, 2016
Cited by 3076, year 2016
Tetris: Scalable and efficient neural network acceleration with 3D memory
M Gao, J Pu, X Yang, M Horowitz, C Kozyrakis
Proceedings of the Twenty-Second International Conference on Architectural …, 2017
Cited by 634, year 2017
Interstellar: Using Halide's scheduling language to analyze DNN accelerators
X Yang, M Gao, Q Liu, J Setter, J Pu, A Nayak, S Bell, K Cao, H Ha, ...
Proceedings of the Twenty-Fifth International Conference on Architectural …, 2020
Cited by 216, year 2020
Tangram: Optimized coarse-grained dataflow for scalable NN accelerators
M Gao, X Yang, J Pu, M Horowitz, C Kozyrakis
Proceedings of the Twenty-Fourth International Conference on Architectural …, 2019
Cited by 157, year 2019
Programming heterogeneous systems from an image processing DSL
J Pu, S Bell, X Yang, J Setter, S Richardson, J Ragan-Kelley, M Horowitz
ACM Transactions on Architecture and Code Optimization (TACO) 14 (3), 1-25, 2017
Cited by 154, year 2017
DNN dataflow choice is overrated
X Yang, M Gao, J Pu, A Nayak, Q Liu, SE Bell, JO Setter, K Cao, H Ha, ...
arXiv preprint arXiv:1809.04070, 2018
Cited by 101, year 2018
A systematic approach to blocking convolutional neural networks
X Yang, J Pu, BB Rister, N Bhagdikar, S Richardson, S Kvatinsky, ...
arXiv preprint arXiv:1606.04209, 2016
Cited by 77, year 2016
Deep compression and EIE: Efficient inference engine on compressed deep neural network.
S Han, X Liu, H Mao, J Pu, A Pedram, M Horowitz, B Dally
Hot Chips Symposium, 1-6, 2016
Cited by 57, year 2016
FPU generator for design space exploration
S Galal, O Shacham, JS Brunhaver II, J Pu, A Vassiliev, M Horowitz
2013 IEEE 21st Symposium on Computer Arithmetic, 25-34, 2013
Cited by 39, year 2013
A 220pJ/pixel/frame CMOS image sensor with partial settling readout architecture
S Ji, J Pu, BC Lim, M Horowitz
2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits), 1-2, 2016
Cited by 13, year 2016
FPMax: A 106GFLOPS/W at 217GFLOPS/mm2 single-precision FPU, and a 43.7 GFLOPS/W at 74.6 GFLOPS/mm2 double-precision FPU, in 28nm UTBB FDSOI
J Pu, S Galal, X Yang, O Shacham, M Horowitz
arXiv preprint arXiv:1606.07852, 2016
Cited by 11, year 2016
MDig: Multi-digit recognition using convolutional neural network on mobile
X Yang, J Pu
Proc. Yang2015 MDigMR, 1-10, 2015
Cited by 10, year 2015
Compiling algorithms for heterogeneous systems
S Bell, J Pu, J Hegarty, M Horowitz, M Martonosi
Morgan & Claypool Publishers, 2018
Cited by 7, year 2018
Performance Investigation on p-Type Si-, Ge-, and Ge–Si Core–Shell Nanowire Schottky Barrier Transistors
J Pu, L Sun, RQ Han
Japanese Journal of Applied Physics 50 (4S), 04DN10, 2011
Cited by 5, year 2011
Retrospective: EIE: Efficient inference engine on sparse and compressed neural network
S Han, X Liu, H Mao, J Pu, A Pedram, MA Horowitz, WJ Dally
arXiv preprint arXiv:2306.09552, 2023
Cited by 2, year 2023
Programming Heterogeneous Systems from an Image Processing Domain Specific Language
J Pu
Stanford University, 2017
Cited by 2, year 2017
Image Processing with Stencil Pipelines
S Bell, J Pu, J Hegarty, M Horowitz
Compiling Algorithms for Heterogeneous Systems, 27-31, 2018
Cited by 1, year 2018
Interstellar
X Yang, M Gao, Q Liu, J Setter, J Pu, A Nayak, S Bell, K Cao, H Ha, ...
Proceedings of the Twenty-Fifth International Conference on Architectural …, 2020
2020
Darkroom: A Stencil Language for Image Processing
S Bell, J Pu, J Hegarty, M Horowitz
Compiling Algorithms for Heterogeneous Systems, 33-50, 2018
2018
Interfacing with Specialized Hardware
S Bell, J Pu, J Hegarty, M Horowitz
Compiling Algorithms for Heterogeneous Systems, 69-80, 2018
2018