AMReX: a framework for block-structured adaptive mesh refinement W Zhang, A Almgren, V Beckner, J Bell, J Blaschke, C Chan, M Day, ... Journal of Open Source Software 4 (37), 1370-1370, 2019 | 389 | 2019 |
Accelerating Viola-Jones face detection to FPGA-level using GPUs D Hefenbrock, J Oberg, NTN Thanh, R Kastner, SB Baden 2010 18th IEEE Annual International Symposium on Field-Programmable Custom …, 2010 | 161 | 2010 |
BoxLib with tiling: An adaptive mesh refinement software framework W Zhang, A Almgren, M Day, T Nguyen, J Shalf, D Unat SIAM Journal on Scientific Computing 38 (5), S156-S172, 2016 | 68 | 2016 |
TiDA: High-level programming abstractions for data locality management D Unat, T Nguyen, W Zhang, MN Farooqi, B Bastem, G Michelogiannakis, ... International Conference on High Performance Computing, 116-135, 2016 | 43 | 2016 |
Bamboo--Translating MPI applications to a latency-tolerant, data-driven form T Nguyen, P Cicotti, E Bylaska, D Quinlan, SB Baden SC'12: Proceedings of the International Conference on High Performance …, 2012 | 41 | 2012 |
The performance and energy efficiency potential of FPGAs in scientific computing T Nguyen, S Williams, M Siracusa, C MacLean, D Doerfler, NJ Wright 2020 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High …, 2020 | 33 | 2020 |
FPGA‐based HPC accelerators: An evaluation on performance and energy efficiency T Nguyen, C MacLean, M Siracusa, D Doerfler, NJ Wright, S Williams Concurrency and Computation: Practice and Experience, e6570, 2021 | 29 | 2021 |
A software-based dynamic-warp scheduling approach for load-balancing the Viola–Jones face detection algorithm on GPUs T Nguyen, D Hefenbrock, J Oberg, R Kastner, S Baden Journal of Parallel and Distributed Computing 73 (5), 677-685, 2013 | 24 | 2013 |
Architectural Requirements for Deep Learning Workloads in HPC Environments KZ Ibrahim, T Nguyen, HA Nam, W Bhimji, S Farrell, L Oliker, M Rowan, ... 2021 International Workshop on Performance Modeling, Benchmarking and …, 2021 | 15 | 2021 |
Phase asynchronous AMR execution for productive and performant astrophysical flows MN Farooqi, T Nguyen, W Zhang, AS Almgren, J Shalf, D Unat SC18: International Conference for High Performance Computing, Networking …, 2018 | 10 | 2018 |
Perilla: Metadata-based optimizations of an asynchronous runtime for adaptive mesh refinement T Nguyen, D Unat, W Zhang, A Almgren, N Farooqi, J Shalf SC'16: Proceedings of the International Conference for High Performance …, 2016 | 8 | 2016 |
Automatic translation of MPI source into a latency-tolerant, data-driven form T Nguyen, P Cicotti, E Bylaska, D Quinlan, S Baden Journal of Parallel and Distributed Computing 106, 1-13, 2017 | 6 | 2017 |
Nonintrusive AMR asynchrony for communication optimization MN Farooqi, D Unat, T Nguyen, W Zhang, A Almgren, J Shalf European Conference on Parallel Processing, 682-694, 2017 | 5 | 2017 |
LU factorization: Towards hiding communication overheads with a lookahead-free algorithm T Nguyen, SB Baden 2015 IEEE International Conference on Cluster Computing, 394-397, 2015 | 5 | 2015 |
Preliminary scaling results on multiple hybrid nodes of Knights Corner and Sandy Bridge processors T Nguyen, SB Baden Third International Workshop on Domain-Specific Languages and High-Level …, 2013 | 5 | 2013 |
Experiences Porting the SU3_Bench Microbenchmark to the Intel Arria 10 and Xilinx Alveo U280 FPGAs D Doerfler, F Fatollahi-Fard, C MacLean, T Nguyen, S Williams, N Wright, ... International Workshop on OpenCL, 1-9, 2021 | 3 | 2021 |
Asynchronous AMR on Multi-GPUs MN Farooqi, T Nguyen, W Zhang, AS Almgren, J Shalf, D Unat International Conference on High Performance Computing, 113-123, 2019 | 2 | 2019 |
AMReX W Zhang, A Myers, A Almgren, V Beckner, M Zingale, M Katz, K Gott, ... Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States …, 2017 | 2 | 2017 |
Hardware Evaluation Analytical Modeling and Node Simulation: Benefits of Tighter GPU Integration B Austin, R Bair, K Barker, A Cabrera, A Chien, N Ding, J Firoz, K Ibrahim, ... Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States), 2021 | 1 | 2021 |
Facilitating CoDesign with Automatic Code Similarity Learning T Nguyen, E Strohmaier, J Shalf 2021 IEEE/ACM 7th Workshop on the LLVM Compiler Infrastructure in HPC (LLVM …, 2021 | | 2021 |