Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0 O Fuhrer, T Chadha, T Hoefler, G Kwasniewski, X Lapillonne, D Leutwyler, ... Geoscientific Model Development 11 (4), 1665-1681, 2018 | 98 | 2018 |
Red-blue pebbling revisited: near optimal parallel matrix-matrix multiplication G Kwasniewski, M Kabić, M Besta, J VandeVondele, R Solcà, T Hoefler Proceedings of the International Conference for High Performance Computing …, 2019 | 54 | 2019 |
Using compiler techniques to improve automatic performance modeling A Bhattacharyya, G Kwasniewski, T Hoefler 2015 International Conference on Parallel Architecture and Compilation (PACT …, 2015 | 31 | 2015 |
Flexible communication avoiding matrix multiplication on FPGA with high-level synthesis J de Fine Licht, G Kwasniewski, T Hoefler Proceedings of the 2020 ACM/SIGDA International Symposium on Field …, 2020 | 28 | 2020 |
SeBS: A serverless benchmark suite for function-as-a-service computing M Copik, G Kwasniewski, M Besta, M Podstawski, T Hoefler Proceedings of the 22nd International Middleware Conference, 64-78, 2021 | 27 | 2021 |
SISA: Set-centric instruction set architecture for graph mining on processing-in-memory systems M Besta, R Kanakagiri, G Kwasniewski, R Ausavarungnirun, J Beránek, ... MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture …, 2021 | 27 | 2021 |
A PCIe congestion-aware performance model for densely populated accelerator servers M Martinasso, G Kwasniewski, SR Alam, TC Schulthess, T Hoefler SC'16: Proceedings of the International Conference for High Performance …, 2016 | 25 | 2016 |
Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0, Geosci. Model Dev., 11, 1665–1681 O Fuhrer, T Chadha, T Hoefler, G Kwasniewski, X Lapillonne, D Leutwyler, ... gmd-11-1665-2018, 2018 | 22 | 2018 |
Extreme scale plasma turbulence simulations on top supercomputers worldwide W Tang, B Wang, S Ethier, G Kwasniewski, T Hoefler, KZ Ibrahim, ... SC'16: Proceedings of the International Conference for High Performance …, 2016 | 13 | 2016 |
GraphMineSuite: Enabling high-performance and programmable graph mining algorithms with set algebra M Besta, Z Vonarburg-Shmaria, Y Schaffner, L Schwarz, G Kwasniewski, ... arXiv preprint arXiv:2103.03653, 2021 | 9 | 2021 |
Automatic complexity analysis of explicitly parallel programs T Hoefler, G Kwasniewski Proceedings of the 26th ACM symposium on Parallelism in algorithms and …, 2014 | 9 | 2014 |
Automatic performance modeling of HPC applications F Wolf, C Bischof, A Calotoiu, T Hoefler, C Iwainsky, G Kwasniewski, ... Software for Exascale Computing-SPPEXA 2013-2015, 445-465, 2016 | 6 | 2016 |
On the parallel I/O optimality of linear algebra kernels: near-optimal matrix factorizations G Kwasniewski, M Kabić, T Ben-Nun, AN Ziogas, JE Saethre, A Gaillard, ... Proceedings of the International Conference for High Performance Computing …, 2021 | 5 | 2021 |
Pebbles, graphs, and a pinch of combinatorics: Towards tight I/O lower bounds for statically analyzable programs G Kwasniewski, T Ben-Nun, L Gianinazzi, A Calotoiu, T Schneider, ... Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and …, 2021 | 5 | 2021 |
Motif prediction with graph neural networks M Besta, R Grob, C Miglioli, N Bernold, G Kwasniewski, G Gjini, ... arXiv preprint arXiv:2106.00761, 2021 | 5 | 2021 |
On the parallel I/O optimality of linear algebra kernels: near-optimal LU factorization G Kwasniewski, T Ben-Nun, AN Ziogas, T Schneider, M Besta, T Hoefler Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021 | 4 | 2021 |
A scalable weakly-synchronous algorithm for solving partial differential equations K Aditya, T Gysi, G Kwasniewski, T Hoefler, DA Donzis, JH Chen arXiv preprint arXiv:1911.05769, 2019 | 2 | 2019 |
Lifting C Semantics for Dataflow Optimization A Calotoiu, T Ben-Nun, G Kwasniewski, JF Licht, T Schneider, P Schaad, ... arXiv preprint arXiv:2112.11879, 2021 | 1 | 2021 |
Deinsum: Practically I/O Optimal Multilinear Algebra AN Ziogas, G Kwasniewski, T Ben-Nun, T Schneider, T Hoefler arXiv preprint arXiv:2206.08301, 2022 | | 2022 |