SeBS: A serverless benchmark suite for function-as-a-service computing M Copik, G Kwasniewski, M Besta, M Podstawski, T Hoefler Proceedings of the 22nd International Middleware Conference, 64-78, 2021 | 152 | 2021 |
Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0 O Fuhrer, T Chadha, T Hoefler, G Kwasniewski, X Lapillonne, D Leutwyler, ... Geoscientific Model Development 11 (4), 1665-1681, 2018 | 139 | 2018 |
Red-blue pebbling revisited: near optimal parallel matrix-matrix multiplication G Kwasniewski, M Kabić, M Besta, J VandeVondele, R Solcà, T Hoefler Proceedings of the International Conference for High Performance Computing …, 2019 | 105 | 2019 |
SISA: Set-centric instruction set architecture for graph mining on processing-in-memory systems M Besta, R Kanakagiri, G Kwasniewski, R Ausavarungnirun, J Beránek, ... MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture …, 2021 | 99 | 2021 |
Flexible communication avoiding matrix multiplication on FPGA with high-level synthesis J de Fine Licht, G Kwasniewski, T Hoefler Proceedings of the 2020 ACM/SIGDA International Symposium on Field …, 2020 | 68 | 2020 |
Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0, Geosci. Model Dev., 11, 1665–1681 O Fuhrer, T Chadha, T Hoefler, G Kwasniewski, X Lapillonne, D Leutwyler, ... | 44 | 2018 |
A PCIe congestion-aware performance model for densely populated accelerator servers M Martinasso, G Kwasniewski, SR Alam, TC Schulthess, T Hoefler SC'16: Proceedings of the International Conference for High Performance …, 2016 | 39 | 2016 |
Using compiler techniques to improve automatic performance modeling A Bhattacharyya, G Kwasniewski, T Hoefler 2015 International Conference on Parallel Architecture and Compilation (PACT …, 2015 | 37 | 2015 |
GraphMineSuite: Enabling high-performance and programmable graph mining algorithms with set algebra M Besta, Z Vonarburg-Shmaria, Y Schaffner, L Schwarz, G Kwasniewski, ... arXiv preprint arXiv:2103.03653, 2021 | 35 | 2021 |
Motif prediction with graph neural networks M Besta, R Grob, C Miglioli, N Bernold, G Kwasniewski, G Gjini, ... Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and …, 2022 | 34 | 2022 |
On the parallel I/O optimality of linear algebra kernels: Near-optimal matrix factorizations G Kwasniewski, M Kabić, T Ben-Nun, AN Ziogas, JE Saethre, A Gaillard, ... Proceedings of the International Conference for High Performance Computing …, 2021 | 22 | 2021 |
Pebbles, graphs, and a pinch of combinatorics: Towards tight I/O lower bounds for statically analyzable programs G Kwasniewski, T Ben-Nun, L Gianinazzi, A Calotoiu, T Schneider, ... Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and …, 2021 | 18 | 2021 |
Extreme scale plasma turbulence simulations on top supercomputers worldwide W Tang, B Wang, S Ethier, G Kwasniewski, T Hoefler, KZ Ibrahim, ... SC'16: Proceedings of the International Conference for High Performance …, 2016 | 16 | 2016 |
On the parallel I/O optimality of linear algebra kernels: Near-optimal LU factorization G Kwasniewski, T Ben-Nun, AN Ziogas, T Schneider, M Besta, T Hoefler Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of …, 2021 | 11 | 2021 |
ProbGraph: High-performance and high-accuracy graph mining with probabilistic set representations M Besta, C Miglioli, PS Labini, J Tětek, P Iff, R Kanakagiri, S Ashkboos, ... SC22: International Conference for High Performance Computing, Networking …, 2022 | 10 | 2022 |
Automatic complexity analysis of explicitly parallel programs T Hoefler, G Kwasniewski Proceedings of the 26th ACM symposium on Parallelism in algorithms and …, 2014 | 10 | 2014 |
Topologies of reasoning: Demystifying chains, trees, and graphs of thoughts M Besta, F Memedi, Z Zhang, R Gerstenberger, N Blach, P Nyczyk, ... arXiv preprint arXiv:2401.14295, 2024 | 9 | 2024 |
Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0. Geoscientific Model Development 11, 4 (2018), 1665–1681 O Fuhrer, T Chadha, T Hoefler, G Kwasniewski, X Lapillonne, D Leutwyler, ... | 7 | 2018 |
Automatic performance modeling of HPC applications F Wolf, C Bischof, A Calotoiu, T Hoefler, C Iwainsky, G Kwasniewski, ... Software for Exascale Computing-SPPEXA 2013-2015, 445-465, 2016 | 7 | 2016 |
High-Performance and Programmable Attentional Graph Neural Networks with Global Tensor Formulations M Besta, P Renc, R Gerstenberger, P Sylos Labini, A Ziogas, T Chen, ... Proceedings of the International Conference for High Performance Computing …, 2023 | 6 | 2023 |