The objective of IMPACT (Illinois Microarchitecture Project using Algorithms and Compiler Technology) is to provide critical research, architecture innovation, and algorithm and compiler prototypes for heterogeneous parallel architectures. We achieve portable performance and energy efficiency for emerging real-world applications by developing novel hardware, compiler, and algorithmic solutions.


Recent & Highlighted Items

Paper Published: (January 20, 2021)

"PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses", Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoglu, Jinjun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu, [more...]

Paper Published: (November 16, 2020)

"Petascale XCT: 3D Image Reconstruction with Hierarchical Communications on Multi-GPU Nodes", Mert Hidayetoglu, Tekin Bicer, Simon Garcia de Gonzalo, Bin Ren, Vincent De Andrade, Doga Gursoy, Rajkumar Kettimuthu, Ian Foster, Wen-mei Hwu, International Conference for High Performance Computing, Networking, Storage and Analysis (SC20). (Best Paper Award) [more...]

Paper Published: (October 26, 2020)

"EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal in GPUs", Seung Won Min, Vikram Sharma Mailthody, Zaid Qureshi, Jinjun Xiong, Eiman Ebrahimi, Wen-mei Hwu, PVLDB, Volume 14, No. 2, October 2020. [more...]

SC20 Paper Named a Best Paper & Best Student Paper Finalist (September 4, 2020)

Mert Hidayetoglu's follow-up work to his internship at Argonne National Laboratory has been nominated for the best paper and best student paper awards at SC20, part of the Supercomputing conference series. His work is on iterative reconstruction of 3D X-ray tomography at unprecedented scale. Mert's code scales well up to 24,576 V100 GPUs on the Summit supercomputer and reconstructs an 11Kx11Kx9K multi-scale mouse brain image in under three minutes. The reconstruction reaches 65 PFLOPS of sustained single-precision throughput, which is 34% of Summit's theoretical peak performance.


The technical highlight of Mert's paper is a hierarchical communication strategy that alleviates the communication bottleneck of distributed projection and backprojection computations: a few additional, very fast intra-node communications reduce the slow inter-node communication volume by 60%. After the APS upgrade, Mert's code will be used in production on Aurora, the world's first exascale computer, which has a multi-GPU node architecture.

Petascale XCT: 3D Image Reconstruction with Hierarchical Communications on Multi-GPU Nodes (PDF)

IBM-Illinois Team Wins the MIT/Amazon/IEEE Sparse DNN Graph Challenge (August 26, 2020)

The team (Mert Hidayetoglu, Carl Pearson, Vikram Mailthody, Jinjun Xiong, Rakesh Nagi, and Wen-mei Hwu) of the IBM-Illinois Center for Cognitive Computing Systems Research (C3SR) won the MIT/Amazon/IEEE Sparse DNN Graph Challenge 2020. The team developed efficient GPU algorithms that use on-chip memory to save energy and time on unstructured data accesses in sparse computation. The proposed implementation reduces inference latency by an order of magnitude compared to the 2019 winner. Their paper includes performance benchmarking on 12 sparse deep neural network models of various sizes and demonstrates an at-scale sustained inference throughput of 180 TeraEdges/Second on the Summit supercomputer. Thanks to Eiman Ebrahimi of NVIDIA, the paper also includes the first performance benchmarking of the latest-generation Ampere A100 GPU in the literature. They will present their work at HPEC'20 in September.

At-Scale Sparse Deep Neural Network Inference With Efficient GPU Implementation (PDF)
(View Archive of Highlighted Items)