The objective of IMPACT (Illinois Microarchitecture Project using Algorithms and Compiler Technology) is to provide critical research, architecture innovation, and algorithm and compiler prototypes for heterogeneous parallel architectures. We achieve portable performance and energy efficiency for emerging real-world applications by developing novel hardware, compiler, and algorithmic solutions.
 

 

Recent & Highlighted Items

SC20 Paper Named a Best Paper & Best Student Paper Finalist (September 4, 2020)


Mert Hidayetoglu's work, a follow-up to his internship at Argonne National Laboratory, has been nominated for both the Best Paper and Best Student Paper awards at SC20, part of the Supercomputing conference series. The work addresses iterative reconstruction for 3D X-ray tomography at unprecedented scale. Mert's code scales well up to 24,576 V100 GPUs on the Summit supercomputer and reconstructs an 11Kx11Kx9K multi-scale mouse brain image in under three minutes. The reconstruction reaches 65 PFLOPS of sustained single-precision throughput, which is 34% of Summit's theoretical peak performance.

mouse_brain.png

The technical highlight of Mert's paper is a hierarchical communication strategy that alleviates the communication bottleneck of distributed projection and backprojection computations: a few additional, very fast intra-node communications reduce the slow inter-node communication volume by 60%. After the APS upgrade, Mert's code will be used in production on Aurora (the world's first exascale computer), which has a multi-GPU node architecture.
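The idea of trading cheap intra-node traffic for expensive inter-node traffic can be illustrated with a toy reduction. The sketch below is not the paper's algorithm: it assumes a plain sum-reduction of equal-length partial vectors to one root GPU, and the names `make_partials`, `flat_reduce`, and `hierarchical_reduce` are hypothetical. It only counts how many elements cross node boundaries in each scheme, so it does not reproduce the 60% figure, which depends on the actual projection data layout.

```python
from typing import List, Tuple

Vector = List[float]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def make_partials(nodes: int, gpus: int, length: int) -> List[List[Vector]]:
    """Hypothetical test data: partials[node][gpu] is one GPU's partial result."""
    return [[[float(n * gpus + g + i) for i in range(length)]
             for g in range(gpus)]
            for n in range(nodes)]


def flat_reduce(partials: List[List[Vector]]) -> Tuple[Vector, int, int]:
    """Every GPU sends its partial vector straight to GPU 0 on node 0.

    Returns (result, inter_node_elements, intra_node_elements).
    """
    root = list(partials[0][0])
    inter = intra = 0
    for node, gpus in enumerate(partials):
        for gpu, vec in enumerate(gpus):
            if node == 0 and gpu == 0:
                continue  # the root keeps its own data
            if node == 0:
                intra += len(vec)  # same node as the root: fast link
            else:
                inter += len(vec)  # crosses the network: slow link
            root = add(root, vec)
    return root, inter, intra


def hierarchical_reduce(partials: List[List[Vector]]) -> Tuple[Vector, int, int]:
    """Sum inside each node first, then send one vector per node to the root."""
    inter = intra = 0
    node_sums = []
    for gpus in partials:
        local = list(gpus[0])
        for vec in gpus[1:]:
            intra += len(vec)  # extra, but fast, intra-node traffic
            local = add(local, vec)
        node_sums.append(local)
    root = node_sums[0]
    for vec in node_sums[1:]:
        inter += len(vec)  # only one vector per remote node crosses the network
        root = add(root, vec)
    return root, inter, intra


if __name__ == "__main__":
    data = make_partials(nodes=4, gpus=6, length=8)
    _, flat_inter, flat_intra = flat_reduce(data)
    _, hier_inter, hier_intra = hierarchical_reduce(data)
    print("flat:         inter =", flat_inter, "intra =", flat_intra)
    print("hierarchical: inter =", hier_inter, "intra =", hier_intra)
```

In this toy setting both schemes produce the same sum, while the hierarchical version moves more data over intra-node links and less over the network. How large the saving is in practice depends on how much of the data held by GPUs on one node actually overlaps.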


Petascale XCT: 3D Image Reconstruction with Hierarchical Communications on Multi-GPU Nodes (PDF)
  

IBM-Illinois Team Wins the MIT/Amazon/IEEE Sparse DNN Graph Challenge (August 26, 2020)

The team (Mert Hidayetoglu, Carl Pearson, Vikram Mailthody, Jinjun Xiong, Rakesh Nagi, and Wen-mei Hwu) of the IBM-Illinois Center for Cognitive Computing Systems Research (C3SR) won the MIT/Amazon/IEEE Sparse DNN Graph Challenge 2020. The team developed efficient GPU algorithms that use on-chip memory to save energy and time on unstructured data accesses in sparse computation. The proposed implementation reduces inference latency by an order of magnitude compared to the 2019 winner. Their paper includes performance benchmarking on 12 sparse deep neural network models of various sizes and demonstrates an at-scale sustained inference throughput of 180 TeraEdges/Second on the Summit supercomputer. Thanks to Eiman Ebrahimi of NVIDIA, the paper also includes the first performance benchmarking of the latest-generation Ampere A100 GPU in the literature. They will present their work at HPEC'20 in September.
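To make "unstructured data access" concrete, here is a minimal CPU sketch of one sparse layer applied to a single input vector. It is not the team's GPU kernel: the CSR weight layout, the ReLU activation, and the function name `csr_layer` are assumptions chosen for illustration. The gather `x[col_idx[k]]` is the irregular access pattern that a GPU implementation would want to serve from fast on-chip memory.

```python
from typing import List


def csr_layer(row_ptr: List[int], col_idx: List[int], values: List[float],
              x: List[float], bias: float = 0.0) -> List[float]:
    """Compute y = ReLU(W x + bias) for a weight matrix W stored in CSR form.

    row_ptr[r]..row_ptr[r+1] delimits the nonzeros of row r;
    col_idx[k] and values[k] give the column and weight of nonzero k.
    """
    y = []
    for r in range(len(row_ptr) - 1):
        acc = bias
        for k in range(row_ptr[r], row_ptr[r + 1]):
            # Irregular gather: which x entry is read depends on the sparsity
            # pattern, so consecutive reads may land far apart in memory.
            acc += values[k] * x[col_idx[k]]
        y.append(max(acc, 0.0))  # ReLU
    return y


if __name__ == "__main__":
    # A 3x4 weight matrix with four nonzeros; the middle row is empty.
    row_ptr = [0, 2, 2, 4]
    col_idx = [0, 3, 1, 2]
    values = [2.0, -1.0, 0.5, 1.0]
    x = [1.0, 2.0, 3.0, 4.0]
    print(csr_layer(row_ptr, col_idx, values, x))
```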


At-Scale Sparse Deep Neural Network Inference With Efficient GPU Implementation (PDF)

Paper Accepted: (June 29, 2020)

"Petascale XCT: 3D Image Reconstruction with Hierarchical Communications on Multi-GPU Nodes", Mert Hidayetoglu, Tekin Bicer, Simon Garcia de Gonzalo, Bin Ren, Vincent De Andrade, Doga Gursoy, Rajkumar Kettimuthu, Ian Foster, Wen-mei Hwu, in the Proceedings of the 2020 ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC20). (Best Paper & Best Student Paper Finalist) [more...]

Paper Published: (June 14, 2020)

"EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal In GPUs", Seung Won Min, Vikram Sharma Mailthody, Zaid Qureshi, Jinjun Xiong, Eiman Ebrahimi, Wen-mei Hwu, https://arxiv.org/pdf/2006.06890.pdf. [more...]

Paper Published: (May 18, 2020)

"Benanza: Automatic Benchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs", Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu, IPDPS. [more...]

  
(View Archive of Highlighted Items)