Publications

2021

	"Large Graph Convolutional Network Training with GPU-Oriented Data Communication Architecture", Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoglu, Jinjun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu, Proceedings of the VLDB Endowment, Vol. 14, No. 11. [more...]

	"PyTorch-Direct: Enabling GPU Centric Data Access for Very Large Graph Neural Network Training with Irregular Accesses", Seung Won Min, Kun Wu, Sitao Huang, Mert Hidayetoglu, Jinjun Xiong, Eiman Ebrahimi, Deming Chen, Wen-mei Hwu, https://arxiv.org/abs/2101.07956. [more...]

2020

	"Petascale XCT: 3D Image Reconstruction with Hierarchical Communications on Multi-GPU Nodes", Mert Hidayetoglu, Tekin Bicer, Simon Garcia de Gonzalo, Bin Ren, Vincent De Andrade, Doga Gursoy, Rajkumar Kettimuthu, Ian Foster, Wen-mei Hwu, International Conference for High Performance Computing, Networking, Storage and Analysis (SC20). (Best Paper Award) [more...]

	"EMOGI: Efficient Memory-access for Out-of-memory Graph-traversal in GPUs", Seung Won Min, Vikram Sharma Mailthody, Zaid Qureshi, Jinjun Xiong, Eiman Ebrahimi, Wen-mei Hwu, PVLDB, Volume 14, No. 2, October 2020. [more...]

	"At-Scale Sparse Deep Neural Network Inference with Efficient GPU Implementation", Mert Hidayetoglu, Carl Pearson, Vikram Sharma Mailthody, Eiman Ebrahimi, Jinjun Xiong, Rakesh Nagi, Wen-mei Hwu, 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020. (Graph Challenge Champion) [more...]

	"Profiling and Characterization of Deep Learning Model Inference on CPU", Yanli Qian, Master's Thesis. [more...]

	"Node-Aware Stencil Communication on Heterogeneous Supercomputers", Carl Pearson, Mert Hidayetoglu, Mohamad Almasri, Omer Anjum, I-Hsin Chung, Jinjun Xiong, Wen-mei Hwu, 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2020.. [more...]

	"Benanza: Automatic Benchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs", Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu, IPDPS. [more...]

	"The Design and Implementation of a Scalable DL Benchmarking Platform", Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu, CLOUD 2020. [more...]

	"DLBricks: Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs", Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu, ICPE. [more...]

	"DLSpec: A Deep Learning Task Exchange Specification", Abdul Dakkak, Cheng Li, Jinjun Xiong, Wen-mei Hwu, USENIX OpML 20. [more...]

	"XSP: Across-Stack Profiling and Analysis of Machine Learning Models on GPUs", Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu, IPDPS. (Best Paper) [more...]

	"The Design and Implementation of the Wolfram Language Compiler", Abdul Dakkak, Wen-mei Hwu, In the proceedings of the 2020 IEEE International Symposium on Code Generation and Optimization (CGO20). [more...]

	"MLModelScope: A Distributed Platform for Model Evaluation and Benchmarking at Scale", Abdul Dakkak, Cheng Li, Jinjun Xiong, Wen-mei Hwu, https://arxiv.org/abs/2002.08295. [more...]

2019

	"DLBricks: Composable Benchmark Generation to Reduce Deep Learning Benchmarking Effort on CPUs", Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu, https://arxiv.org/abs/1911.07967. [more...]

	"Benanza: Automatic Benchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs", Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu, https://arxiv.org/abs/1911.06922. [more...]

	"MemXCT: Memory-Centric X-Ray CT Reconstruction with Massive Parallelization", Mert Hidayetoglu, Tekin Bicer, Simon Garcia de Gonzalo, Bin Ren, Doga Gursoy, Rajkumar Kettimuthu, Ian Foster, Wen-mei Hwu, In the proceedings of the 2019 ACM International Conference for High Performance Computing, Networking, Storage and Analysis. ACM (SC19). (SC20 Perproducibility Challenge Benchmark) [more...]

	"PaRe: A Paper-Reviewer Matching Approach Using a Common Topic Space", Omer Anjum, Hongyu Gong, Suma Bhat, Wen-mei Hwu, Jinjun Xiong, a. [more...]

	"DeepStore: In-Storage Acceleration for Intelligent Queries", Vikram Sharma Mailthody, Zaid Qureshi, Weixin Liang, Ziyan Feng, Simon Garcia de Gonzalo, Youjie Li, Hubertus Franke, Jinjun Xiong, Jian Huang, Wen-mei Hwu, In the Proceedings of the 52 Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'19), Columbus, OH, USA. [more...]

	"Update on Triangle Counting on GPU", Carl Pearson, Mohamad Almasri, Omer Anjum, Vikram Sharma Mailthody, Zaid Qureshi, Rakesh Nagi, Jinjun Xiong, Wen-mei Hwu, 2019 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2019, pp. 1-7, doi: 10.1109/HPEC.2019.8916547.. [more...]

	"Analysis and Optimization of I/O Cache Coherency Strategies for SoC-FPGA Device", Seung Won Min, Sitao Huang, Mohamed Aly, Jinjun Xiong, Deming Chen, Wen-mei Hwu, 29th International Conference on Field-Programmable Logic and Applications (FPL 2019). [more...]

	"Across-Stack Profiling and Characterization of Machine Learning Models on GPUs", Cheng Li, Abdul Dakkak, Jinjun Xiong, https://arxiv.org/abs/1908.06869. [more...]

	"An Efficient GPU Implementation for Higher-Order 3D Stencils", Omer Anjum, Simon Garcia de Gonzalo, Mert Hidayetoglu, Wen-mei Hwu, In the 21st IEEE International Conference on High Performance Computing and Communications (HPCC19). (HPCC Best Paper Award) [more...]

	"TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function-as-a-Service", Abdul Dakkak, Cheng Li, Simon Garcia de Gonzalo, Jinjun Xiong, Wen-mei Hwu, IEEE International Conference on Cloud Computing, July 8-13, 2019, Milan, Italy. [more...]

	"Accelerating Reduction and Scan Using Tensor Core Units", Abdul Dakkak, Cheng Li, Jinjun Xiong, Wen-mei Hwu, ICS 2019: International Conference on Supercomputing, June 26-28, Phoenix AZ. [more...]

	"Frustrated with Replicating Claims of a Shared Model? A Solution", Abdul Dakkak, Cheng Li, Jinjun Xiong, https://arxiv.org/abs/1811.09737. [more...]

	"Benchmarking and Understanding ML Inference", Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen-mei Hwu, https://arxiv.org/abs/1904.12437. [more...]

	"PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference", Aayush Ankit, Izzat El Hajj, Sai Rahul Chalamalasetti, Geoffrey Ndu, Martin Foltin, R. Stanley Williams, Paolo Faraboschi, Wen-mei Hwu, John Paul Strachan, Kaushik Roy, Dejan Milojicic, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2019.. [more...]

	"FlatFlash: Exploiting the Byte-Accessibility of SSDs within A Unified Memory-Storage Hierarchy", Ahmed Abulila, Vikram Sharma Mailthody, Zaid Qureshi, Jian Huang, Nam Sung Kim, Jinjun Xiong, Wen-mei Hwu, In the 24th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'19), April 13th April 17th, Providence, RI, USA . [more...]

	"Evaluating Characteristics of CUDA Communication Primitives on High-Bandwidth Interconnects", Carl Pearson, Abdul Dakkak, Cheng Li, Jinjun Xiong, Wen-mei Hwu, Pearson, Carl, et al. "Evaluating Characteristics of CUDA Communication Primitives on High-Bandwidth Interconnects." Proceedings of the 10th ACM/SPEC International Conference on Performance Engineering. ACM, 2019.. (ICPE Best Paper Award) [more...]

	"Collaborative Computing on Heterogeneous CPU-FPGA Architectures Using OpenCL", Sitao Huang, Li-Wen Chang, Izzat El Hajj, Simon Garcia de Gonzalo, Juan Gómez-Luna, Sai Rahul Chalamalasetti, Mohamed Aly, Dejan Milojicic, Onur Mutlu, Deming Chen, Wen-mei Hwu, Proceedings of the 10th ACM/SPEC International Conference on Performance Engineering. ACM, 2019.. [more...]

	"Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs", Simon Garcia de Gonzalo, Sitao Huang, Juan Gómez-Luna, Simon D. Hammond, Onur Mutlu, Wen-mei Hwu, In the proceedings of the 2019 IEEE International Symposium on Code Generation and Optimization (CGO19). [more...]

2018

	"TrIMS: Transparent and Isolated Model Sharing for LowLatency Deep Learning Inference in Function as aService Environments", Abdul Dakkak, Cheng Li, Simon Garcia de Gonzalo, Jinjun Xiong, Wen-mei Hwu, Systems for ML at NIPS 2018. [more...]

	"Accelerating Reduction and Scan Using Tensor Core Units", Abdul Dakkak, Cheng Li, Jinjun Xiong, Wen-mei Hwu, CoRR, abs/1811.09736.. [more...]

	"MLModelScope: Evaluate and Measure ML Models within AI Pipelines", Abdul Dakkak, Cheng Li, arXiv preprint arXiv:1811.09737 (2018).. [more...]

	"Accelerator Architectures A Ten-Year Retrospective", Wen-mei Hwu, Sanjay J. Patel, in IEEE Micro, vol. 38, no. 6, pp. 56-62, 1 Nov.-Dec. 2018.. [more...]

	"Application-Transparent Near-Memory Processing Architecture with Memory Channel Network", Mohammad Alian, Seung Won Min, Hadi Asgharimoghaddam, Ashutosh Dhar, Dong Kai Wang, Thomas Roewer, Adam McPadden, Oliver O'Halloran, Deming Chen, Jinjun Xiong, Daehoon Kim, Wen-mei Hwu, Nam Sung Kim, In Proceedings of 51st Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2018, Fukuoka, Japan, October 20-24, 2018. [more...]

	"Triangle Counting and Truss Decomposition using FPGA", Sitao Huang, Mohamed Aly, Cong Hao, Qin li, Vikram Sharma Mailthody, Ketan Date, Jinjun Xiong, Deming Chen, Rakesh Nagi, Wen-mei Hwu, In Proceedings of the 2018 IEEE High Performance extreme Computing Conference (HPEC). [more...]

	"Collaborative (CPU + GPU) Algorithms for Triangle Counting and Truss Decomposition", Vikram Sharma Mailthody, Ketan Date, Zaid Qureshi, Carl Pearson, Rakesh Nagi, Jinjun Xiong, Wen-mei Hwu, In Proceedings of the 2018 IEEE High Performance extreme Computing Conference (HPEC). [more...]

	"Semi-Coherent DMA: An Alternative I/O Coherency Management for Embedded Systems", Seung Won Min, Mohammad Alian, Nam Sung Kim, Wen-mei Hwu, in IEEE Computer Architecture Letters, vol. 17, no. 2, pp. 221-224, 1 July-Dec. 2018. . [more...]

	"Heterogeneous Application and System Modeling", Carl Pearson, MS Thesis. [more...]

	"A Fast and Massively-Parallel Solver for Nonlinear Tomographic Image Reconstruction", Mert Hidayetoglu, Carl Pearson, Izzat El Hajj, Levent Gurel, Weng Cho Chew, Wen-mei Hwu, In Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2018. [more...]

	"NUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems", Carl Pearson, Zehra Sura, Wen-mei Hwu, International Conference on High Performance Computing. Springer, Cham, 2018.. [more...]

2017

	"Revisiting Online Autotuning for Sparse-Matrix Vector Multiplication Kernels on Next-Generation Architectures", Simon Garcia de Gonzalo, Simon D. Hammond, Christian R. Trott, Wen-mei Hwu, In the 19th IEEE International Conference on High Performance Computing and Communications (HPCC17). [more...]

	"Scaling Analysis of a Hierarchical Parallelization of Large Inverse Multiple-Scattering Solutions", Mert Hidayetoglu, Carl Pearson, Izzat El Hajj, Weng Cho Chew, Levent Gurel, Wen-mei Hwu, Poster at the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2017.. [more...]

	"Rebooting the Data Access Hierarchy of Computing Systems", Wen-mei Hwu, Izzat El Hajj, Simon Garcia de Gonzalo, Carl Pearson, Nam Sung Kim, Deming Chen, Jinjun Xiong, Zehra Sura, In IEEE International Conference on Rebooting Computing (ICRC), 2017.. [more...]

	"Generalize or Die: Operating Systems Support for Memristor-based Accelerators", Pedro Bruel, Sai Rahul Chalamalasetti, Chris Dalton, Izzat El Hajj, Alfredo Goldman, Catherine Graves, Wen-mei Hwu, Phil Laplante, Dejan Milojicic, Geoffrey Ndu, John Paul Strachan, In IEEE International Conference on Rebooting Computing (ICRC), 2017. [more...]

	"SAVI Objects: Sharing and Virtuality Incorporated", Izzat El Hajj, Thomas B. Jablin, Dejan Milojicic, Wen-mei Hwu, In Proceedings of the 2017 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications. (OOPSLA'17).. [more...]

	"Comparative Performance Evaluation of Multi-GPU MLFMM Implementation for 2-D VIE Problems", Carl Pearson, Mert Hidayetoglu, Wen-mei Hwu, Computing and Electromagnetics International Workshop 2017. [more...]

	"Fast DBIM Solutions on Supercomputers with Frequency-Hopping for Imaging of Large and High-Contrast Objects", Mert Hidayetoglu, Anthony Podkowa, Michael Oelze, Wen-mei Hwu, Weng Cho Chew, Progress in Electromagnetics Research Symposium (PIERS 2017), St. Petersburg, Russia. [more...]

	"RAI: A Scalable Project Submission System for Parallel Programming Courses", Abdul Dakkak, Carl Pearson, Cheng Li, Parallel and Distributed Processing Symposium Workshops, 2017 IEEE International.. [more...]

	"Chai: Collaborative Heterogeneous Applications for Integrated-architectures", Juan Gómez-Luna, Izzat El Hajj, Li-Wen Chang, Victor Garcia-Flores, Simon Garcia de Gonzalo, Thomas B. Jablin, Antonio J. Peña, Wen-mei Hwu, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2017. [more...]

	"Collaborative Computing for Heterogeneous Integrated Systems", Li-Wen Chang, Juan Gómez-Luna, Izzat El Hajj, Sitao Huang, Deming Chen, Wen-mei Hwu, ACM/SPEC International Conference on Performance Engineering (ICPE), 2017 . [more...]

	"Large Inverse-Scattering Solutions with DBIM on GPU-Enabled Supercomputers", Mert Hidayetoglu, Carl Pearson, Weng Cho Chew, Levent Gurel, Wen-mei Hwu, Applied and Computational Electromagnetics Symposium (ACES 2017), Florence, Italy. For the speacial session: Big Data Aspects. [more...]

	"Hardware Acceleration of the Pair-HMM Algorithm for DNA Variant Calling", Sitao Huang, Gowthami Jayashri Manikandan, Anand Ramachandran, Kyle Rupnow, Wen-mei Hwu, Deming Chen, International Symposium on Field-Programmable Gate Arrays (ISFPGA), 2017. [more...]

	"Incorporating Multiple Scattering in Imaging with Iterative Born Methods", Mert Hidayetoglu, Anthony Podkowa, Michael Oelze, Levent Gurel, Wen-mei Hwu, Weng Cho Chew, URSI National Radio Science Meeting (URSI-NSRM 2017), Boulder, CO. [more...]

2016

	"KLAP: Kernel Launch Aggregation and Promotion for Optimizing Dynamic Parallelism", Izzat El Hajj, Juan Gómez-Luna, Cheng Li, Li-Wen Chang, Dejan Milojicic, Wen-mei Hwu, Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016. [more...]

	"Efficient Kernel Synthesis for Performance Portable Programming", Li-Wen Chang, Izzat El Hajj, Christopher I. Rodrigues, Juan Gómez-Luna, Wen-mei Hwu, Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016. [more...]

	"Parallel Solutions of Inverse Multiple Scattering Problems with Born-Type Fast Solvers", Mert Hidayetoglu, Anthony Podkowa, Michael Oelze, Wen-mei Hwu, Weng Cho Chew, Progress in Electromagnetics Research Symposium (PIERS 2016), Shanghai, China. [more...]

	"WebGPU: A Scalable Online Development Platform for GPU Programming Courses", Abdul Dakkak, Carl Pearson, Wen-mei Hwu, Proceedings of the 6th NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar-16), 2016. [more...]

	"Parallel Merge for Many-Core Architectures", Jie Lv, Diss. 2016.. [more...]

	"DySel: Lightweight Dynamic Selection for Kernel-based Data-parallel Programming Model", Li-Wen Chang, Hee-Seok Kim, Wen-mei Hwu, Proceedings of the 21th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '16) . [more...]

	"SpaceJMP: Programming with Multiple Virtual Address Spaces", Izzat El Hajj, Alexander Merritt, Gerd Zellweger, Dejan Milojicic, Reto Achermann, Paolo Faraboschi, Wen-mei Hwu, Timothy Roscoe, Karsten Schwan, Proceedings of the 21th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '16). (HiPEAC Paper Award) [more...]

	"A Programming System for Future Proofing Performance Critical Libraries", Li-Wen Chang, Izzat El Hajj, Hee-Seok Kim, Juan Gómez-Luna, Abdul Dakkak, Wen-mei Hwu, Proceedings of the 21th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 2016) [poster]. [more...]

2015

	"Enhancing the Usability and Utilization of Accelerated Architectures via Docker", Nicholas Haydel, Sandra Gesing, Ian Taylor, Gregory Madey, Abdul Dakkak, Simon Garcia de Gonzalo, Wen-mei Hwu, In the 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing (UCC15). [more...]

	"Compiler and Runtime Techniques for Bulk-Synchronous Programming Models on CPU Architectures", Hee-Seok Kim, University of Illinois Doctoral Disertation, November 2015. [more...]

	"In-Place Data Sliding Algorithms for Many-Core Architectures", Juan Gómez-Luna, Li-Wen Chang, Wen-mei Hwu, I-Jui Sung, Nicolás Guil, Parallel Processing, 2015 44th International Conference on (ICPP 2015) . [more...]

	"Transitioning HPC software to exascale heterogeneous computing", Wen-mei Hwu, Li-Wen Chang, Hee-Seok Kim, Abdul Dakkak, Izzat El Hajj, Computational Electromagnetics International Workshop (CEM), 2015. [more...]

	"Automatic Parallelization of Kernels in Shared-Memory Multi-GPU Nodes", Javier Cabezas, Lluis Vilanova, Isaac Gelado, Thomas B. Jablin, Nacho Navarro, Wen-mei Hwu, Proceedings of the 29th ACM on International Conference on Supercomputing (ICS '15). [more...]

	"In-Place Matrix Transposition on GPUs", Juan Gómez-Luna, I-Jui Sung, Li-Wen Chang, José María González-Linares, Nicolás Guil, Wen-mei Hwu, Parallel and Distributed Systems, IEEE Transactions on, Volume 27, Issue 3, Mar. 2015, p.p 776 - 788. [more...]

	"Locality-Centric Thread Scheduling for Bulk-synchronous Programming Models on CPU Architectures", Hee-Seok Kim, Izzat El Hajj, John A. Stratton, Steve S. Lumetta, Wen-mei Hwu, International Symposium on Code Generation and Optimization (CGO). (Best Paper Award Nominee) [more...]

	"GPU-SM: Shared Memory Multi-GPU Programming", Javier Cabezas, Marc Jordà, Isaac Gelado, Nacho Navarro, Wen-mei Hwu, GPGPU 8. [more...]

	"Tangram: a High-level Language for Performance Portable Code Synthesis", Li-Wen Chang, Abdul Dakkak, Christopher I. Rodrigues, Wen-mei Hwu, the Eighth Workshop on Programmability Issues for Heterogeneous Multicores (MULTIPROG-2015). [more...]

2014

	"Adaptive Cache Management for Energy-efficient GPU Computing", Xuhao Chen, Li-Wen Chang, Christopher I. Rodrigues, Jie Lv, Zhiying Wang, Wen-mei Hwu, Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, December 2014. [more...]

	"Supporting High-Level, High-Performance Parallel Programming with Library-Driven Optimization", Christopher I. Rodrigues, University of Illinois Doctoral Disertation, May 2014. [more...]

	"Automatic execution of single-GPU computations across multiple GPUs", Javier Cabezas, Lluis Vilanova, Isaac Gelado, Thomas B. Jablin, Nacho Navarro, Wen-mei Hwu, Proceedings of the 23rd international conference on Parallel architectures and compilation (PACT '14). [more...]

	"High Performance Histogramming on Massively Parallel Processors", Greg Ross, University of Illinois Masters Disertation, August 2014. [more...]

	"Scalable Parallel Tridiagonal Algorithms with Diagonal Pivoting and Their Optimization for Many-core Architectures", Li-Wen Chang, University of Illinois Master Thesis, July 2014 . [more...]

	"Adaptive Cache Bypass and Insertion for Many-core Accelerators", Xuhao Chen, Shengzhao Wu, Li-Wen Chang, Wei-Sheng Huang, Carl Pearson, Wen-mei Hwu, Proceedings of International Workshop on Manycore Embedded Systems, 2014. [more...]

	"A guide for implementing tridiagonal solvers on GPUs", Li-Wen Chang, Wen-mei Hwu, Numerical Computations with GPUs. [more...]

	"Dynamic Loop Vectorization for Executing OpenCL Kernels on CPUs", Izzat El Hajj, University of Illinois Masters Thesis, May 2014. [more...]

	"Runtime and Architecture Support for Efficient Data Exchange in Multi-Accelerator Applications", Javier Cabezas, Isaac Gelado, John E. Stone, Nacho Navarro, David Kirk, Wen-mei Hwu, IEEE Transactions on Parallel and Distributed Systems, Issue:99. [more...]

	"Triolet: A Programming System that Unifies Algorithmic Skeleton Interfaces for High-Performance Cluster Computing", Christopher I. Rodrigues, Thomas B. Jablin, Abdul Dakkak, Wen-mei Hwu, Proceedings of the 2014 ACM SIGPLAN Conference on Principles and Practice of Parallel Programing, February 2014. [more...]

	"Multi-tier Dynamic Vectorization for Translating GPU Optimizations into CPU Performance", Hee-Seok Kim, Izzat El Hajj, John A. Stratton, Wen-mei Hwu, IMPACT Technical Report, IMPACT-14-01, University of Illinois at Urbana-Champaign, Center for Reliable and High-Performance Computing, February 5, 2014. [more...]

	"In-place transposition of rectangular matrices on accelerators", I-Jui Sung, Juan Gómez-Luna, José María González-Linares, Nicolás Guil, Wen-mei Hwu, PPoPP '14 Proceedings of the 19th ACM SIGPLAN symposium on Principles and practice of parallel programming. [more...]

	"BLESS: Bloom filter-based error correction solution for high-throughput sequencing reads", Yun Heo, Xiao-Long Wu, Deming Chen, Jian Ma, Wen-mei Hwu, Baastian Aarts, Advance Access Publication . [more...]

	"High-Performance Parallel Programming Framework Using Template-Based Static Optimization", Shengzhao Wu, MS Thesis. University of Illinois at Urbana-Champaign, 2014.. [more...]

2013

	"Mapping Tridiagonal Solvers to Linear Recurrences", Li-Wen Chang, Wen-mei Hwu, IMPACT Technical Report, IMPACT-13-01, University of Illinois at Urbana-Champaign, Center for Reliable and High-Performance Computing, Sept. 8, 2013. [more...]

	"Throughput-Oriented Kernel Porting onto FPGAs", Alexandros Papakonstantinou, Deming Chen, Wen-mei Hwu, Jason Cong, Yun Liang, Proceedings of the 50th Annual Design Automation Conference, May 2013. [more...]

	"Performance Portability of Parallel Kernels on Shared-Memory Systems", John A. Stratton, University of Illinois at Urbana-Champaign, 2013.. [more...]

	"Data Layout Transformation Through In-Place Transposition", I-Jui Sung, Diss. University of Illinois at Urbana-Champaign, 2013.. [more...]

	"Performance Portability in Accelerated Parallel Kernels", John A. Stratton, Hee-Seok Kim, Thomas B. Jablin, Wen-mei Hwu, IMPACT Technical Report, IMPACT-13-01, University of Illinois at Urbana-Champaign, Center for Reliable and High-Performance Computing, May 18, 2012. [more...]

	"Real-time in vivo computed optical interferometric tomography", Adeel Ahmad, Nathan Shemonski, Steven Adie, Hee-Seok Kim, Wen-mei Hwu, Scott Carney, Stephen Boppart, Nature Photonics. [more...]

	"Comparison Based Sorting for Systems with Multiple GPUs", Ivan Tanasic, Lluis Vilanova, Marc Jordà, Javier Cabezas, Isaac Gelado, Nacho Navarro, Wen-mei Hwu, In GPGPU-6 , Six Workshop on General Purpose Processing Using GPUs, Mar 2013, ISBN: 978-1-4503-2017-7. [more...]

	"Rapid Computation of Sodium Bioscales Using GPU-Accelerated Image Reconstruction", Ian C. Atkinson, Geng Liu, Nady Obeid, Keith R. Thulborn, Wen-mei Hwu, International Journal of Imaging Systems and Technology, Volume 23, Issue 1, pages 29-35, March 2013. [more...]

	"More IMPATIENT: A gridding-accelerated Toeplitz-based strategy for non-Cartesian high-resolution 3D MRI on GPUs", Jiading Gai, Nady Obeid, Joseph L. Holtrop, Xiao-Long Wu, Fan Lam, Maojing Fu, Justin P. Haldar, Wen-mei Hwu, Zhi-Pei Liang, Bradley P. Sutton, Journal of Parallel and Distributed Computing, 16 January 2013. [more...]

2012

	"TIGER: Tiled iterative genome assembler", Xiao-Long Wu, Yun Heo, Izzat El Hajj, Wen-mei Hwu, Deming Chen, Jian Ma, Journal of BMC Bioinformatics, 2012. [more...]

	"Design evaluation of OpenCL compiler framework for Coarse-Grained Reconfigurable Arrays", Hee-Seok Kim, Minwook Ahn, John A. Stratton, Wen-mei Hwu, Proceedings of the Field-Programmable Technology (FPT) International Conference, Dec 2012. [more...]

	"High-level Automation of Custom Hardware Design for High-performance Computing", Alexandros Papakonstantinou, University of Illinois at Urbana-Champaign, 2012.. [more...]

	"A Scalable, Numerically Stable, High-performance Tridiagonal Solver using GPUs", Li-Wen Chang, John A. Stratton, Hee-Seok Kim, Wen-mei Hwu, Proceedings of the International Conference for High Performance Computing, Networking Storage and Analysis, 2012. [more...]

	"Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU)", Hyesoon Kim, Richard Vuduc, Sara Sadeghi Baghsorkhi, Jee Choi, Wen-mei Hwu, Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers 2012. [more...]

	"Unboxed Polymorphic Objects for Functional Numerical Programming", Christopher I. Rodrigues, Wen-mei Hwu, IMPACT Technical Report IMPACT-12-02, University of Illinois at Urbana-Champaign, 2012. [more...]

	"Algorithm and Data Optimization Techniques for Scaling to Massively Threaded Systems", John A. Stratton, Christopher I. Rodrigues, I-Jui Sung, Li-Wen Chang, Nasser Anssari, Daniel Liu, Wen-mei Hwu, IEEE Computer, vol. 45, no. 8, pp. 26-32, Aug. 2012. [more...]

	"Optimization and Architecure Effects on GPU Computing Workload Performance", John A. Stratton, Nasser Anssari, Christopher I. Rodrigues, I-Jui Sung, Nady Obeid, Li-Wen Chang, Daniel Liu, Wen-mei Hwu, Proceedings of the IEEE Conference on Innovative Parallel Computing, May 2012. [more...]

	"DL: Data Layout Transformation System for Heterogeneous Computing", I-Jui Sung, Daniel Liu, Wen-mei Hwu, IEEE Innovative Parallel Computing (InPar 2012), San Jose, CA, May 13--14, 2012. [more...]

	"More IMPATIENT : A Gridding - Accelerated Toeplitz - based S trategy for Non - Cartesian High - Resolution 3D MRI on GPU", Jiading Gai, Joseph L. Holtrop, Xiao-Long Wu, Fan Lam, Maojing Fu, Justin P. Haldar, Wen-mei Hwu, Zhi-Pei Liang, Bradley P. Sutton, Proceedings of the International Society for Magnetic Resonance in Medicine (ISMRM), May 2012. [more...]

	"Parboil: A Revised Benchmark Suite for Scientific and Commercial Throughput Computing", John A. Stratton, Christopher I. Rodrigues, I-Jui Sung, Nady Obeid, Li-Wen Chang, Nasser Anssari, Daniel Liu, Wen-mei Hwu, IMPACT Technical Report, IMPACT-12-01, University of Illinois at Urbana-Champaign, Center for Reliable and High-Performance Computing, March 2, 2012. (Paper of IMPACT - Cited Greater than 300 Times) [more...]

	"Efficient Performance Evaluation of Memory Hierarchy for Highly Multithreaded Graphics Processors", Sara Sadeghi Baghsorkhi, Isaac Gelado, Matthieu Delahaye, Wen-mei Hwu, Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming, February, 2012. [more...]

	"Input Division for Binary Translation", Daniel Liu, "Input division for binary translation." (2012).. [more...]

2011

	"A Tiling-Scheme Viterbi Decoder in Software-Defined Radio for GPUs", Chih-Sheng Lin, Wei-Lun Liu, Wei-Ting Yeh, Li-Wen Chang, Wen-mei Hwu, Sao-Jie Chen, Pao-Ann Hsiung, Proceedings of the 7th International Conference on Wireless Communications, Networking and Mobile Computing (WiCOM), 2011. [more...]

	"Scalable SIMD-parallel memory allocation for many-core machines", Victor Huang, Christopher I. Rodrigues, Stephen Jones, Ian Buck, Wen-mei Hwu, The Journal of Supercomputing, 9 Sep 2011. [more...]

	"Scalable Tridiagonal Solver for GPUs", Hee-Seok Kim, Shengzhao Wu, Li-Wen Chang, Wen-mei Hwu, Proceedings of the International Conference on Parallel Processing, September 2011. [more...]

	"IMPATIENT MRI: Illinois Massively Parallel Acceleration Toolkit for Image Reconstruction with ENhanced Throughput in MRI", Xiao-Long Wu, Jiading Gai, Fan Lam, Maojing Fu, Justin P. Haldar, Yue Zhuo, Zhi-Pei Liang, Wen-mei Hwu, Bradley P. Sutton, Proceedings of the International Society for Magnetic Resonance in Medicine (ISMRM), May 2011. [more...]

	"Parallel Implementation of Multi-Dimensional Ensemble Empirical Mode Decomposition", Li-Wen Chang, Men-Tzung Lo, Nasser Anssari, Ke-Hsin Hsu, Norden E. Huang, Wen-mei Hwu, Proceedings of the IEEE Conference on Acoustics, Speech and Signal Processing, May 2011. [more...]

	"Automatic Translation of CUDE to OpenCL and Comparison of Performance Optimizations on GPUs", Deepthi Nandakumar, MS Thesis. University of Illinois at Urbana-Champaign. (2011).. [more...]

	"Multilevel Granularity Parallelism Synthesis on FPGAs", Alexandros Papakonstantinou, Yun Liang, John A. Stratton, Karthik Gururaj, Deming Chen, Wen-mei Hwu, Jason Cong, Proceedings of the 2011 International Symposium on Field-Programmable Custom Computing Machines (FCCM), May 2011. (Best Paper Award from FCCM 2011) [more...]

	"Auto-tuning of Fast Fourier Transform on Graphics Processors", Yuri Dotsenko, Sara Sadeghi Baghsorkhi, Brandon Lloyd, Naga Govindaraju, Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Paral lel Programming (PPoPP), Feb. 2011. [more...]

	"Advanced MRI Reconstruction Toolbox with Accelerating on GPUs", Xiao-Long Wu, Yue Zhuo, Jiading Gai, Fan Lam, Maojing Fu, Justin P. Haldar, Wen-mei Hwu, Zhi-Pei Liang, Bradley P. Sutton, Proceedings of the IS&T/SPIE Electronic Imaging 2011 Conference on "Parallel Processing for Imaging Applications", January 2011. [more...]

	"Compact Binning for Parallep Processing of Limited-Range Functions", Nady Obeid, Diss. Univesity of Illinois at Urbana-Champaign. 2011.. [more...]

2010

	"Sparse Regularization in MRI Iterative Reconstruction using GPUs", Yue Zhuo, Bradley P. Sutton, Xiao-Long Wu, Justin P. Haldar, Wen-mei Hwu, Zhi-Pei Liang, Proceedings of the 3rd International Conference on BioMedical Engineering and Informatics (BMEI'10), October 2010. [more...]

	"Exploiting More Parallelism from Applications Having Generalized Reductions on GPU Architectures", Xiao-Long Wu, Nady Obeid, Wen-mei Hwu, Proceedings of the 10th IEEE International Conference on Computer and Information Technology (CIT 2010), pp.1175-1180, June 2010. [more...]

	"Data Layout Transformation Exploiting Memory-Level Parallelism in Structured Grid Many-Core Applications", I-Jui Sung, John A. Stratton, Wen-mei Hwu, Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques (PACT), Vienna, Austria, September 11-15, 2010. [more...]

	"An Effective GPU Implementation of Breadth-First Search", Lijuan Luo, Martin Wong, Wen-mei Hwu, Proceedings of the 47th Design Automation Conference, 2010. (Paper of IMPACT - Cited Greater than 150 Times) [more...]

	"Implementing a GPU Programming Model on a non-GPU Accelerator Architecture", Stephen M. Kofsky, Daniel R. Johnson, John A. Stratton, Wen-mei Hwu, Sanjay J. Patel, Steve S. Lumetta, Proceedings of the Workshop on Applications for Multi- and Many-cores, June 2010. [more...]

	"XMalloc: A Scalable Lock-free Dynamic Memory Allocator for Many-core Machines", Victor Huang, Christopher I. Rodrigues, Stephen Jones, Ian Buck, Wen-mei Hwu, Proceedings of the 10th IEEE International Conference on Computer and Information Technology (CIT 2010), pp.1134-1139, June 2010. (Best Paper Award from CIT 2010) [more...]

	"Multi-GPU Implementation for Iterative MR Image Reconstruction with Field Correction", Yue Zhuo, Xiao-Long Wu, Justin P. Haldar, Wen-mei Hwu, Zhi-Pei Liang, Bradley P. Sutton, Proceedings of International Society for Magnetic Resonance in Medicine (ISMRM) 2010. [more...]

	"Accelerating Iterative Field-Compensated MR Image Reconstruction on GPUs", Yue Zhuo, Xiao-Long Wu, Justin P. Haldar, Wen-mei Hwu, Zhi-Pei Liang, Bradley P. Sutton, Proceedings of the IEEE International Symposium on Biomedical Imaging(ISBI), April, 2010. [more...]

	"Efficient Compilation of Fine-grained SPMD-threaded Programs for Multicore CPUs", John A. Stratton, Vinod Grover, Jaydeep Marathe, Baastian Aarts, Mike Murphy, Ziang Hu, Wen-mei Hwu, Proceedings of the International Symposium on Code Generation and Optimization, April 2010. [more...]

	"An Asymmetric Distributed Shared Memory Model for Heterogeneous Parallel Systems", Isaac Gelado, Javier Cabezas, Nacho Navarro, John E. Stone, Sanjay J. Patel, Wen-mei Hwu, The ACM/IEEE 15th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'10), Pittsburgh, PA., March 13 - 17, 2010 . (Paper of IMPACT - Cited Greater than 150 Times) [more...]

	"An Adaptive Performance Modeling Tool for GPU Architectures", Sara Sadeghi Baghsorkhi, Matthieu Delahaye, Sanjay J. Patel, William D. Gropp, Wen-mei Hwu, Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Jan. 2010. (Paper of IMPACT - Cited Greater Than 250 Times) [more...]

	"Data Layout Transformation for Structured-Grid Codes on GPU", I-Jui Sung, Wen-mei Hwu, Workshop on Language, Compiler, and Architecture Support for GPGPU, in conjunction with PPoPP 2010. [more...]

2009

	"The parallelization of video processing", Dennis Lin, Victor Huang, Quang Nguyen, Joshua Blackburn, Christopher I. Rodrigues, Thomas Huang, Minh N. Do, Sanjay J. Patel, Wen-mei Hwu, IEEE Signal Processing Magazine 26(6), 103--112, 2009. [more...]

	"GPU clusters for high-performance computing", Volodymyr V. Kindratenko, Jeremy J. Enos, Guochun Shi, Michael T. Showerman, Galen W. Arnold, John E. Stone, James C. Phillips, Wen-mei Hwu, Cluster Computing and Workshops, 2009. CLUSTER'09. IEEE International Conference on, 2009. (Paper of IMPACT - Cited Greater than 200 Times)) [more...]

	"FCUDA: Enabling Efficient Compilation of CUDA Kernels onto FPGAs", Alexandros Papakonstantinou, Karthik Gururaj, John A. Stratton, Deming Chen, Jason Cong, Wen-mei Hwu, (Best Paper Award), Symposium on Application Specific Processors, July 2009. (Best Paper Award, Paper of IMPACT - Cited Greater than 150 Times) [more...]

	"Compute Unified Device Architecture Application Suitability", Wen-mei Hwu, Christopher I. Rodrigues, Shane Ryoo, John A. Stratton, Computing in Science and Engineering Vol. 11 No. 3, May 2009. [more...]

	"Analytical Performance Prediction for Evaluation and Tuning of GPGPU Applications.", Sara Sadeghi Baghsorkhi, Matthieu Delahaye, William D. Gropp, Wen-mei Hwu, In the Workshop on Exploiting Parallelism Using GPUs and Other Hardware-Assisted Methods (associated with CGO '09), March 2009 . [more...]

	"QP: a heterogeneous multi-accelerator cluster", Michael T. Showerman, Jeremy J. Enos, Avneesh Pant, Volodymyr V. Kindratenko, Craig Steffen, Robert Pennington, Wen-mei Hwu, Proc. 10th LCI International Conference on High-Performance Clustered Computing, 2009. (Paper of IMPACT - Cited Greater than 100 Times) [more...]

	"High performance computation and interactive display of molecular orbitals on GPUs and multi-core CPUs", John E. Stone, Jan Saam, David J. Hardy, Kirby L. Vandivort, Wen-mei Hwu, Klaus Schulten, Proceeding GPGPU-2 Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, 2009. [more...]

2008

	"Program Optimization Strategies for Data-Parallel Many-Core Processors.", Shane Ryoo, PhD Dissertation, Department of Electrical and Computer Engineering, University of Illinois, Urbana, IL, 2008 . [more...]

	"The Concurrency Challenge", Wen-mei Hwu, Timothy G. Mattson, Kurt Keutzer, Design & Test of Computers, IEEE, 2008. [more...]

	"CUDA-lite: Reducing GPU Programming Complexity.", Sain-Zee Ueng, Melvin Lathara, Sara Sadeghi Baghsorkhi, Wen-mei Hwu, The 21st International Workshop on Languages and Compilers for Parallel Computing, LNCS 5335, pp. 1-15, 2008 . (Paper of IMPACT - Cited Greater Than 250 Times) [more...]

	"MCUDA: An Efficient Implementation of CUDA Kernels for Multi-Core CPUs", John A. Stratton, Sam S. Stone, Wen-mei Hwu, 21st International Workshop on Languages and Compilers for Parallel Computing, LNCS 5335, pp. 16-30, 2008 . (Paper of IMPACT - Cited Greater Than 250 Times) [more...]

	"Analyses for Extensive Parallelization of Video Applications in C.", Shane Ryoo, Wen-mei Hwu, IMPACT Technical Report, IMPACT-08-02, University of Illinois at Urbana-Champaign, Urbana, IL, June 2008. [more...]

	"CUBA: An Architecture for Efficient CPU/Co-processor Data Communication.", Isaac Gelado, John H. Kelm, Shane Ryoo, Steve S. Lumetta, Nacho Navarro, Wen-mei Hwu, Proceedings of the 22nd ACM International Conference on Supercomputing, June 2008. [more...]

	"Accelerating Advanced MRI Reconstructions on GPUs.", Sam S. Stone, Justin P. Haldar, Stephanie Tsao, Wen-mei Hwu, Zhi-Pei Liang, Bradley P. Sutton, Proceedings of the 2008 International Conference on Computing Frontiers, May 2008 . (Paper of IMPACT - Cited Greater Than 250 Times) [more...]

	"GPU Acceleration of Cutoff Pair Potential for Molecular Modeling Applications", Christopher I. Rodrigues, David J. Hardy, John E. Stone, Klaus Schulten, Wen-mei Hwu, Proceedings of the 2008 International Conference on Computing Frontiers, May 2008. (Paper of IMPACT - Cited Greater than 100 Times) [more...]

	"Program Optimization Space Pruning for a Multithreaded GPU", Shane Ryoo, Christopher I. Rodrigues, Sam S. Stone, Sara Sadeghi Baghsorkhi, Sain-Zee Ueng, John A. Stratton, Wen-mei Hwu, Proceedings of the 2008 International Symposium on Code Generation and Optimization, April 2008. (Paper of IMPACT - Cited Greater Than 250 Times) [more...]

	"MCUDA: An Efficient Implementation of CUDA Kernels on Multi-cores.", John A. Stratton, Sam S. Stone, Wen-mei Hwu, IMPACT Technical Report, IMPACT-08-01, University of Illinois, Urbana, IL, 2008 . [more...]

	"Optimization Principles and Application Performance Evaluation of a Multithreaded GPU Using CUDA", Shane Ryoo, Christopher I. Rodrigues, Sam S. Stone, John A. Stratton, Sain-Zee Ueng, Sara Sadeghi Baghsorkhi, Wen-mei Hwu, Proceedings of the 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, February 2008. (Paper of IMPACT - Cited Greater Than 950 Times) [more...]

	"Program Optimization Carving for GPU Computing.", Shane Ryoo, Christopher I. Rodrigues, John A. Stratton, Sam S. Stone, Sain-Zee Ueng, Sara Sadeghi Baghsorkhi, Wen-mei Hwu, The Special Issue of the Journal of Parallel and Distributed Computing on General Purpose Parallel Processing Using GPUs, 2008. (Paper of IMPACT - Cited Greater than 100 Times) [more...]

2007

	"Iteration Disambiguation for Parallelism Identification in Time-Sliced Applications", Shane Ryoo, Christopher I. Rodrigues, Wen-mei Hwu, The 20th International Workshop on Languages and Compilers for Parallel Computing, LNCS 5234, October 2007. [more...]

	"How GPUs Can Improve the Quality of Magnetic Resonance Imaging.", Sam S. Stone, Haoran Yi, Justin P. Haldar, Wen-mei Hwu, Bradley P. Sutton, Zhi-Pei Liang, The First Workshop on General Purpose Processing on Graphics Processing Units, October 2007. [more...]

	"Program Optimization Study on a 128-Core GPU.", Shane Ryoo, Christopher I. Rodrigues, Sam S. Stone, Sara Sadeghi Baghsorkhi, Sain-Zee Ueng, Wen-mei Hwu, Xuan Yu, The First Workshop on General Purpose Processing on Graphics Processing Units, October 2007. [more...]

	"CIGAR: Application Partitioning for a CPU/Coprocessor Architecture", John H. Kelm, Isaac Gelado, Mark J. Murphy, Nacho Navarro, Steve S. Lumetta, Wen-mei Hwu, In The 2007 International Conference on Parallel Architectures and Compilation Techniques (PACT 2007), pp. 317-326, Sep 2007, ISBN 978-0-7695-2944-8. [more...]

	"Performance Insights on Executing Non-Graphics Applications on CUDA on the NVIDIA GeForce 8800 GTX.", Wen-mei Hwu, David Kirk, Shane Ryoo, Christopher I. Rodrigues, John A. Stratton, Kuangwei Huang, Presentation at Hot Chips 19, August 2007. [more...]

	"Automatic Discovery of Coarse-Grained Parallelism in Media Applications", Shane Ryoo, Sain-Zee Ueng, Christopher I. Rodrigues, Robert E. Kidd, Matthew I. Frank, Wen-mei Hwu, Transactions on HiPEAC I, LNCS 4050, pp. 194-213, 2007. [more...]

	"Implicit Parallel Programming Models for Thousand-Core Microprocessors.", Wen-mei Hwu, Shane Ryoo, Sain-Zee Ueng, John H. Kelm, Isaac Gelado, Sam S. Stone, Robert E. Kidd, Sara Sadeghi Baghsorkhi, Aqeel A. Mahesri, Stephanie Tsao, Nacho Navarro, Steve S. Lumetta, Matthew I. Frank, Sanjay J. Patel, Proceedings of the 44th Annual Design Automation Conference, June 2007. [more...]

	"Dynamic Tracking of Information-Flow Signatures for Security Checking", William Healey, Karthik Pattabiraman, Shane Ryoo, Ravishanker Iyer, Wen-mei Hwu, Technical Report UILU-ENG-02-2002, University of Illinois at Urbana-Champaign, January 2007. [more...]

	"Multiversioning in the Store Queue is the Root of All Store-Forwarding Evil", Sam S. Stone, Diss. University of Illinois at Urbana-Champaign, 2007.. [more...]

2006

	"Improved Superblock Optimization in GCC.", Robert E. Kidd, Wen-mei Hwu, Proceedings of the GCC Developer's Summit, pp. 85-96, June 2006. [more...]

	"P3DE: Profile-Directed Predicated Partial Dead Code Elimination.", Shane Ryoo, Sain-Zee Ueng, Wen-mei Hwu, The 5th Workshop on EPIC Architectures and Compiler Technology, March 2006. [more...]

	"Tolerating Cache-Miss Latency With Multipass Pipelines.", Ronald D. Barnes, Shane Ryoo, Wen-mei Hwu, IEEE Micro, Vol. 26, No. 1, January-February 2006. [more...]

	"On Extracting Coarse-Grained Function Parallelism From C Programs", Chien-wei Li, PhD Thesis. Computer Science. University of Illinois, 2006.. [more...]

2005

	"FULCRA Pointer Analysis Framework.", Erik M. Nystrom, PhD thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, 2005 . [more...]

	"A Systematic Approach to Delivering Instruction-Level Parallelism in EPIC Systems.", John W. Sias, PhD. Dissertation, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, 2005. [more...]

	"Multiple-Pass Pipelining: Enhancing in-order Microarchitectures to Out-Of-Order Performance.", Ronald D. Barnes, PhD thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, 2005. [more...]

	""Flea-flicker" Multipass Pipelining: An Alternative to the High-Power Out-of-Order Offense.", Ronald D. Barnes, Shane Ryoo, Wen-mei Hwu, Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture, November 2005. [more...]

	"An Evaluation of Low-Overhead Partial Flow Sensitivity", James Player, MS Thesis. [more...]

	"Future Compilation Requirements for Emerging Driving General Purpose Applications.", Ian Steiner, Diss. University of Illinois at Urbana-Champaign. (2005).. [more...]

	"Trimaran: An infrastructure for research in instruction-level parallelism", Lakshmi N. Chakrapani, John C. Gyllenhaal, Wen-mei Hwu, Scott A. Mahlke, Krishna V. Palem, Rodric M. Rabbah, .. [more...]

2004

	"Matching On-Chip Data Storage To Telecommunication And Media Application Properties.", Hillery C. Hunter, PhD thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, 2004. [more...]

	"Partial Code Elimination in the IMPACT Compiler Framework.", Shane Ryoo, MS Thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana, IL, 2004. [more...]

	"Template Bundling for EPIC Architectures", Sain-Zee Ueng, MS thesis, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, 2004. [more...]

	"Applying Scalable Interprocedural Pointer Analysis to Embedded Applications.", Hillery C. Hunter, Erik M. Nystrom, Wen-mei Hwu, Workshop on Compilers and Tools for Constrained Embedded Systems, September 2004. [more...]

	"Bottom-up and Top-down Context-Sensitive Summary-based Pointer Analysis.", Erik M. Nystrom, Hong-Seok Kim, Wen-mei Hwu, Proceedings of the 11th Static Analysis Symposium, August 2004. [more...]

	"Exploiting Load Flexibility for Embedded Power Savings.", Hillery C. Hunter, Shane Ryoo, James Player, Daniel A. Connors, Wen-mei Hwu, IMPACT Technical Report, IMPACT-04-01, University of Illinois, at Urbana-Champaign, June 2004. [more...]

	"Field-testing IMPACT EPIC Research Results in Itanium 2.", John W. Sias, Sain-Zee Ueng, Geoffrey A. Kent, Ian Steiner, Erik M. Nystrom, Wen-mei Hwu, Proceedings of the 31st Annual International Symposium on Computer Architecture, pp. 26-37, July 2004. [more...]

	"Importance of Heap Specialization in Pointer Analysis.", Erik M. Nystrom, Hong-Seok Kim, Wen-mei Hwu, Proceedings of Program Analysis for Software Tools and Engineering, June 2004. [more...]

	"Extracting Data Flow Model from von Neumann Program for Synthesis", Chien-wei Li, Hong-Seok Kim, Wen-mei Hwu, Proceedings of the 13th International Workshop on Logic and Synthesis, June 2004. [more...]

2003

	"A Dynamic Application Analysis Framework.", Marie T. Conte, PhD thesis, Department of Electrical and Computer Engineering, University of Illinois, at Urbana-Champaign, 2003. [more...]

	"Beating in-order stalls with "flea-flicker" two-pass pipelining", Ronald D. Barnes, Erik M. Nystrom, John W. Sias, Sanjay J. Patel, Nacho Navarro, Wen-mei Hwu, Proceeding MICRO 36 Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture, 2003. [more...]

	"A New Look at Exploiting Data Parallelism in Embedded Systems.", Hillery C. Hunter, Jaime H. Moreno, Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, October 2003 . [more...]

	"Memory Profiling: Expanding the 3G Developer's Bag of Tricks.", Hillery C. Hunter, Wen-mei Hwu, Workshop on Compilers and Tools for Constrained Embedded Systems, October 2003. [more...]

	"Motivating use of Memory Profiling in the 3G Domain.", Hillery C. Hunter, Chien-wei Li, Wen-mei Hwu, Proceedings of the SRC TECHCON 2003, August 2003. [more...]

	"An Innovative Low-Power High-Performance Programmable Signal Processor for Digital Communications.", Jaime H. Moreno, Victor V. Zyuban, Uzi Shvadron, Fredy D. Neeser, Jeffrey H Derby, Malcolm S. Ware, Krishnan Kailas, Ayal Zaks, A. Geva, Shay Ben-David, Sameh W. Asaad, Thomas W. Fox, D. Littrell, Marina Biberstein, Dorit Naishlos, Hillery C. Hunter, IBM Journal of Research and Development, vol. 47, no 2/3, March/May 2003. [more...]

	"Compaction algorithm for precise modular context-sensitive pointer analysis.", Hong-Seok Kim, Erik M. Nystrom, Ronald D. Barnes, Wen-mei Hwu, IMPACT Technical Report, IMPACT-03-03, University of Illinois, Urbana, IL, 2003. [more...]

	"Scalable, precise context-sensitive top-down process for modular points-to analysis.", Erik M. Nystrom, Hong-Seok Kim, Wen-mei Hwu, IMPACT Technical Report, IMPACT-03-03, University of Illinois, Urbana, IL, 2003. [more...]

2002

	"The IMPACT SC140 Code Generator.", Christopher J. Shannon, MS thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, April 2002. [more...]

	"Vacuum Packing: Extracting Hardware-Detected Program Phases for Post-link Optimization.", Ronald D. Barnes, Erik M. Nystrom, Matthew C. Merten, Wen-mei Hwu, Proceedings of the 35th International Symposium on Microarchitecture, November 2002. [more...]

	"Code Coverage and Input Variability: Effects on Architecture and Compiler Research.", Hillery C. Hunter, Wen-mei Hwu, Proceedings of the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems, October, 2002. [more...]

2001

	"Enhancing Loop Buffering of Media and Telecommunications Applications Using Low-overhead Predication.", John W. Sias, Matthew C. Merten, Erik M. Nystrom, Ronald D. Barnes, Christopher J. Shannon, Joseph D. Matarazzo, Shane Ryoo, Jeff V. Olivier, Wen-mei Hwu, Proceedings of the 34th International Symposium on Microarchitecture, December, 2001. [more...]

	"Program Decision Logic Optimization Using Predication and Control Speculation", Wen-mei Hwu, David I. August, John W. Sias, Proceedings of the IEEE, November, 2001. [more...]

	"A study of the energy saving and capacity improvement potential of power control in multi-hop wireless networks", Jeffrey P. Monks, J.-P. Ebert, Adam Wolisz, Wen-mei Hwu, Proceedings. LCN 2001. 26th Annual IEEE Conference on Local Computer Networks, 2001. [more...]

	"Itanium Performance Insights.", Wen-mei Hwu, John W. Sias, Matthew C. Merten, Erik M. Nystrom, Ronald D. Barnes, Christopher J. Shannon, Shane Ryoo, Jeff V. Olivier, Presentation at Microprocessor Forum, October 2001. [more...]

	"Modulo Schedule Buffers.", Matthew C. Merten, Wen-mei Hwu, Proceedings of the 34th International Symposium on Microarchitecture, December, 2001. [more...]

	"Code Reordering and Speculation Support for Dynamic Optimization Systems.", Erik M. Nystrom, Ronald D. Barnes, Matthew C. Merten, Wen-mei Hwu, Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, September 8-12, 2001. [more...]

	"Itanium Performance Insights from the IMPACT Compiler.", John W. Sias, Matthew C. Merten, Erik M. Nystrom, Ronald D. Barnes, Christopher J. Shannon, Joseph D. Matarazzo, Shane Ryoo, Jeff V. Olivier, Wen-mei Hwu, Presentation at Hot Chips 13, August 2001. [more...]

	"Characterization of Repeating Data Access Patterns in Integer Benchmarks.", Erik M. Nystrom, Roy Dz-ching Ju, Wen-mei Hwu, Memory Performance Issues Workshop at the 28th International Symposium on Computer Architecture, July 2001. [more...]

	"An Architectural Framework for Run-Time Optimization.", Matthew C. Merten, Andrew R. Trick, Ronald D. Barnes, Erik M. Nystrom, Christopher N. George, John C. Gyllenhaal, Wen-mei Hwu, IEEE Transactions on Computers, Vol. 50, No. 6, pp. 567-589, June 2001. [more...]

	"Transmission Power Control for Enhancing the Performance of Wireless Packet Data Networks", Jeffrey P. Monks, Diss. University of Illinois at Urbana-Champaign, 2001.. [more...]

2000

	"Accurate and Efficient Predicate Analysis with Binary Decision Diagrams.", John W. Sias, Wen-mei Hwu, David I. August, Proceedings of the 33rd International Symposium on Microarchitecture, December, 2000. [more...]

	"Hardware Support for Dynamic Activation of Compiler-Directed Computation Reuse.", Daniel A. Connors, Hillery C. Hunter, Ben-Chung Cheng, Wen-mei Hwu, Proceedings of the 9th International Conference on Architecture Support for Programming Languages and Operating Systems, November 2000. [more...]

	"Modular Interprocedural Pointer Analysis Using Access Paths: Design, Implementation, and Evaluation.", Ben-Chung Cheng, Wen-mei Hwu, Proceedings of the 2000 ACM SIGPLAN Conference on Programming Language Design and Implementation, Vancouver, British Columbia, Canada, June, 2000. (Paper of IMPACT - Cited Greater Than 150 Times) [more...]

	"A Hardware Mechanism for Dynamic Extraction and Relayout of Program Hot Spots.", Matthew C. Merten, Andrew R. Trick, Erik M. Nystrom, Ronald D. Barnes, Wen-mei Hwu, Proceedings of the 27th International Symposium on Computer Architecture, pp. 59-70, June 2000. [more...]

	"Compile-Time Memory Disambiguation for C Programs.", Ben-Chung Cheng, PhD thesis, Department of Computer Science, University of Illinois, Urbana, IL, May 2000. [more...]

	"Eliminating Dynamic Computation Redundancy", Daniel A. Connors, Ph.D. dissertation, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, May 2000. [more...]

	"Systematic Compilation for Predicated Execution.", David I. August, Ph.D. dissertation, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, Feb. 2000. [more...]

	"Interactive Source-Level Debugging of Optimized Code", Le-Chun Wu, Diss. 2000.. [more...]

1999

	"Condition Awareness Support for Predicate Analysis and Optimization.", John W. Sias, MS thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, 1999. [more...]

	"Run-Time Cache Bypassing.", Teresa L. Johnson, Daniel A. Connors, Matthew C. Merten, Wen-mei Hwu, IEEE Transactions on Computers, Vol. 48, No. 12, pp. 1338-1354, December 1999. (Paper of IMPACT - Cited Greater than 100 Times) [more...]

	"Compiler-Directed Dynamic Computation Reuse: Rationale and Initial Results.", Daniel A. Connors, Wen-mei Hwu, Proceedings of the 32nd International Symposium on Microarchitecture, November, 1999. [more...]

	"Feedback-Directed Data Cache Optimizations for the x86.", Ronald D. Barnes, Proceedings of the 2nd ACM Workshop on Feedback-Directed Optimization, November 1999. [more...]

	"A Framework for Profile-Driven Optimization in the IMPACT Binary Reoptimization System.", Matthew C. Merten, MS thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, June 1999. [more...]

	"A Framework for Install-Time Optimization of Binary Dynamic-Link Libraries.", Christopher N. George, MS thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, May 1999. [more...]

	"An Empirical Study of Function Pointers Using Spec Benchmarks.", Ben-Chung Cheng, Wen-mei Hwu, IMPACT Technical Report, IMPACT-99-02, University of Illinois, Urbana, IL 1999. [more...]

	"The Program Decision Logic Approach to Predicated Execution.", David I. August, John W. Sias, Jean-Michel Puiatti, Scott A. Mahlke, Daniel A. Connors, Kevin M. Crozier, Wen-mei Hwu, Proceedings of the 26th International Symposium on Computer Architecture, May, 1999. [more...]

	"A Hardware-Driven Profiling Scheme for Identifying Program Hot Spots to Support Runtime Optimization.", Matthew C. Merten, Andrew R. Trick, Christopher N. George, John C. Gyllenhaal, Wen-mei Hwu, Proceedings of the 26th International Symposium on Computer Architecture, pp. 136-147, May, 1999. (Paper of IMPACT - Cited Greater Than 150 Times) [more...]

	"A New Framework for Debugging Globally Optimized code.", Le-Chun Wu, Rajiv Mirani, Harish Patil, Bruce Olsen, Wen-mei Hwu, Proceedings of the 1999 ACM SIGPLAN Conference on Programming Language Design and Implementation, Atlanta, Georgia, May, 1999. [more...]

	"A Practical Interprocedural Pointer Analysis Framework.", Ben-Chung Cheng, Wen-mei Hwu, IMPACT Technical Report, IMPACT-99-01, University of Illinois, Urbana, IL 1999. [more...]

	"Optimizing Memory Accesses Using Advanced Compile-Time Memory Disambiguation Techniques.", Ben-Chung Cheng, Daniel A. Connors, Wen-mei Hwu, IMPACT Technical Report, IMPACT-99-03, University of Illinois, Urbana, IL 1999. [more...]

1998

	"A Software-Oriented Floating-Point Format for Enhancing Automotive Control Systems.", Daniel A. Connors, Yoji Yamada, Wen-mei Hwu, Workshop on Compiler and Architecture Support for Embedded Computing Systems (CASES98), December, 1998. [more...]

	"Compiler-Directed Early Load-Address Generation.", Ben-Chung Cheng, Daniel A. Connors, Wen-mei Hwu, Proceedings of the 31st International Symposium on Microarchitecture, December, 1998. [more...]

	"Effective Cluster Assignment for Modulo Scheduling", Erik M. Nystrom, Alexandre E. Eichenberger, Proceedings of the 31th International Symposium on Microarchitecture, Dec, 1998. (Paper of IMPACT - Cited Greater Than 100 Times) [more...]

	"Improving Static Branch Prediction in a Compiler", Brian L. Deitrich, Ben-Chung Cheng, Wen-mei Hwu, Proceedings of International Parallel Architecture and Compilation Techniques, October 12-18, 1998. [more...]

	"New Data-Location Tracking Scheme for the Recovery of Expected Variable Values", Le-Chun Wu, Wen-mei Hwu, IMPACT Technical Report, IMPACT-98-07, University of Illinois, Urbana, IL 1998. [more...]

	"Optimization and Executable Regeneration in the IMPACT Binary Reoptimization Framework.", Michael S. Thiems, MS thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, August 1998. [more...]

	"Optimization of Machine Descriptions for Efficient Use.", John C. Gyllenhaal, Wen-mei Hwu, B. Ramakrishna Rau, International Journal of Parallel Programming, vol. 26, No. 4, pp. 417-447, August 1998. [more...]

	"Integrated Predicated and Speculative Execution in the IMPACT EPIC Architecture.", David I. August, Daniel A. Connors, Scott A. Mahlke, John W. Sias, Kevin M. Crozier, Ben-Chung Cheng, Patrick R. Eaton, Qudus B. Olaniran, Wen-mei Hwu, Proceedings of the 25th International Symposium on Computer Architecture, July, 1998. (Paper of IMPACT - Cited Greater Than 150 Times) [more...]

	"An Overview of the IMPACT X86 Binary Reoptimization Framework", Matthew C. Merten, Michael S. Thiems, IMPACT Technical Report, IMPACT-98-05, University of Illinois, Urbana, IL 1998. [more...]

	"A New Breakpoint Implementation Scheme for Debugging Globally Optimized Code", Le-Chun Wu, Wen-mei Hwu, IMPACT Technical Report, IMPACT-98-06, University of Illinois, Urbana, IL 1998. [more...]

	"A Novel Breakpoint Implementation Scheme for Debugging Optimized Code", Le-Chun Wu, Wen-mei Hwu, IMPACT Technical Report, IMPACT-98-01, University of Illinois, Urbana, IL 1998. [more...]

	"Emulation of the Intermediate Representation in the IMPACT Compiler", Qudus B. Olaniran, MS thesis. University of Illinois at Urbana-Champaign, 1998.. [more...]

	"Dynamic Control of Compile Time Using Vertical Region-Based Compilation", Jaymie L. Oehler (nee Braun), MS thesis. University of Illinois at Urbana-Champaign, 1998.. [more...]

	"Static Program Analysis to Enhance Profile Independence in Instruction-Level Parallelism Compilation", Brian L. Deitrich, Diss. University of Illinois at Urbana-Champaign, 1998.. [more...]

1997

	"A Study of the Cache and Branch Performance Issues with Running Java on Current Hardware Platforms.", Cheng-Hsueh Andrew Hsieh, Marie T. Conte, Teresa L. Johnson, John C. Gyllenhaal, Wen-mei Hwu, Proceedings of COMPCON, pp. 211-216, February 1997. [more...]

	"Run-time Spatial Locality Detection and Optimization.", Teresa L. Johnson, Matthew C. Merten, Wen-mei Hwu, Proceedings of the 30th International Symposium on Microarchitecture, December 1-3, 1997. (Paper of IMPACT - Cited Greater than 100 Times) [more...]

	"An Efficient Framework For Performing Execution-Constraint-Sensitive Transformations That Increase Instruction-Level Parallelism.", John C. Gyllenhaal, PhD thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, September 1997. [more...]

	" Using NET to Capture Performance in Java-Based Software.", Cheng-Hsueh Andrew Hsieh, Marie T. Conte, Daniel R. Johnson, John C. Gyllenhaal, Wen-mei Hwu, IEEE Computer, pp. 67-75, June 1997. [more...]

	"Run-time adaptive cache hierarchy management via reference analysis", Teresa L. Johnson, Wen-mei Hwu, Proceedings of the 24th annual international symposium on Computer architecture, ISCA 1997. (Paper of IMPACT - Cited Greater than 200 Times) [more...]

	"Region-Based Compilation: An Introduction and Motivation.", Richard E. Hank, Wen-mei Hwu, B. Ramakrishna Rau, International Journal of Parallel Programming, vol. 25, no. 2, pp. 113-146, April 1997. (Paper of IMPACT - Cited Greater Than 100 Times) [more...]

	"A Framework for Balancing Control Flow and Predication.", David I. August, Wen-mei Hwu, Scott A. Mahlke, Proceedings of the 30th International Symposium on Microarchitecture, December 1997. (Paper of IMPACT - Cited Greater Than 100 Times) [more...]

	"A Study of the Cache and Branch Performance Issues with Running Java on Current Hardware Platforms.", Cheng-Hsueh Andrew Hsieh, Marie T. Conte, Teresa L. Johnson, John C. Gyllenhaal, Wen-mei Hwu, Proceedings of COMPCON, pp. 211-216, February 1997. [more...]

	"Architectural Support Compiler-Synthesized Dynamic Branch Prediction Strategies: Rationale and Initial Results.", David I. August, Daniel A. Connors, John C. Gyllenhaal, Wen-mei Hwu, Proceedings of the 3rd International Symposium on High-Performance Computer Architecture, Feb. 1997. [more...]

	"Modulo Scheduling for Control-Intensive General-Purpose Programs", Daniel M. Lavery, Diss. University of Illinois at Urbana-Champaign, 1997.. [more...]

	"Exploiting Instruction Level Parallelism in the Presence of Conditional Branches", Scott A. Mahlke, Diss. University of Illinois at Urbana-Champaign, 1997.. [more...]

	"A Framework for Using the Pentium's Performance Monitoring Hardware", Kevin D. Safford, MS thesis. University of Illinois at Urbana-Champaign, 1997.. [more...]

	"Structural and Static Analysis Techniqures for Enhancing Compiler Support of Predicated Execution", Kevin M. Crozier, MS thesis. University of Illinois at Urbana-Champaign, 1999.. [more...]

	"A Robust Foundation for Binary Translation of x86 Code", Liang-Chuan Hsu, Diss. 1997.. [more...]

	"Memory Profiling for Directing Data Speculative Optimizations and Scheduling", Daniel A. Connors, MS thesis. University of Illinois at Urbana-Champaign, 1997.. [more...]

1996

	"Java Bytecode to Native Code Translation: The Caffeine Prototype and Preliminary Results.", Cheng-Hsueh Andrew Hsieh, John C. Gyllenhaal, Wen-mei Hwu, Proceedings of the 29th International Symposium on Microarchitecture, pp. 90-99, December 1996. (Paper of IMPACT - Cited Greater Than 100 Times) [more...]

	"Modulo Scheduling of Loops in Control-Intensive Non-Numeric Programs.", Daniel M. Lavery, Wen-mei Hwu, Proceedings of the 29th Annual International Symposium on Microarchitecture, pp. 126-141, Dec. 1996. [more...]

	"Optimization of Machine Descriptions for Efficient Use.", John C. Gyllenhaal, Wen-mei Hwu, B. Ramakrishna Rau, Proceedings of the 29th International Symposium on Microarchitecture, pp. 349-358, December 1996. [more...]

	"Speculative Hedge: Regulating Compile-Time Speculation Against Profile Variations.", Brian L. Deitrich, Wen-mei Hwu, Proceedings of the 29th International Symposium on Microarchitecture, pp.70-79, December 2-4, 1996. [more...]

	"Region-Based Compilation.", Richard E. Hank, PhD thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, May 1996. [more...]

	"Supporting Predicated Execution: Techniques and Tradeoffs.", Jim E. McCormick, MS thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, May 1996. [more...]

	"HMDES Version 2.0 Specification.", John C. Gyllenhaal, Wen-mei Hwu, IMPACT Technical Report, IMPACT-96-03, University of Illinois, Urbana, IL, 1996. [more...]

	"Hyperblock Performance Optimizaions for ILP Processors", David I. August, MS thesis. University of Illinois at Urbana-Champaign, 1996.. [more...]

	"Pinline: A Profile-Driven Automatic Inliner for the IMPACT Compiler", Ben-Chung Cheng, MS thesis. University of Illinois at Urbana-Champaign, 1997.. [more...]

1995

	"Compiler Technology for Future Microprocessors.", Wen-mei Hwu, Richard E. Hank, David M. Gallagher, Scott A. Mahlke, Daniel M. Lavery, Grant E. Haab, John C. Gyllenhaal, David I. August, Proceedings of the IEEE, Vol. 83, No. 12, pp. 1625-1640, December 1995. (Paper of IMPACT - Cited Greater Than 100 Times) [more...]

	"Region-Based Compilation: An Introduction and Motivation.", Richard E. Hank, Wen-mei Hwu, B. Ramakrishna Rau, Proceedings of the 28th Annual International Symposium on Microarchitecture, pp. 158-168, Dec. 1995. (Paper of IMPACT - Cited Greater Than 100 Times) [more...]

	"Unrolling-Based Optimizations for Modulo Scheduling.", Daniel M. Lavery, Wen-mei Hwu, Proceedings of the 28th Annual International Symposium on Microarchitecture, pp. 327-337, Dec. 1995. [more...]

	"A Comparison of Full and Partial Predicated Execution Support for ILP Processors.", Scott A. Mahlke, Richard E. Hank, Jim E. McCormick, David I. August, Wen-mei Hwu, Proceedings of the 22nd International Symposium on Computer Architecture, pp. 138-150, June 19955. (Paper of IMPACT - Cited Greater Than 200 Times) [more...]

	"Code Scheduling and Optimization for a Superscalar X86 Microprocessor.", Wayne F. Dugal, MS thesis, Department of Computer Science, University of Illinois, Urbana IL, May, 1995. [more...]

	"Enhancing Instruction Level Parallelism Through Complier-Controlled Speculation.", Roger A. Bringmann, PhD thesis, Department of Computer Science, University of Illinois, Urbana IL, May 1995. [more...]

	"Performance and Cost Analysis of the Execution Stage of Superscalar Microprocessors.", Dimitri C. Argyres, MS thesis, Department of Computer Science, University of Illinois, Urbana IL, May 1995. [more...]

	"Three Architectural Models for Compiler-Controlled Speculative Execution.", Pohua P. Chang, Nancy J. Warter, Scott A. Mahlke, William Y. Chen, Wen-mei Hwu, IEEE Transactions on Computers, Vol. 44, No. 4, pp. 481-494, April 1995. [more...]

	"The Importance of Prepass Code Scheduling for Superscalar and Superpipelined Processors.", Pohua P. Chang, Daniel M. Lavery, Scott A. Mahlke, William Y. Chen, Wen-mei Hwu, IEEE Transactions on Computers, Vol. 44, No. 3, pp. 353-370, March 1995. [more...]

	"Sentinel Scheduling with Recovery Blocks.", David I. August, Brian L. Deitrich, Scott A. Mahlke, Technical Report CRHC-95-05, 1995, Center for Reliable and High-Performance Computing, University of Illinois, Urbana, IL, Feb, 1995. [more...]

	"Data Relocation and Prefetching for Programs with Large Data Sets.", Yoji Yamada, PhD thesis, Department of Computer Science, University of Illinois, Urbana IL, 1995. [more...]

	"Compiler-Assisted Multiple Instruction Retry.", Chung-Chi Jim Li, Shyh-Kwei Chen, W. Kent Fuchs, Wen-mei Hwu, IEEE Transactions on Computers, Vol.44, No.1, Jan. 1995. [more...]

	"Memory Disambiguation to Facilitate Instruction-level Parallelism Compilation", David M. Gallagher, Diss. University of Illinois at Urbana-Champaign, 1995.. [more...]

	"Automatic Annotation of Instructions with Profiling Information", Teresa L. Johnson, MS thesis. University of Illinois at Urbana-Champaign, 1995.. [more...]

1994

	"Characterizing the Impact of Predicated Execution on Branch Prediction.", Scott A. Mahlke, Richard E. Hank, Roger A. Bringmann, John C. Gyllenhaal, David M. Gallagher, Wen-mei Hwu, Proceedings of the 27th International Symposium on Microarchitecture, pp. 217-227, December 1994. (Paper of IMPACT - Cited Greater Than 100 Times) [more...]

	"Data Relocation and Prefetching for Large Data Sets.", Yoji Yamada, John C. Gyllenhaal, Grant E. Haab, Wen-mei Hwu, Proceedings of the 27th Annual ACM/IEEE International Symposium on Microarchitecture, pp. 118-127, December, 1994. [more...]

	"Dynamic Memory Disambiguation Using the Memory Conflict Buffer.", David M. Gallagher, William Y. Chen, Scott A. Mahlke, John C. Gyllenhaal, Wen-mei Hwu, Proceedings of the 6th International Conference on Architecture Support for Programming Languages and Operating Systems, San Jose, California, pp.183-195, October, 1994. (Paper of IMPACT - Cited Greater Than 150 Times) [more...]

	"A Machine Description Language for Compilation.", John C. Gyllenhaal, MS thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, Sept. 1994. [more...]

	"Compiler Support for SPARC Architecture Processors.", Roland G. Ouellette, MS thesis, Department of Computer Science, University of Illinois, Urbana IL, Sept. 1994. [more...]

	"Modulo Scheduling with Isomorphic Control Transformations.", Nancy J. Warter, PhD thesis, Department of Computer Science, University of Illinois, Urbana IL, Sept. 1994. [more...]

	"The Susceptibility of Programs to Context Switching Effects.", Wen-mei Hwu, Thomas M. Conte, IEEE Transactions on Computers, Vol. 43, No. 9, Sept. 1994. [more...]

	"Compiler-Assisted Multiple Instruction Rollback Recovery Using A Read Buffer.", Neal J. Alewine, Shyh-Kwei Chen, W. Kent Fuchs, Wen-mei Hwu, IEEE Transactions on Computers, 1994. [more...]

	"Performance Implications of Synchronization Support for Parallel FORTRAN Programs.", Sadun Anik, Wen-mei Hwu, Journal of Parallel and Distributed Computing, Vol. 22, pp. 202-215, 1994. [more...]

	"Incremental Compiler Transformations for Multiple Instruction Retry.", Shyh-Kwei Chen, Neal J. Alewine, W. Kent Fuchs, Wen-mei Hwu, Software, Practice & Experience, John Wiley & Sons Ltd., Vol. 24(9), pp. 1-20, Sept. 1994. [more...]

	"Profile-Assisted Instruction Scheduling.", William Y. Chen, Scott A. Mahlke, Nancy J. Warter, Sadun Anik, Wen-mei Hwu, International Journal for parallel Programming, Vol. 22, No. 2, pp. 151-181, April 1994. [more...]

1993

	"Machine Independent Register Allocation for the IMPACT-I C Compiler.", Richard E. Hank, MS thesis, Department of Electrical and Computer Engineering, University of Illinois, Urbana IL, 1993. [more...]

	"Speculative Execution Exception Recovery using Write-back Suppression.", Roger A. Bringmann, Scott A. Mahlke, Richard E. Hank, John C. Gyllenhaal, Wen-mei Hwu, Proceedings of the 26th Annual ACM/IEEE Int'l Symposium on Microarchitecture, Austin, Texas, pp. 214-223, Dec. 1993. [more...]

	"Superblock Formation Using Static Program Analysis.", Richard E. Hank, Scott A. Mahlke, Roger A. Bringmann, John C. Gyllenhaal, Wen-mei Hwu, Proceedings of the 26th Annual ACM/IEEE Int'l Symposium on Microarchitecture, Austin, Texas, pp. 247-256, Dec. 1993. (Paper of IMPACT - Cited Greater than 100 Times) [more...]

	"Architectural and Software Support for Executing Numerical Applications on High Performance Computers.", Sadun Anik, PhD thesis, Department of Computer Science, University of Illinois, Urbana IL, CRHC-93-19, Sept. 1993. [more...]

	"Data Preload for Superscalar and VLIW Processors.", William Y. Chen, PhD thesis, Department of Computer Science, University of Illinois, Urbana, IL, Sept. 1993. [more...]

	"Sentinel Scheduling: A Model for Compiler-Controlled Speculative Execution.", Scott A. Mahlke, William Y. Chen, Roger A. Bringmann, Richard E. Hank, Wen-mei Hwu, B. Ramakrishna Rau, Michael S. Schlansker, ACM Transactions on Computer Systems, Vol. 11, No. 4, Nov. 1993. (Paper of IMPACT - Cited Greater than 100 Times) [more...]

	"Reverse If-Conversion", Nancy J. Warter, Scott A. Mahlke, Wen-mei Hwu, B. Ramakrishna Rau, Proceeding PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation, 1993. (Paper of IMPACT - Cited Greater than 100 Times) [more...]

	"Register Connection: A New Approach to Adding Registers into Instruction Set Architectures.", Tokuzo Kiyohara, Scott A. Mahlke, William Y. Chen, Roger A. Bringmann, Richard E. Hank, Sadun Anik, Wen-mei Hwu, Proceedings of the 20th Annual International Symposium on Computer Architecture, pp. 247-256, San Diego, CA, May 17-19, 1993. [more...]

	"XPROF: An Execution Profiler for Window-oriented Applications.", Aloke Gupta, Wen-mei Hwu, Software, Practice & Experience, John Wiley & Sons Ltd., Vol. 23 (5), pp. 487-510, May 1993 . [more...]

	"The Superblock: An Effective Technique for VLIW and Superscalar Compilation", Wen-mei Hwu, Scott A. Mahlke, William Y. Chen, Pohua P. Chang, Nancy J. Warter, Roger A. Bringmann, Roland G. Ouellette, Richard E. Hank, Tokuzo Kiyohara, Grant E. Haab, John G. Holm, Daniel M. Lavery, Journal of Supercomputing, 1993. (Paper of IMPACT - Cited Greater Than 750 Times) [more...]

	"Performance Aspects of Computers with Graphical User Interfaces.", Aloke Gupta, PhD thesis, Department of Computer Science, University of Illinois, Urbana IL, CRHC-93-09, April 1993. [more...]

	"The Benefit of Predicated Execution for Software Pipelining.", Nancy J. Warter, Daniel M. Lavery, Wen-mei Hwu, Proceedings of the 26th Annual Hawaii Int'l Conference on system Sciences, Wailea, pp 497-506, Hawaii, Jan. 5-8, 1993. [more...]

1992

	"Efficient Instruction Sequencing with Inline Target Insertion.", Wen-mei Hwu, Pohua P. Chang, IEEE Transactions on Computers, Vol. 41, No.12, pp. 1537-1551, Dec. 1992. [more...]

	"Enhanced Modulo Scheduling for Loops with Conditional Branches.", Nancy J. Warter, Grant E. Haab, Krishna Subramanian, John W. Bockhaus, Proceedings of 25th Annual ACM/IEEE Int'l Symposium on Microarchitecture, pp. 170-179, Dec. 1992. (Paper of IMPACT - Cited Greater Than 100 Times) [more...]

	"Code Scheduling for VLIW/Superscalar Processors with Limited Register Files.", Tokuzo Kiyohara, John C. Gyllenhaal, Proceedings of the 25th International Symposium on Microarchitecture, pp. 197-201, Dec. 1992. [more...]

	"Effective Compiler Support for Predicated Execution Using the Hyperblock.", Scott A. Mahlke, David C. Lin, William Y. Chen, Richard E. Hank, Roger A. Bringmann, Proceedings of the 25th International Symposium on Microarchitecture, pp. 45-54, Dec. 1992. (Received Micro Test-of-Time Award / Paper of IMPACT - Cited Greater Than 750 Times) [more...]

	"Compiler Code Transformations for Superscalar-Based High-Performance Systems.", Scott A. Mahlke, William Y. Chen, John C. Gyllenhaal, Wen-mei Hwu, Pohua P. Chang, Tokuzo Kiyohara, Proceedings of Supercomputing 1992, Minneapolis, Minnesota, pp. 808-817, Nov. 16-20, 1992. [more...]

	"Sentinel Scheduling for VLIW and Superscalar Processors.", Scott A. Mahlke, William Y. Chen, Wen-mei Hwu, B. Ramakrishna Rau, Michael S. Schlansker, Proceedings of the Fifth Int'l Conference on Architecture Support for Programming Languages and Operating Systems (ASPLOS-V), Boston, MA, pp.238-247, Oct. 12-15, 1992. (Paper of IMPACT - Cited Greater Than 150 Times) [more...]

	"A Template for Code Generator Development Using the IMPACT-I C Compile.", Roger A. Bringmann, MS thesis, Department of Computer Science, University of Illinois, Urbana IL, Sept. 1992. [more...]

	"Systematic Computer Architecture Prototyping.", Thomas M. Conte, PhD thesis, Department of Computer Science, University of Illinois, Urbana IL, Sept. 1992. [more...]

	"Design and Implementation of a Portable Global Code Optimizer.", Scott A. Mahlke, MS thesis, Department of Computer Science, University of Illinois, Urbana IL, Sept. 1992. [more...]

	"Evaluation of Some Superscalar and VLIW Processor Designs.", John G. Holm, MS thesis, Department of Computer Science, University of Illinois, Urbana IL, Sept. 1992. [more...]

	"Executing Nested Parallel Loops On Shared-Memory Multiprocessors.", Sadun Anik, Wen-mei Hwu, Proceedings of the 21st Annual Int'l Conference on Parallel Processing, pp.(III) 241-244, St. Charles, IL, Aug. 1992. [more...]

	"Tolerating First Level Memory Access Latency In High-Performance Systems.", William Y. Chen, Scott A. Mahlke, Wen-mei Hwu, Proceedings of the 21st Annual Int'l Conference on Parallel Processing, pp.(I) 36-43, St Charles, IL, Aug. 1992 . [more...]

	"Tolerating Data Access Latency with Register Preloading.", William Y. Chen, Scott A. Mahlke, Wen-mei Hwu, Tokuzo Kiyohara, Pohua P. Chang, Proceedings of the 1992 Int'l Conf. on Supercomputing, pp. 104-113, Washington D.C., July, 1992 . [more...]

	"Branch Recovery with Compiler-Assisted Multiple Instruction Retry.", Neal J. Alewine, Shyh-Kwei Chen, Chung-Chi Jim Li, W. Kent Fuchs, Wen-mei Hwu, Proceedings of the 22nd Annual International Symposium on Fault-Tolerant Computing, pp. 66-73, Boston, MA, July 8-10, 1992. [more...]

	"Profile-Guided Automatic Inline Expansion for C Programs.", Pohua P. Chang, Scott A. Mahlke, William Y. Chen, Wen-mei Hwu, Software Practice and Experience, May 1992, Vol. 22, No. 5, pp. 349-369 . (Paper of IMPACT - Cited Greater Than 150 Times) [more...]

	"An Execution Profiler for Window-Oriented Applications", Aloke Gupta, Wen-mei Hwu, Coordinated Science Lab, University of Illinois, Urbana, IL, Technical Report CRHC-92-02, 1992. [more...]

	"Scalar Program Performance on Multiple-Instruction-Issue Processors with a Limited Number of Registers.", Scott A. Mahlke, William Y. Chen, Pohua P. Chang, Wen-mei Hwu, Proceedings of the 25th Annual Hawaii Int'l Conference on System Sciences, pp. 34-44, Jan. 6-9, 1992. [more...]

1991

	"Three Superblock Scheduling Models for Superscalar and Superpipelined Processors.", Pohua P. Chang, Nancy J. Warter, Scott A. Mahlke, William Y. Chen, Wen-mei Hwu, Technical Report CRHC-91-29, Center for Reliable and High-Performance Computing, University of Illinois, Urbana, IL, Dec. 1991. [more...]

	"Using Profile Information to Assist Classic Compiler Code Optimizations.", Pohua P. Chang, Scott A. Mahlke, Wen-mei Hwu, Software Practice and Experience, Vol. 21, No. 12, pp. 1301-1321, Dec. 1991. (Paper of IMPACT - Cited Greater Than 250 Times) [more...]

	"Comparing Static And Dynamic Code Scheduling for Multiple-Instruction-Issue Processors.", Pohua P. Chang, William Y. Chen, Scott A. Mahlke, Wen-mei Hwu, Proceedings of the 24th Annual ACM/IEEE Int'l Symposium on Microarchitecture, pp. 69-73, Albuquerque, New Mexico, Nov. 18-20,1991. [more...]

	"Data Access Microarchitectures for Superscalar Processor with Compiler-Assisted Data Prefetching.", William Y. Chen, Scott A. Mahlke, Pohua P. Chang, Wen-mei Hwu, Proceedings of the 24th Annual ACM/IEEE Int'l Symposium on Microarchitecture, pp. 69-73, Albuquerque, New Mexico, Nov. 1991. (Paper of IMPACT - Cited Greater Than 100 Times) [more...]

	"An Optimizing Compiler Code Generator: A platform for RISC Performance Analysis", William Y. Chen, MS thesis, Department of Computer Science, University of Illinois, Urbana IL, Sept. 1991. [more...]

	"The Effect of Compiler Optimizations On Available Parallelism In Scalar Programs.", Scott A. Mahlke, Nancy J. Warter, William Y. Chen, Pohua P. Chang, Wen-mei Hwu, Proceedings of the 20th Annual Int'l Conference on Parallel Processing, St. pp. 142-145, Charles, IL, Aug. 12-16, 1991. [more...]

	"Performance Implications of Synchronization Support for Parallel Fortran Programs.", Sadun Anik, Wen-mei Hwu, Technical Report CRHC-91-21, Center for Reliable and High-Performance Computing, University of Illinois, Urbana, IL, Jun. 1991. [more...]

	"IMPACT: An Architectural Framework for Multiple-Instruction-Issue Processors.", Pohua P. Chang, Scott A. Mahlke, William Y. Chen, Nancy J. Warter, Wen-mei Hwu, Proceedings of the 18th Annual Int'l Symposium on Computer Architecture, Toronto, Canada, pp. 266-275, May 28, 1991. (Paper of IMPACT - Cited Greater Than 450 Times) [more...]

	"The Effect of Code Expanding Optimizations of Instruction Cache Design.", William Y. Chen, Pohua P. Chang, Thomas M. Conte, Wen-mei Hwu, Technical Report CRHC-91-17, Center for Reliable and High-Performance, university of Illinois, Urbana, IL, May 1991. [more...]

	"Benchmark Characterization.", Thomas M. Conte, Wen-mei Hwu, Proceedings of the 24th Annual Hawaii International Conference on System Sciences, pp. 364-372, Jan. 8-11, 1991. [more...]

	"Compiler Support for Multiple-instruction-issue Architectures", Pohua P. Chang, PhD Thesis. Unversity of Illinois at Urbana-Champaign, 1992.. [more...]

1990

	"A Multiported Nonblocking Cache for a Superscalar Uniprocessor.", Jim E. Sicolo, MS thesis, Department of Computer Science, University of Illinois, Urbana IL, Sept. 1990. [more...]

	"Compiler Support for Predicated Execution in Superscalar Processors.", David C. Lin, MS thesis, Department of Computer Science, University of Illinois, Urbana IL, Sept. 1990 . [more...]

	"Compiler-Assisted Signature Monitoring", Nancy J. Warter, Coordinated Science Laboratory Report no. UILU-ENG-90-2236(1990).. [more...]

	"Benchmark characterization for experimental system evaluation", Thomas M. Conte, Wen-mei Hwu, Proceedings of the Twenty-Third Annual Hawaii International Conference on System Sciences , 1990. [more...]

1989

	"Aggressive Code Improving Techniqures Based on Control Flow Analysis", Pohua P. Chang, Coordinated Science Laboratory Report no. UILU-ENG-89-2228 (1989).. [more...]

	"Inline function expansion for compiling C programs", Pohua P. Chang, Wen-mei Hwu, Proceeding PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation, 1989 . (Paper of IMPACT - Cited Greater than 100 Times) [more...]

	"Control Flow Optimization for Supercomputer Scalar Processing.", Pohua P. Chang, Wen-mei Hwu, Proceedings of the 1989 Int'l Conf. on Supercomputing, Crete, Greece, Jun. 5-9, 1989. [more...]

	"Comparing Software and Hardware Schemes For Reducing the Cost of Branches.", Wen-mei Hwu, Thomas M. Conte, Pohua P. Chang, Proceedings of the 16th Annual International Symposium on Computer Architecture, Jerusalem, Israel, pp. 224-233, May 28- June 1, 1989. [more...]

1988

	"Exploiting parallel microprocessor microarchitectures with a compiler code generator", Wen-mei Hwu, Pohua P. Chang, Proceeding ISCA '88 Proceedings of the 15th Annual International Symposium on Computer architecture, 1988. [more...]

	"Trace selection for compiling large C application programs to microcode", Pohua P. Chang, Wen-mei Hwu, Proceeding MICRO 21 Proceedings of the 21st annual workshop on Microprogramming and microarchitecture, 1988. (Paper of IMPACT - Cited Greater than 100 Times) [more...]

	"The Simulation and Tuning of the Global Memory Subsystem of a Multiprocessor", Thomas M. Conte, MS thesis. University of Illinois at Urbana-Champaign, 1988.. [more...]

1987

	"Checkpoint Repair for High-Performance Out-of-Order Execution Machines", Wen-mei Hwu, Yale N. Patt, Computers, IEEE Transactions on, 1987. (Paper of IMPACT - Cited Greater than 150 Times) [more...]

	"Checkpoint repair for out-of-order execution machines", Wen-mei Hwu, Yale N. Patt, Proceeding ISCA '87 Proceedings of the 14th annual international symposium on Computer architecture, 1987. (Paper of IMPACT - Cited Greater than 100 Times) [more...]

1986

	"HPSm, a high performance restricted data flow architecture having minimal functionality", Wen-mei Hwu, Yale N. Patt, Proceeding ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture, June, 1986 . (Paper of IMPACT - Cited Greater than 100 Times) [more...]

1985

	"Critical issues regarding HPS, a high performance microarchitecture", Yale N. Patt, Stephen W. Melvin, Wen-mei Hwu, Michael Shebanow, Proceeding MICRO 18 Proceedings of the 18th annual workshop on Microprogramming, 1985. [more...]

	"HPS, a new microarchitecture: rationale and introduction", Yale N. Patt, Wen-mei Hwu, Michael Shebanow, Proceedings of the 18th annual workshop on Microprogramming, MICRO 18, 1985. (Paper of IMPACT - Cited Greater than 150 Times) [more...]