ECE 412 Computer Architecture, Lectures 20 and 21: Vector Processing and Vectorization
Lecture 20: Vector Processing
Vector Processing Implementations
Motivation
Vector Processing Unit (CRAY-1)
Vector Code for a Register-to-Register Vector Architecture
Vector Instructions: Basic LOAD and STORE
Vector Instructions: GATHER and SCATTER
Vector Instructions: INDIRECT-LOAD, INDIRECT-STORE
The Mask Vector
Vector Register Access Delay
Interleaved Vector Access
Vector Control Setup Delay
Vector Result Forwarding / Chaining
Chaining Example
Amdahl’s Law
Lecture 21: Vectorization
Loop-Carried Dependence
Basic Transformation for Vectorization
Loop Distribution Example
Vectorization of Loop Distributed Code
Problem
Question
Converting Single-Statement Loops into Vector Statements
Loop Interchange
Other Assisting Transformations
Vector Instruction Generation: Using vector registers of limited length
Loop sectioning example (cont.)
Performance Implications of Loop Sectioning
If-Conversion Using the Mask Register
Email: sias@crhc.uiuc.edu