Chai: Collaborative Heterogeneous Applications for Integrated-architectures
  Juan Gómez-Luna, Izzat El Hajj, Li-Wen Chang, Victor Garcia-Flores, Simon Garcia de Gonzalo, Thomas B. Jablin, Antonio J. Peña, Wen-mei Hwu
  IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2017

Heterogeneous system architectures are evolving towards tighter integration among devices, with emerging features such as shared virtual memory, memory coherence, and system-wide atomics. Languages, device architectures, system specifications, and applications are rapidly adapting to the challenges and opportunities of tightly integrated heterogeneous platforms. Programming languages such as OpenCL 2.0, CUDA 8.0, and C++ AMP allow programmers to exploit these architectures for productive collaboration between CPU and GPU threads. To evaluate these new architectures and programming languages, and to empower researchers to experiment with new ideas, a suite of benchmarks targeting these architectures with close CPU-GPU collaboration is needed.

In this paper, we classify applications that target heterogeneous architectures into generic collaboration patterns including data partitioning, fine-grain task partitioning, and coarse-grain task partitioning. We present Chai, a new suite of 14 benchmarks that cover these patterns and exercise different features of heterogeneous architectures with varying intensity. Each benchmark in Chai is implemented in five different programming models: OpenCL 2.0, OpenCL 1.2, C++ AMP, CUDA 7.5, and CUDA-Sim. We characterize the behavior of each benchmark with respect to varying input sizes and combinations of heterogeneity, and evaluate the impact of using the emerging features of heterogeneous architectures on application performance.