Tensor Core Compiler

Team:

Description:

Deep learning's reliance on matrix multiplication (GEMM) for compute has driven both research and industry to develop matrix-multiplication accelerator hardware, collectively called Tensor Core Units (TCUs) in this paper. TCUs are designed to accelerate Multilayer Perceptrons (MLP), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN), or Deep Neural Networks (DNN) in general. TCUs come under the guise of different marketing terms, be it NVIDIA's Tensor Cores [54], Google's Tensor Processing Unit, Intel's DLBoost, Apple A11's Neural Engine, Tesla's HW3, or ARM's ML Processor. They vary in the underlying hardware implementation and are prevalent in both cloud and edge devices. More information is available at https://tcu.c3sr.com

Related papers:
	"Accelerating Reduction and Scan Using Tensor Core Units", Abdul Dakkak, Cheng Li, Jinjun Xiong, Wen-mei Hwu, ICS 2019: International Conference on Supercomputing, June 26-28, Phoenix AZ.
