Profiling and Characterization of Deep Learning Model Inference on CPU
Publication Year:
  Yanli Qian
  Master's Thesis

With the rapid growth of deep learning models and rising expectations for their accuracy and throughput in real-world applications, the demand for profiling and characterizing model inference on different hardware/software stacks has increased significantly. Because model inference on GPUs has already been studied extensively, it is worth exploring how performance-enhancing libraries such as Intel MKL-DNN boost performance on Intel CPUs. We develop a profiling mechanism that captures MKL-DNN operation calls and assembles them into a tracing timeline of spans on the server. Through profiling and characterization that give insight into Intel MKL-DNN, we evaluate and demonstrate that the optimization techniques used in deep learning model inference, including blocked memory layout, layer fusion, and low-precision operations, accelerate inference on Intel CPUs.
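To make the blocked memory layout concrete, the following is a minimal sketch, not code from the thesis or from MKL-DNN. It assumes a single image, a channel count divisible by the block size, and a block of 8 channels (the nChw8c-style layout); the function names `nchw_offset`, `blocked_offset`, and `to_blocked` are hypothetical helpers introduced only for illustration.

```python
def nchw_offset(c, h, w, H, W):
    """Flat offset of element (c, h, w) in plain NCHW order for one image."""
    return (c * H + h) * W + w


def blocked_offset(c, h, w, H, W, block=8):
    """Flat offset of (c, h, w) in a channel-blocked layout.

    Channels are split into groups of `block`; within a group, the
    `block` channel values of one pixel are stored contiguously.
    """
    outer, inner = divmod(c, block)
    return ((outer * H + h) * W + w) * block + inner


def to_blocked(data, C, H, W, block=8):
    """Reorder a flat NCHW list into the channel-blocked layout."""
    # Assumption for this sketch: no padding of a partial last block.
    assert C % block == 0, "sketch assumes C is a multiple of the block size"
    out = [None] * (C * H * W)
    for c in range(C):
        for h in range(H):
            for w in range(W):
                src = nchw_offset(c, h, w, H, W)
                dst = blocked_offset(c, h, w, H, W, block)
                out[dst] = data[src]
    return out


# Example: 16 channels of a 2x2 image, values equal to their NCHW offsets.
C, H, W = 16, 2, 2
data = list(range(C * H * W))
blocked = to_blocked(data, C, H, W)
```

In the blocked result, the first eight entries are the values of channels 0 to 7 at pixel (0, 0), so a vectorized kernel can load eight channels of one pixel with a single contiguous read instead of eight strided reads.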