Papers: Tensor Core

Last updated May 14, 2023

What’s a Tensor Core (TC)? It’s an ASIC integrated into general-purpose GPUs (GPGPUs), designed to accelerate the GEMM workloads that make up a large portion of machine learning applications. However, because there are obstacles to exploiting TCs effectively in CUDA, programmers struggle to use them to speed up their applications.

# Dissections & Microbenchmarks

# TC with Intra-SM Parallelism

# GEMM / Scientific / DL App. with TC

# GNN with TC