When building ROCm from source, the term xnack usually appears next to the GPU architecture code name with a + or - suffix (e.g., gfx908:xnack-).
xnack (pronounced "X-knack") controls whether page migration happens when a memory fault occurs during GPU kernel execution.
When it is disabled (xnack-), the accessed page stays on the host side (CPU DRAM) and behaves like pinned CPU memory.
When it is enabled (xnack+), the page is migrated automatically right after the GPU page fault to improve locality.
Subsequent accesses are then served from GPU VRAM.
xnack in compiler
hipcc has an option, --offload-arch, which takes the target ID [1] (e.g., sm_80 for NVIDIA Ampere and gfx1030:xnack- for AMD Navi21, where gfx1030 is the processor and xnack- is the optional target feature).
When an AMD GPU is the compilation target, an xnack suffix followed by + or - can be appended to indicate whether xnack is enabled.
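As a minimal sketch of how such a target ID is put together, the helper below assembles the string from a processor name and an optional xnack setting, then prints the resulting hipcc command line instead of running it. The function names target_id and hipcc_cmd and the source file saxpy.hip are hypothetical placeholders, not part of ROCm.

```shell
#!/usr/bin/env bash
# Build a target ID such as "gfx908:xnack+" from a processor name and an
# optional xnack setting ("+", "-", or empty for "not specified").
# target_id is a hypothetical helper, not a ROCm tool.
target_id() {
    local processor="$1"
    local xnack="${2:-}"
    case "$xnack" in
        "")  printf '%s\n' "$processor" ;;
        +|-) printf '%s:xnack%s\n' "$processor" "$xnack" ;;
        *)   echo "xnack must be '+', '-', or empty" >&2; return 1 ;;
    esac
}

# Print (do not run) the hipcc command line for a given target ID.
# saxpy.hip is a placeholder source file name.
hipcc_cmd() {
    local id
    id="$(target_id "$@")" || return 1
    printf 'hipcc --offload-arch=%s saxpy.hip -o saxpy\n' "$id"
}

hipcc_cmd gfx908 +   # xnack explicitly enabled
hipcc_cmd gfx908 -   # xnack explicitly disabled
hipcc_cmd gfx908     # xnack not specified
```

Printing the command rather than executing it keeps the sketch usable on machines without a ROCm installation.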
Note
The current version of LLVM emits code object V4 for AMD GPU kernels [1].
Under the default code object V4, a GPU kernel compiled without an xnack setting runs on devices regardless of whether xnack is supported or enabled.
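The compatibility rule in this note can be sketched as a small decision function. The first case (no xnack setting runs anywhere) comes from the note above. The second case, that an explicit xnack+ or xnack- must match the device mode, is an assumption added here for illustration. The function name xnack_compatible is hypothetical.

```shell
#!/usr/bin/env bash
# Decide whether a kernel built with a given xnack setting can run on a device.
#   $1: xnack setting in the target ID: "+", "-", or "" (not specified)
#   $2: xnack mode of the device: "on" or "off"
# xnack_compatible is a hypothetical helper for illustration only.
xnack_compatible() {
    case "$1:$2" in
        :on|:off)   return 0 ;;  # not specified: runs in either mode
        +:on|-:off) return 0 ;;  # assumption: explicit setting must match
        *)          return 1 ;;
    esac
}

# Print the full compatibility table.
for setting in "" + -; do
    for mode in on off; do
        if xnack_compatible "$setting" "$mode"; then
            result="runs"
        else
            result="rejected"
        fi
        printf '%s\t%s\t%s\n' "${setting:-unspecified}" "$mode" "$result"
    done
done
```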
Software requirements
xnack is available in recent compute GPUs (CDNA series) and is deprecated since RDNA2 (i.e., gfx10 and later) [2][1].
Beyond this hardware constraint, some software requirements must be satisfied to enable xnack on supported GPUs:
- Linux kernel with HMM
- Set the module parameter: echo 'options amdgpu noretry=0' | sudo tee /etc/modprobe.d/amdgpu.conf, and reload the amdgpu kernel module. Check the value with cat /sys/module/amdgpu/parameters/noretry (-1 in Ubuntu 22.04 by default)
- Set HSA_XNACK=1, since it is disabled by default
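The last two requirements can be checked with a short script along these lines. This is only a sketch: the function names noretry_value and hsa_xnack_enabled and the NORETRY_PATH override are hypothetical, and the script does not check the kernel's HMM support.

```shell
#!/usr/bin/env bash
# Path of the amdgpu module parameter; NORETRY_PATH is a hypothetical override
# so the script can be pointed at another file.
NORETRY_PATH="${NORETRY_PATH:-/sys/module/amdgpu/parameters/noretry}"

# Print the current noretry value, or "unavailable" if the file cannot be read
# (for example, when the amdgpu module is not loaded).
noretry_value() {
    if [ -r "$NORETRY_PATH" ]; then
        cat "$NORETRY_PATH"
    else
        echo "unavailable"
    fi
}

# Succeed only if HSA_XNACK is set to 1 in the current environment.
hsa_xnack_enabled() {
    [ "${HSA_XNACK:-0}" = "1" ]
}

echo "amdgpu noretry: $(noretry_value)"
if hsa_xnack_enabled; then
    echo "HSA_XNACK=1 is set"
else
    echo "HSA_XNACK is not set to 1; try: export HSA_XNACK=1"
fi
```

HSA_XNACK is read from the environment of the process that launches the GPU application, so it needs to be exported in that same shell or set on the command line of the application.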