This page records my notes and understanding from reading and modifying the amdgpu kernel module source code and the corresponding documents.

Commands are delivered from the thin user-space layer roct to the amdgpu kernel module through the ioctl system call with specific arguments. The GPU is treated as a character device [1], which communicates with the OS driver through a stream of characters (rather than the blocks of data used by block devices).

Glossary

This section records the meaning of the notations and abbreviations that appear in the kernel documents.

  • GEM: Graphics Execution Manager, initially developed by Intel; used for device memory management [2] with a focus on simplicity
  • TTM: Translation Table Manager, another device memory manager [2], suitable for both UMA devices and devices with dedicated RAM

GEM vs. TTM

Note that GEM and TTM are both driver memory management approaches. GEM, however, does not support dedicated memory addressing (it supports UMA only), so it is usually used as the front end of the user-space API. When it comes to isolated memory such as VRAM, TTM is employed, despite complaints from developers about its “one-size-fits-all” design philosophy.

  • BO: buffer object
  • IRQ: interrupt request [3]
  • DRM: Direct Rendering Manager [4][5], the command interface between user space and the driver, consisting of two parts: KMS (kernel mode setting) and rendering
  • KCL: kernel compatibility layer, used to adapt the module to different kernel versions (i.e., the ever-changing kernel interfaces)
  • KFD: kernel fusion driver, used for ROCm computing programs; it has been merged into the amdgpu DRM driver [6][7]
  • GIM: GPU-IOV module, a kernel module for hardware virtualization on AMD MxGPU [8]
  • XCP: a virtualization-related module component, outside the scope of these notes
  • KGD: kernel graphics driver
  • SDMA: system DMA

Kernel module components

The image above shows the built components and the dependencies among them, as viewed via lsmod | grep amd.

Memory domains

Several memory domains specify the attributes of a memory region:

  • AMDGPU_GEM_DOMAIN_CPU: not accessible by the GPU; may be swapped out
  • AMDGPU_GEM_DOMAIN_GTT: GPU accessible host memory, mapped to the GPU virtual address space
  • AMDGPU_GEM_DOMAIN_VRAM: memory located in the GPU-attached video memory
  • AMDGPU_GEM_DOMAIN_GDS: global data sharing, used to share data across threads
  • AMDGPU_GEM_DOMAIN_GWS: global wave sync, used to synchronize wavefront executions on a device
  • AMDGPU_GEM_DOMAIN_OA: ordered append
  • AMDGPU_GEM_DOMAIN_DOORBELL: an MMIO region for signaling user mode queues
  • AMDGPU_GEM_DOMAIN_DGMA: direct GPU memory access, used for peer PCIe devices [9]

Module settings

Module parameters and configuration values affect both the conditionally compiled branches and the runtime behavior.

Parameters

To check the configurable amdgpu kernel module parameters:

modinfo amdgpu | grep "parm:"

Following the sysfs conventions in Linux, the current parameter values can also be checked under:

/sys/module/amdgpu/parameters
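
Each file in that directory holds the current value of one parameter, so printing them side by side takes only a short loop. The following is a minimal sketch rather than part of any amdgpu tooling; the function name list_module_params and its optional directory argument are my own additions.

```shell
# Print every module parameter as "name=value".
# The optional first argument (a hypothetical convenience) selects another
# parameters directory; it defaults to the amdgpu path given above.
list_module_params() {
    dir="${1:-/sys/module/amdgpu/parameters}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue      # the glob matched nothing
        # Some parameters are readable by root only; report them instead of failing.
        if value=$(cat "$f" 2>/dev/null); then
            printf '%s=%s\n' "${f##*/}" "$value"
        else
            printf '%s=<unreadable>\n' "${f##*/}"
        fi
    done
}
```

Calling list_module_params without arguments on a machine with amdgpu loaded lists the live values; the names match those reported by modinfo.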

Configs

Some configurations are also fixed when the kernel is built or at system boot; check them via:

cat /boot/config-$(uname -r)
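
The full config file is long, so it helps to narrow it down to the GPU-related options. A small sketch, assuming that the CONFIG_DRM_AMDGPU* (amdgpu driver) and CONFIG_HSA_AMD* (KFD) prefixes are the ones of interest; the function name show_amdgpu_config is hypothetical.

```shell
# Print the amdgpu/KFD related options from a kernel config file.
# Disabled options appear as "# CONFIG_X is not set", so both forms are matched.
# Restricting the search to these two prefixes is an assumption of this sketch.
show_amdgpu_config() {
    config="${1:-/boot/config-$(uname -r)}"
    grep -E '^(# )?CONFIG_(DRM_AMDGPU|HSA_AMD)' "$config"
}
```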

Build amdgpu kernel module from source

To retrieve the source, go to the Ubuntu package repository maintained by AMD, https://repo.radeon.com/amdgpu/latest/ubuntu/pool/main/a/amdgpu-dkms, where it is published as distro release packages. Then download the amdgpu-dkms*.deb file and extract it via dpkg-deb; the source code it contains is managed by dkms.

Another way

Note that if you have already installed the amdgpu driver via the amdgpu-install script, the related source code is also located at /usr/src/amdgpu-*, and the firmware files are placed in /lib/firmware/amdgpu and /lib/firmware/updates/amdgpu.

From its name, we can infer that amdgpu-dkms is built via dkms:

dkms build -m amdgpu -v 6.3.6-1697589.22.04 --sourcetree=`pwd` --dkmstree=/path/to/tmp/build

dkms looks up the headers of the currently running kernel, and the --dkmstree option specifies the output path for the build artifacts.

Trace and debug the amdgpu kernel module

ftrace is a tracing tool used to intercept kernel functions over a period of time [10][11].

To use ftrace, tracefs (or, on some systems, debugfs; both are mounted under /sys) has to be mounted on your system, and you can then read and edit the trace configuration through its file interface. One convenient alternative is to install the trace-cmd utility [12] offered by mainstream distros, which lets users drive ftrace from a single CLI rather than manipulating scattered files.

To effectively inspect which functions are running inside the kernel (or a module/driver), it’s common to combine ftrace with a function filter so that only the information you care about is recorded.

Warning: trace vs. trace_pipe

There are two result files in tracefs, trace and trace_pipe. The former is persistent: users can read it multiple times and get identical trace contents. The latter follows a producer-consumer protocol, so its content is consumed once read. It’s also worth noting that reading trace pauses kernel tracing (on recent kernels only when the pause-on-trace option is set), while reading trace_pipe does not.

ftrace use cases

There is an awesome introduction and walk-through tutorial [13] presented by Steven Rostedt about the basic usage of ftrace. Anyone who cares about kernel module development should have a look.

Task filter

Usually we only care about the executable we are going to start next. In this scenario, shell built-ins are enough:

# Run as root from the tracefs directory; with "sudo echo ... > file" the
# redirection is still performed by the unprivileged shell and fails.
echo 0 > tracing_on
echo $$ >> set_ftrace_pid; echo 1 > tracing_on; exec a.out

Note that exec replaces the shell with the program you start, which therefore keeps the shell's PID; any command appended after the exec statement will never run.
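
Because exec takes over the current shell, it can be more comfortable to do the same thing in a throwaway child shell. Below is a minimal sketch of that idea, not an established tool: trace_task and the TRACEFS variable are hypothetical names, /sys/kernel/tracing is assumed as the tracefs mount point, and set_ftrace_pid is overwritten rather than appended to.

```shell
# Trace one command without giving up the interactive shell.
# TRACEFS is a hypothetical variable name; /sys/kernel/tracing is the usual
# tracefs mount point, and writing to it requires root.
trace_task() {
    TRACEFS="${TRACEFS:-/sys/kernel/tracing}" sh -c '
        set -e                                # stop if a tracefs write fails
        echo 0 > "$TRACEFS/tracing_on"        # pause while reconfiguring
        echo $$ > "$TRACEFS/set_ftrace_pid"   # $$ is the PID of this inner sh
        echo 1 > "$TRACEFS/tracing_on"
        exec "$@"                             # the command takes over that PID
    ' trace_task "$@"
}
```

For example, trace_task ./a.out (run as root) traces only that program, and the calling shell is still there afterwards to read the trace file.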

Function graph

The following steps dump the function graph of the kernel functions in modules whose names start with amd.

echo 0 > tracing_on
echo > trace
 
echo function_graph > current_tracer
echo '*:mod:amd*' > set_ftrace_filter
 
echo $$ >> set_ftrace_pid; echo 1 > tracing_on; exec a.out   # run as root; "sudo echo" does not elevate the redirection

The following is the function call chain collected via the ftrace function graph tracer:

kfd_ioctl [amdgpu]() {
  kfd_ioctl_alloc_memory_of_gpu [amdgpu]() {
    svm_range_list_lock_and_flush_work [amdgpu]();
    kfd_process_device_data_by_id [amdgpu]();
    kfd_bind_process_to_device [amdgpu]();
    amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu [amdgpu]() {
      amdgpu_sync_create [amdgpu]();
      amdgpu_amdkfd_reserve_mem_limit [amdgpu]();
      amdgpu_gem_object_create [amdgpu]() {
        amdgpu_bo_create_user [amdgpu]() {
          amdgpu_bo_create [amdgpu]() {
            amdgpu_bo_placement_from_domain [amdgpu]();
            amdttm_bo_init_reserved [amdttm]() {
              amdttm_bo_validate [amdttm]() {
                amdttm_bo_mem_space [amdttm]() {
                  amddma_resv_reserve_fences [amdkcl]() {
                    dma_resv_list_alloc [amdkcl]();
                  }
                  ttm_resource_alloc [amdttm]() {
                    ttm_sys_man_alloc [amdttm]() {
                      amdttm_resource_init [amdttm]();
                    }
                    ttm_resource_add_bulk_move [amdttm]();
                  }
                  ttm_bo_add_move_fence.constprop.0 [amdttm]();
                }
                ttm_bo_handle_move_mem [amdttm]() {
                  ttm_mem_io_free [amdttm]();
                  ttm_tt_create [amdttm]() {
                    amdgpu_ttm_tt_create [amdgpu]() {
                      amdttm_sg_tt_init [amdttm]();
                    }
                  }
                  amddma_resv_reserve_fences [amdkcl]();
                  amdgpu_bo_move [amdgpu]() {
                    amdttm_resource_free [amdttm]();
                    amdgpu_bo_move_notify [amdgpu]() {
                      amdgpu_vm_bo_invalidate [amdgpu]();
                    }
                  }
                }
                ttm_tt_create [amdttm]();
              }
            }
            amdgpu_cs_report_moved_bytes [amdgpu]();
            amdttm_bo_move_to_lru_tail [amdttm]() {
              ttm_resource_move_to_lru_tail [amdttm]();
            }
          }
        }
      }
      amdgpu_ttm_tt_set_userptr [amdgpu]();
      amdgpu_hmm_register [amdgpu]();
      amdgpu_ttm_tt_get_user_pages [amdgpu]() {
        amdgpu_hmm_range_get_pages [amdgpu]();
      }
      amdgpu_bo_placement_from_domain [amdgpu]();
      amdttm_bo_validate [amdttm]() {
        ttm_resource_compat [amdttm]() {
          ttm_resource_places_compat [amdttm]();
        }
        amdttm_bo_mem_space [amdttm]() {
          amddma_resv_reserve_fences [amdkcl]();
          ttm_resource_alloc [amdttm]() {
            amdgpu_preempt_mgr_new [amdgpu]() {
              amdttm_resource_init [amdttm]();
            }
            ttm_resource_add_bulk_move [amdttm]();
          }
          ttm_bo_add_move_fence.constprop.0 [amdttm]();
        }
        ttm_bo_handle_move_mem [amdttm]() {
          ttm_mem_io_free [amdttm]();
          ttm_tt_create [amdttm]();
          amdttm_tt_populate [amdttm]() {
            amdgpu_ttm_tt_populate [amdgpu]();
          }
          amddma_resv_reserve_fences [amdkcl]();
          amdgpu_bo_move [amdgpu]() {
            amdttm_resource_free [amdttm]() {
              ttm_resource_del_bulk_move [amdttm]();
              ttm_sys_man_free [amdttm]() {
                amdttm_resource_fini [amdttm]();
              }
            }
            amdgpu_bo_move_notify [amdgpu]() {
              amdgpu_vm_bo_invalidate [amdgpu]();
            }
          }
        }
      }
      amdttm_bo_move_to_lru_tail [amdttm]() {
        ttm_resource_move_to_lru_tail [amdttm]();
      }
      amdgpu_ttm_tt_get_user_pages_done [amdgpu]() {
        amdgpu_hmm_range_get_pages_done [amdgpu]();
      }
    }
    kfd_process_device_create_obj_handle [amdgpu]();
  }
}
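
To see at a glance which modules take part in a call chain like the one above, the calls can be tallied by their [module] tag. A minimal sketch, assuming the tracer output is piped in on stdin; count_calls_by_module is a hypothetical helper name.

```shell
# Count traced calls per kernel module in function_graph output (stdin).
# Every call line carries a "[module]" tag; closing-brace lines do not.
# Usage (hypothetical): count_calls_by_module < /sys/kernel/tracing/trace
count_calls_by_module() {
    awk 'match($0, /\[[A-Za-z0-9_]+\]/) {
             # strip the surrounding brackets from the matched tag
             count[substr($0, RSTART + 1, RLENGTH - 2)]++
         }
         END { for (m in count) printf "%d %s\n", count[m], m }' |
        sort -rn
}
```

The output is one "count module" line per module, with the busiest module first.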

Stack trace

The function graph only shows which routines the traced function calls; it does not tell us who calls the function of interest. The function stack trace helps here.

Set function filter before enabling stack trace

If the stack trace is enabled without a function filter, a stack dump happens every time any function in the kernel is called, which can make the system hang. Always check the content of set_ftrace_filter before enabling the stack trace.

echo nop > current_tracer
echo schedule > set_ftrace_filter
cat set_ftrace_filter
 
echo 1 > options/func_stack_trace
echo function > current_tracer
bash -c 'echo $$ >> set_ftrace_pid; echo 1 > tracing_on; exec a.out'   # single quotes: $$ must expand in the inner bash
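
The filter check can also be automated so that the stack trace is never switched on by accident. The sketch below is only an illustration: enable_stack_trace is a hypothetical helper, and it relies on the fact that an unfiltered set_ftrace_filter reads back as the comment line "#### all functions enabled ####".

```shell
# Enable func_stack_trace only when a function filter is in place.
# With no filter, set_ftrace_filter contains just a "#### ... ####" comment,
# so we look for at least one line that is neither a comment nor blank.
enable_stack_trace() {
    tracefs="${1:-/sys/kernel/tracing}"
    if ! grep -qvE '^(#|[[:space:]]*$)' "$tracefs/set_ftrace_filter" 2>/dev/null; then
        echo "refusing: set_ftrace_filter has no function filter" >&2
        return 1
    fi
    echo 1 > "$tracefs/options/func_stack_trace"
    echo function > "$tracefs/current_tracer"
}
```

Run it as root after writing the filter, for example after echo schedule > set_ftrace_filter.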

Trace for forked tasks

amdgpu module concepts and data structures

User space interfaces

The GPU is presented as a character device on Linux, and user programs interact with the driver via the ioctl syscall together with a request code and its argument structure. For example, when allocating GPU memory, ioctl is called with AMDKFD_IOC_ALLOC_MEMORY_OF_GPU and a kfd_ioctl_alloc_memory_of_gpu_args structure is passed.

struct kfd_ioctl_alloc_memory_of_gpu_args {
	__u64 va_addr;		/* to KFD */
	__u64 size;		/* to KFD */
	__u64 handle;		/* from KFD */
	__u64 mmap_offset;	/* to KFD (userptr), from KFD (mmap offset) */
	__u32 gpu_id;		/* to KFD */
	__u32 flags;
};

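Here the flags field specifies the requested memory type and access attributes. As the field comments indicate, va_addr is the virtual address supplied by user space to KFD, while handle (and, for mmap-able allocations, mmap_offset) is returned from KFD to the user.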

Obtain the PA of a BO

To work out how a BO is converted to its physical address (PA) and back, we trace the execution of hipFree, during which the user pointer is passed to the driver.

Footnotes

  1. https://linux-kernel-labs.github.io/refs/heads/master/labs/device_drivers.html
  2. https://docs.kernel.org/gpu/drm-mm.html
  3. https://docs.kernel.org/core-api/irq/concepts.html
  4. https://docs.kernel.org/gpu/drm-internals.html
  5. https://en.wikipedia.org/wiki/Direct_Rendering_Manager
  6. https://www.phoronix.com/news/AMDKFD-Merge-Into-AMDGPU
  7. https://lists.freedesktop.org/archives/amd-gfx/2018-July/023673.html
  8. https://www.amd.com/en/graphics/workstation-virtual-graphics
  9. https://www.activesilicon.com/products/directgma
  10. https://docs.kernel.org/trace/ftrace-design.html
  11. (Chinese) https://richardweiyang-2.gitbook.io/kernel-exploring/00-index-3/04-ftrace_internal
  12. https://man7.org/linux/man-pages/man1/trace-cmd.1.html
  13. https://www.youtube.com/watch?v=2ff-7UTg5rE