[PATCH v2 0/3] io-pgtable-arm + drm/msm: Extend iova fault debugging

From: Rob Clark
Date: Tue Oct 05 2021 - 11:11:57 EST


From: Rob Clark <robdclark@xxxxxxxxxxxx>

This series extends io-pgtable-arm with a method to retrieve the page
table entries traversed in the process of address translation, and then
beefs up the drm/msm gpu devcore dump to include this (and additional
info).
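
To illustrate the idea of recording the entries traversed during a walk,
here is a minimal user-space sketch. It is not the code from this series:
the toy_* names, the 3-level/9-bit layout, and the pool-index encoding of
table addresses are all assumptions made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy layout (an assumption for this sketch): 4K pages, 3 levels, 9 bits each. */
#define TOY_LEVELS      3
#define TOY_BITS        9
#define TOY_ENTRIES     (1u << TOY_BITS)
#define TOY_PAGE_SHIFT  12
#define TOY_PTE_VALID   0x1ull
#define TOY_ADDR_MASK   0x0000fffffffff000ull
#define TOY_MAX_TABLES  8

/* Tables live in a pool; a table PTE's address field holds (pool index << 12). */
static uint64_t toy_pool[TOY_MAX_TABLES][TOY_ENTRIES];
static unsigned toy_used = 1;	/* table 0 is the root */

/* Hypothetical result struct: the raw entry read at each level of the walk. */
struct toy_walk {
	uint64_t ptes[TOY_LEVELS];
	int num_ptes;		/* how many levels were visited */
};

static unsigned toy_index(uint64_t iova, int level)
{
	int shift = TOY_PAGE_SHIFT + TOY_BITS * (TOY_LEVELS - 1 - level);

	return (iova >> shift) & (TOY_ENTRIES - 1);
}

/* Map one page, allocating intermediate tables from the pool as needed. */
static int toy_map(uint64_t iova, uint64_t paddr)
{
	unsigned table = 0;

	for (int level = 0; level < TOY_LEVELS - 1; level++) {
		uint64_t *pte = &toy_pool[table][toy_index(iova, level)];

		if (!(*pte & TOY_PTE_VALID)) {
			if (toy_used == TOY_MAX_TABLES)
				return -1;
			*pte = ((uint64_t)toy_used++ << TOY_PAGE_SHIFT) |
			       TOY_PTE_VALID;
		}
		table = (*pte & TOY_ADDR_MASK) >> TOY_PAGE_SHIFT;
	}
	toy_pool[table][toy_index(iova, TOY_LEVELS - 1)] =
		(paddr & TOY_ADDR_MASK) | TOY_PTE_VALID;
	return 0;
}

/*
 * Walk the tables for iova, recording every entry read along the way.
 * Returns 0 if the walk reached a valid leaf, -1 if it stopped at an
 * invalid entry. The invalid entry is still recorded, since that is
 * usually the interesting one when debugging a fault.
 */
static int toy_walk_iova(uint64_t iova, struct toy_walk *w)
{
	unsigned table = 0;

	memset(w, 0, sizeof(*w));
	for (int level = 0; level < TOY_LEVELS; level++) {
		uint64_t pte = toy_pool[table][toy_index(iova, level)];

		w->ptes[w->num_ptes++] = pte;
		if (!(pte & TOY_PTE_VALID))
			return -1;
		table = (pte & TOY_ADDR_MASK) >> TOY_PAGE_SHIFT;
	}
	return 0;
}
```

A caller would toy_map() a page, then call toy_walk_iova() on the faulting
address and dump w.ptes[0..num_ptes-1]. A walk that stops early shows at
which level the translation broke.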

The motivation is tracking down an obscure crash triggered by an iova
fault on the address of the IB1 cmdstream. This is one of the few places
where the GPU address written into the cmdstream is solely under control
of the kernel mode driver, so I don't think it can be a userspace bug.
The logged cmdstream from the devcores I've looked at looks correct, and
the TTBR0 read back from arm-smmu agrees with the kernel emitted cmdstream.
Unfortunately it happens infrequently enough (something like once per
1000hrs of usage, from what I can tell from our telemetry) that actually
reproducing it with an instrumented debug kernel is not an option. So
further spiffing up the devcore dumps and hoping we can spot a clue is
the plan I'm shooting for.

See https://gitlab.freedesktop.org/drm/msm/-/issues/8 for more info on
the issue I'm trying to debug.

v2: Fix an armv7/32b build error in the last patch

Rob Clark (3):
  iommu/io-pgtable-arm: Add way to debug pgtable walk
  drm/msm: Show all smmu info for iova fault devcore dumps
  drm/msm: Extend gpu devcore dumps with pgtbl info

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  2 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 35 +++++++++++++++++-----
 drivers/gpu/drm/msm/msm_gpu.c           | 10 +++++++
 drivers/gpu/drm/msm/msm_gpu.h           | 10 ++++++-
 drivers/gpu/drm/msm/msm_iommu.c         | 17 +++++++++++
 drivers/gpu/drm/msm/msm_mmu.h           |  2 ++
 drivers/iommu/io-pgtable-arm.c          | 40 ++++++++++++++++++++-----
 include/linux/io-pgtable.h              |  9 ++++++
 8 files changed, 107 insertions(+), 18 deletions(-)

--
2.31.1