Re: [PATCH] amd/iommu: fix logic bug in amd_iommu_report_page_fault()

From: Luigi Rizzo
Date: Mon Aug 02 2021 - 10:31:11 EST


On Mon, Aug 2, 2021 at 4:13 PM Joerg Roedel <joro@xxxxxxxxxx> wrote:
>
> On Sat, Jul 31, 2021 at 12:26:37PM -0700, Luigi Rizzo wrote:
> > amd_iommu_report_page_fault() has two print paths, depending on whether or
> > not it can find a pci device. But the code erroneously enters the second
> > path if the rate limiter in the first path triggers:
> > if (dev_data && ratelimit(A)) { A; } else if (ratelimit(B)) { B; }
> > The correct code should be
> > if (dev_data) { if (ratelimit(A)) { A;} } else if (ratelimit(B)) { B; }
> >
> > Signed-off-by: Luigi Rizzo <lrizzo@xxxxxxxxxx>
> > ---
> > drivers/iommu/amd/iommu.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
>
> Thanks, but I queued this patch already:
>
> https://lore.kernel.org/r/YPgk1dD1gPMhJXgY@xxxxxxxxxxxxxx


Ah didn't realize that. Thank you!

Two questions on the topic:
1. how comes only the AMD driver is so verbose in reporting io page faults?
Neither intel nor other iommu drivers seem to log anything

2. Would it make sense to have a control to disable such logging,
either per-device or globally? Eg something like this (negative
logic so it must be set explicitly to disable logging).


@ -985,6 +985,7 @@ struct device {
defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
bool dma_coherent:1;
#endif
+ bool no_log_fault:1;

...
+ if (!pdev->dev.no_log_fault && __ratelimit(&dev_data->rs))
+ dev_err(&pdev->dev, "Event logged [IO_PAGE_FAULT dom


cheers
luigi