Re: No check of the size passed to unmap_single in swiotlb

From: Konrad Rzeszutek Wilk
Date: Fri Nov 24 2017 - 08:33:51 EST


..snip..
> > There doesn't seem to be much good reason for SWIOTLB to be more special
> > than other DMA API backends, and not all of them have enough internal state to
> > be able to make such a check. It's also not necessarily possible to "prevent
> > damage" anyway - if a driver does pass a bogus size for dma_unmap_single(...,
> > DMA_FROM_DEVICE), SWIOTLB might be able to keep itself internally consistent,
> > but it still can't prevent the arch code in the middle from invalidating the wrong
> > cache lines and potentially corrupting adjacent memory.
> >
> > In short, trying to work around broken drivers is a much worse idea than just
> > fixing those drivers, and that's what we already have dma-debug for.
> >
> > Robin.
>
> Hi Robin,
>
> I agree that hacking kernel to fix broken drivers is not acceptable, actually we spent days to fight driver supplier with this, they do not want to change their code and want fix it directly in kernel.

You could upstream the driver? That way it will be fixed.

I am not sure if you are aware but there is a staging directory
as well in Linux.
>
> I tried Dma-debug yesterday, it works very well, but I think only the size mismatch check may not be enough for the map entry corrupt situation, some run-time warning may be better when the real corruption happen.
>
> For most of the dma-api backend, the size mismatch may do no harm at all, and even in SWIOTLB itself when the bounce buffer is not used, the size mismatch do no harm either. In our case, the same buggy driver works well when board has 2G DDR, but panic frequently in 4G DDR because of the use of bounce buffer and these corrupted map entries. it is hard to catch this kind of bugs, for when the corruption happen, the kernel has all kind of reasons to panic, but not even one may directly point to the real source.
>
> Add the warning messages is a big convenience for figure this kind of issues, at least to me and the AP driver supplier, such warnings may save weeks of hard debug time.

I would prefer that all of those warning messages go in the DMA debug
API - as you can have multiple DMA different backends.

In other words, instead of being in one implementation it would make
much more sense to be in the generic API layer.