Re: [PATCH v2] dma-direct: improve DMA_ATTR_NO_KERNEL_MAPPING

From: Walter Wu
Date: Thu Nov 04 2021 - 09:40:56 EST


On Thu, 2021-11-04 at 13:47 +0100, Ard Biesheuvel wrote:
> On Thu, 4 Nov 2021 at 13:31, Walter Wu <walter-zh.wu@xxxxxxxxxxxx>
> wrote:
> >
> > On Thu, 2021-11-04 at 09:57 +0100, Ard Biesheuvel wrote:
> > > On Thu, 4 Nov 2021 at 09:53, Christoph Hellwig <hch@xxxxxx>
> > > wrote:
> > > >
> > > > On Thu, Nov 04, 2021 at 10:32:21AM +0800, Walter Wu wrote:
> > > > > > diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
> > > > > index f36be5166c19..6c7d1683339c 100644
> > > > > --- a/include/linux/set_memory.h
> > > > > +++ b/include/linux/set_memory.h
> > > > > @@ -7,11 +7,16 @@
> > > > >
> > > > > #ifdef CONFIG_ARCH_HAS_SET_MEMORY
> > > > > #include <asm/set_memory.h>
> > > > > +
> > > > > +#ifndef CONFIG_RODATA_FULL_DEFAULT_ENABLED
> > > >
> > > > This is an arm64-specific symbol, and one that only controls a
> > > > default. I don't think it is suitable to key off stubs in
> > > > common
> > > > code.
> > > >
> > > > > > +static inline int set_memory_valid(unsigned long addr, int numpages, int enable) { return 0; }
> > > > >
> > > > > Please avoid overly long lines.
> > > >
> > > > > > + if (IS_ENABLED(CONFIG_RODATA_FULL_DEFAULT_ENABLED)) {
> > > > > > + kaddr = (unsigned long)phys_to_virt(dma_to_phys(dev, *dma_handle));
> > > >
> > > > This can just use page_address.
> > > >
> > > > > > + /* page remove kernel mapping for arm64 */
> > > > > > + set_memory_valid(kaddr, size >> PAGE_SHIFT, 0);
> > > > > + }
> > > >
> > > > But more importantly: set_memory_valid only exists on arm64,
> > > > this
> > > > will break compile everywhere else. And this API is complete
> > > > crap.
> > > > Passing kernel virtual addresses as unsigned long just sucks,
> > > > and
> > > > passing an integer argument for valid/non-valid also is a
> > > > horrible
> > > > API.
> > > >
> > >
> > > ... and as I pointed out before, you can still pass rodata=off on
> > > arm64, and get the old behavior, in which case bad things will
> > > happen
> > > if you try to use an API that expects to operate on page mappings
> > > with
> > > a 1 GB block mapping.
> > >
> >
> > Thanks for your suggestion.
> >
> >
> > > And you still haven't explained what the actual problem is: is
> > > this
> > > about CPU speculation corrupting non-cache coherent inbound DMA?
> >
> > No corruption, the cpu only reads it; we hope to fix that behavior.
> >
>
> Fix which behavior? Please explain
>
> 1) the current behavior
We call dma_direct_alloc() with DMA_ATTR_NO_KERNEL_MAPPING to allocate the
buffer, but the kernel mapping of the buffer still exists. Our goal is that
the CPU must not be able to access this buffer. Unfortunately, we see CPU
speculation reading it. So we need to fix this, and we would like to do so
without using the no-map way.

> 2) why the current behavior is problematic for you
A buffer allocated by dma_direct_alloc() with DMA_ATTR_NO_KERNEL_MAPPING
still has a kernel mapping, so the CPU can still speculatively read the
buffer. Although we have hardware that protects the buffer, we still hope
to fix this in software.

> 3) how this patch changes the current behavior
When dma_direct_alloc() is called with DMA_ATTR_NO_KERNEL_MAPPING, the
patch removes the kernel mapping that belongs to the buffer.
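
To make the intended effect easier to discuss, here is a small userspace
model in C. It is only a sketch, not the kernel patch: the "linear map" is
just an array of per-page flags, and every model_* name and MODEL_* constant
is hypothetical. model_set_memory_valid() only mimics the argument order of
the set_memory_valid() call quoted above. The free path that restores the
mapping is my assumption and is not shown in the quoted hunk.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)
#define MODEL_NR_PAGES   16

/* One flag per page: true means the page is mapped in the modelled linear map. */
static bool model_linear_map_valid[MODEL_NR_PAGES];

/* Start with every page mapped, like a freshly set up linear map. */
static void model_init(void)
{
	for (size_t i = 0; i < MODEL_NR_PAGES; i++)
		model_linear_map_valid[i] = true;
}

/* Hypothetical stand-in for set_memory_valid(addr, numpages, enable). */
static int model_set_memory_valid(unsigned long addr, int numpages, int enable)
{
	unsigned long first = addr >> MODEL_PAGE_SHIFT;

	if (numpages < 0 || first + (unsigned long)numpages > MODEL_NR_PAGES)
		return -1;

	for (int i = 0; i < numpages; i++)
		model_linear_map_valid[first + i] = (enable != 0);

	return 0;
}

/* Allocation path: remove the mapping of the pages that back the buffer. */
static int model_alloc_no_kernel_mapping(unsigned long addr, size_t size)
{
	return model_set_memory_valid(addr, (int)(size >> MODEL_PAGE_SHIFT), 0);
}

/* Free path (assumed): restore the mapping before the pages are reused. */
static int model_free_no_kernel_mapping(unsigned long addr, size_t size)
{
	return model_set_memory_valid(addr, (int)(size >> MODEL_PAGE_SHIFT), 1);
}

/* A CPU access, speculative or not, needs a valid mapping for the page. */
static bool model_cpu_can_access(unsigned long addr)
{
	unsigned long page = addr >> MODEL_PAGE_SHIFT;

	return page < MODEL_NR_PAGES && model_linear_map_valid[page];
}
```

In this model, after model_alloc_no_kernel_mapping() the pages of the buffer
are no longer reachable through the linear map, while neighbouring pages
stay mapped. The model says nothing about block mappings, stage 2 or TLB
maintenance.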

> 4) why the new behavior fixes your problem.
If I understand correctly, to block CPU speculation we need to unmap the
buffer in both the stage 1 and stage 2 page tables and invalidate the TLB.
This patch does the stage 1 unmap at EL1.

>
> There is no penalty for using too many words.

Thanks.
Walter