Re: [PATCH v1] drm/ttm: Refcount allocated tail pages

From: Daniel Vetter
Date: Tue Sep 06 2022 - 16:07:28 EST


On Mon, Aug 15, 2022 at 12:05:19PM +0200, Christian König wrote:
> Am 15.08.22 um 11:54 schrieb Dmitry Osipenko:
> > Tail pages of higher-order allocations made with alloc_pages() aren't
> > refcounted, but they need to be, otherwise it's impossible for KVM to
> > map them. This patch sets the refcount of the tail pages and fixes the
> > KVM memory mapping faults.
> >
> > Without this change the guest's virgl driver can't map host buffers into
> > the guest and can't provide OpenGL 4.5 profile support to the guest. The
> > host mappings are also needed to enable the Venus driver on host GPU
> > drivers that use TTM.
> >
> > Based on a patch proposed by Trigger Huang.
>
> Well I can't count how often I have repeated this: This is an absolutely
> clear NAK!
>
> > TTM pages are not reference counted in the first place, and because of
> > this giving them to virgl is illegal.
>
> Please immediately stop this completely broken approach. We have discussed
> this multiple times now.

Yeah, we need to get this stuff closed for real by tagging all these
mappings with VM_IO or VM_PFNMAP asap.
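Concretely, that tagging might look something like the following in a
driver's mmap handler (an illustrative sketch only -- the handler name is
invented and the exact flag set a real driver uses may differ):

```
/* Illustrative sketch, not a real TTM function: mark the VMA so the
 * core mm treats it as having no refcountable struct pages behind it. */
static int example_gem_mmap(struct file *filp, struct vm_area_struct *vma)
{
	/* VM_IO / VM_PFNMAP make get_user_pages() refuse this VMA. */
	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
	return 0;
}
```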

It seems to be a recurring source of fun that people try to mmap dma-bufs
and then call get_user_pages() on them.

Which just doesn't work. I guess this is also why Rob Clark sent out that
dma-buf patch to expose mapping information (i.e. wc vs wb vs uncached).
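The "doesn't work" part is by design: the core GUP path rejects such VMAs
outright. Simplified from memory, the relevant check in mm/gup.c amounts to
something like:

```
/* Simplified: get_user_pages() bails out on VMAs carrying these flags,
 * because there may be no (refcountable) struct pages behind the PTEs. */
if (vma->vm_flags & (VM_IO | VM_PFNMAP))
	return -EFAULT;
```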

There seem to be some serious bonghits going on :-/
-Daniel

>
> Regards,
> Christian.
>
> >
> > Cc: stable@xxxxxxxxxxxxxxx
> > Cc: Trigger Huang <Trigger.Huang@xxxxxxxxx>
> > Link: https://www.collabora.com/news-and-blog/blog/2021/11/26/venus-on-qemu-enabling-new-virtual-vulkan-driver/#qcom1343
> > Tested-by: Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx> # AMDGPU (Qemu and crosvm)
> > Signed-off-by: Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx>
> > ---
> > drivers/gpu/drm/ttm/ttm_pool.c | 25 ++++++++++++++++++++++++-
> > 1 file changed, 24 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> > index 21b61631f73a..11e92bb149c9 100644
> > --- a/drivers/gpu/drm/ttm/ttm_pool.c
> > +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> > @@ -81,6 +81,7 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> >  	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
> >  	struct ttm_pool_dma *dma;
> >  	struct page *p;
> > +	unsigned int i;
> >  	void *vaddr;
> > 
> >  	/* Don't set the __GFP_COMP flag for higher order allocations.
> > @@ -93,8 +94,10 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> > 
> >  	if (!pool->use_dma_alloc) {
> >  		p = alloc_pages(gfp_flags, order);
> > -		if (p)
> > +		if (p) {
> >  			p->private = order;
> > +			goto ref_tail_pages;
> > +		}
> >  		return p;
> >  	}
> > 
> > @@ -120,6 +123,23 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> >  	dma->vaddr = (unsigned long)vaddr | order;
> >  	p->private = (unsigned long)dma;
> > +
> > +ref_tail_pages:
> > +	/*
> > +	 * KVM requires mapped tail pages to be refcounted because put_page()
> > +	 * is invoked on them at the end of the page fault handling, and thus
> > +	 * tail pages need to be protected from premature release.
> > +	 * In fact, the KVM page fault handler refuses to map tail pages into
> > +	 * a guest if they aren't refcounted, because hva_to_pfn_remapped()
> > +	 * checks the refcount specifically for this case.
> > +	 *
> > +	 * In particular, unreferenced tail pages result in a KVM "Bad address"
> > +	 * failure for VMMs that use VirtIO-GPU when the guest's Mesa VirGL
> > +	 * driver accesses a mapped host TTM buffer that contains tail pages.
> > +	 */
> > +	for (i = 1; i < 1 << order; i++)
> > +		page_ref_inc(p + i);
> > +
> >  	return p;
> > 
> >  error_free:
> > @@ -133,6 +153,7 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
> >  {
> >  	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
> >  	struct ttm_pool_dma *dma;
> > +	unsigned int i;
> >  	void *vaddr;
> > 
> >  #ifdef CONFIG_X86
> > @@ -142,6 +163,8 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
> >  	if (caching != ttm_cached && !PageHighMem(p))
> >  		set_pages_wb(p, 1 << order);
> >  #endif
> > +	for (i = 1; i < 1 << order; i++)
> > +		page_ref_dec(p + i);
> > 
> >  	if (!pool || !pool->use_dma_alloc) {
> >  		__free_pages(p, order);
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch