Re: [PATCH v19 22/30] drm/shmem-helper: Add common memory shrinker

From: Boris Brezillon
Date: Mon Jan 29 2024 - 04:06:54 EST


On Mon, 29 Jan 2024 09:55:11 +0100
Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx> wrote:

> On Mon, 29 Jan 2024 09:16:04 +0300
> Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx> wrote:
>
> > On 1/26/24 21:12, Boris Brezillon wrote:
> > > On Fri, 26 Jan 2024 19:27:49 +0300
> > > Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx> wrote:
> > >
> > >> On 1/26/24 12:55, Boris Brezillon wrote:
> > >>> On Fri, 26 Jan 2024 00:56:47 +0300
> > >>> Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx> wrote:
> > >>>
> > >>>> On 1/25/24 13:19, Boris Brezillon wrote:
> > >>>>> On Fri, 5 Jan 2024 21:46:16 +0300
> > >>>>> Dmitry Osipenko <dmitry.osipenko@xxxxxxxxxxxxx> wrote:
> > >>>>>
> > >>>>>> +static bool drm_gem_shmem_is_evictable(struct drm_gem_shmem_object *shmem)
> > >>>>>> +{
> > >>>>>> + return (shmem->madv >= 0) && shmem->base.funcs->evict &&
> > >>>>>> + refcount_read(&shmem->pages_use_count) &&
> > >>>>>> + !refcount_read(&shmem->pages_pin_count) &&
> > >>>>>> + !shmem->base.dma_buf && !shmem->base.import_attach &&
> > >>>>>> + !shmem->evicted;
> > >>>>>
> > >>>>> Are we missing
> > >>>>>
> > >>>>> && dma_resv_test_signaled(shmem->base.resv,
> > >>>>> DMA_RESV_USAGE_BOOKKEEP)
> > >>>>>
> > >>>>> to make sure the GPU is done using the BO?
> > >>>>> The same applies to drm_gem_shmem_is_purgeable() BTW.
> > >>>>>
> > >>>>> If you don't want to do this test here, we need a way to let drivers
> > >>>>> provide a custom is_{evictable,purgeable}() test.
> > >>>>>
> > >>>>> I guess we should also expose drm_gem_shmem_shrinker_update_lru_locked()
> > >>>>> to let drivers move the GEMs that were used most recently (those
> > >>>>> referenced by a GPU job) at the end of the evictable LRU.
> > >>>>
> > >>>> We have the signaled-check in the common drm_gem_evict() helper:
> > >>>>
> > >>>> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_gem.c#L1496
> > >>>
> > >>> Ah, indeed. I'll need DMA_RESV_USAGE_BOOKKEEP instead of
> > >>> DMA_RESV_USAGE_READ in panthor, but I can add it in the driver specific
> > >>> ->evict() hook (though that means calling dma_resv_test_signaled()
> > >>> twice, which is not great, oh well).
> > >>
> > >> Maybe we should change drm_gem_evict() to use BOOKKEEP. The
> > >> test_signaled(BOOKKEEP) should be a "stronger" check than
> > >> test_signaled(READ)?
> > >
> > > It is, just wondering if some users have a good reason to want
> > > READ here.
> > >
> > >>
> > >>> The problem about the evictable LRU remains though: we need a way to let
> > >>> drivers put their BOs at the end of the list when the BO has been used
> > >>> by the GPU, don't we?
> > >>
> > >> If BO is use, then it won't be evicted, while idling BOs will be
> > >> evicted. Hence, the used BOs will be naturally moved down the LRU list
> > >> each time shrinker is invoked.
> > >>
> > >
> > > That only do the trick if the BOs being used most often are busy when
> > > the shrinker kicks in though. Let's take this scenario:
> > >
> > >
> > > BO 1 BO 2 shinker
> > >
> > > busy
> > > idle (first-pos-in-evictable-LRU)
> > >
> > > busy
> > > idle (second-pos-in-evictable-LRU)
> > >
> > > busy
> > > idle
> > >
> > > busy
> > > idle
> > >
> > > busy
> > > idle
> > >
> > > find a BO to evict
> > > pick BO 2
> > >
> > > busy (swapin)
> > > idle
> > >
> > > If the LRU had been updated at each busy event, BO 1 should have
> > > been picked for eviction. But we evicted the BO that was first
> > > recorded idle instead of the one that was least recently
> > > recorded busy.
> >
> > You have to swapin(BO) every time BO goes to busy state, and swapin does drm_gem_lru_move_tail(BO). Hence, each time BO goes idle->busy, it's moved down the LRU list.
>
> Ah, that's the bit I was missing. It makes sense now. I guess that's
> good enough for now, we can sort out the BOOKKEEP vs READ in a
> follow-up series.

On second look, it seems drm_gem_shmem_shrinker_update_lru_locked()
doesn't call drm_gem_shmem_shrinker_update_lru_locked() if the BO was
already resident? Is there something else I'm overlooking here?

>
> Reviewed-by: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>