Re: [PATCH] drm/amdgpu: check vm bo eviction valuable at last

From: Qiang Yu
Date: Fri Feb 18 2022 - 03:58:56 EST


On Fri, Feb 18, 2022 at 3:46 PM Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>
> Am 18.02.22 um 04:08 schrieb Qiang Yu:
> > On Thu, Feb 17, 2022 at 8:22 PM Christian König
> > <christian.koenig@xxxxxxx> wrote:
> >> Am 17.02.22 um 11:58 schrieb Qiang Yu:
> >>> On Thu, Feb 17, 2022 at 6:39 PM Christian König
> >>> <christian.koenig@xxxxxxx> wrote:
> >>>>
> >>>> Am 17.02.22 um 11:13 schrieb Qiang Yu:
> >>>>> On Thu, Feb 17, 2022 at 5:46 PM Christian König
> >>>>> <christian.koenig@xxxxxxx> wrote:
> >>>>>> Am 17.02.22 um 10:40 schrieb Qiang Yu:
> >>>>>>> On Thu, Feb 17, 2022 at 5:15 PM Christian König
> >>>>>>> <christian.koenig@xxxxxxx> wrote:
> >>>>>>>> Am 17.02.22 um 10:04 schrieb Qiang Yu:
> >>>>>>>>> Workstation application ANSA/META get this error dmesg:
> >>>>>>>>> [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
> >>>>>>>>>
> >>>>>>>>> This is caused by:
> >>>>>>>>> 1. create a 256MB buffer in invisible VRAM
> >>>>>>>>> 2. CPU map the buffer and access it causes vm_fault and try to move
> >>>>>>>>> it to visible VRAM
> >>>>>>>>> 3. force visible VRAM space and traverse all VRAM bos to check if
> >>>>>>>>> evicting this bo is valuable
> >>>>>>>>> 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable()
> >>>>>>>>> will set amdgpu_vm->evicting, but latter due to not in visible
> >>>>>>>>> VRAM, won't really evict it so not add it to amdgpu_vm->evicted
> >>>>>>>>> 5. before next CS to clear the amdgpu_vm->evicting, user VM ops
> >>>>>>>>> ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted)
> >>>>>>>>> but fail in amdgpu_vm_bo_update_mapping() (check
> >>>>>>>>> amdgpu_vm->evicting) and get this error log
> >>>>>>>>>
> >>>>>>>>> This error won't affect functionality as next CS will finish the
> >>>>>>>>> waiting VM ops. But we'd better make the amdgpu_vm->evicting
> >>>>>>>>> correctly reflact the vm status and clear the error log.
> >>>>>>>> Well NAK, that is intentional behavior.
> >>>>>>>>
> >>>>>>>> The VM page tables where considered for eviction, so setting the flag is
> >>>>>>>> correct even when the page tables later on are not actually evicted.
> >>>>>>>>
> >>>>>>> But this will unnecessarily stop latter user VM ops in ioctl before CS
> >>>>>>> even when the VM bos are not evicted.
> >>>>>>> Won't this have any negative effect when could do better?
> >>>>>> No, this will have a positive effect. See the VM was already considered
> >>>>>> for eviction because it is idle.
> >>>>>>
> >>>>>> Updating it immediately doesn't necessarily make sense, we should wait
> >>>>>> with that until its next usage.
> >>>>>>
> >>>>>> Additional to that this patch doesn't really fix the problem, it just
> >>>>>> mitigates it.
> >>>>>>
> >>>>>> Eviction can fail later on for a couple of reasons and we absolutely
> >>>>>> need to check the flag instead of the list in amdgpu_vm_ready().
> >>>>> The flag only for both flag and list? Looks like should be both as
> >>>>> the list indicate some vm page table need to be updated and could
> >>>>> delay the user update with the same logic as you described above.
> >>>> I think checking the flag should be enough. The issue is that the list
> >>>> was there initially, but to avoid race conditions we added the flag with
> >>>> separate lock protection later on.
> >>>>
> >>> But list and flag does not align always, there are cases like
> >>> list-empty/flag-set (this problem) and list-non-empty/flag-unset (non-vm bo
> >>> eviction). If only check flag list-non-empty/flag-unset change behavior.
> >> Yeah, but I think that the flag unset list-non-empty case would be
> >> correctly handled if we only test the flag.
> >>
> >> In other words we can update the page tables as long as they are not
> >> partially or fully evicted and that's not the case when non-vm BOs are
> >> evicted.
> >>
> > This sounds like two standard for the same thing, because this problem
> > does not evict page tables too. But I see your point is:
> > There's a difference that this problem's case can make sure vm is idle,
> > and we prefer to delay vm updates when vm is idle.
> >
> > If so, why not just stop user vm update by checking vm busy in
> > amdgpu_gem_va_ioctl() to skip amdgpu_gem_va_update_vm()?
>
> That's exactly what amdgpu_gem_va_update_vm() is doing by calling
> amdgpu_vm_ready(). The problem is that amdgpu_vm_ready() looks at the
> wrong thing.
>
If amdgpu_vm_ready() use evicting flag, it's still not equivalent to check
vm idle: true -> vm idle, false -> vm may be idle or busy.

> > Then we can keep the evicting flag accurate (after solving your
> > concern for this patch that eviction may fail latter by further delay
> > the flag update after eviction success).
>
> That won't work. See we need to mark the VM as evicted before we
> actually evict them because otherwise somebody could use the VM in
> parallel and add another fence to it.
>
I see, make this too accurate should cost too much like holding the
eviction_lock when eviction. But just delay it in
amdgpu_ttm_bo_eviction_valuable()
could avoid most false positive case.

Regards,
Qiang

> Regards,
> Christian.
>
> >
> > Regards,
> > Qiang
> >
> >
> >> Regards,
> >> Christian.
> >>
> >>> Regards,
> >>> Qiang
> >>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> Regards,
> >>>>> Qiang
> >>>>>
> >>>>>> Regards,
> >>>>>> Christian.
> >>>>>>
> >>>>>>> Regards,
> >>>>>>> Qiang
> >>>>>>>
> >>>>>>>> What we should rather do is to fix amdgpu_vm_ready() to take a look at
> >>>>>>>> the flag instead of the linked list.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Christian.
> >>>>>>>>
> >>>>>>>>> Signed-off-by: Qiang Yu <qiang.yu@xxxxxxx>
> >>>>>>>>> ---
> >>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 85 ++++++++++++++-----------
> >>>>>>>>> 1 file changed, 47 insertions(+), 38 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>>>>>> index 5a32ee66d8c8..88a27911054f 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> >>>>>>>>> @@ -1306,45 +1306,11 @@ uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt *ttm,
> >>>>>>>>> return flags;
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> -/*
> >>>>>>>>> - * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
> >>>>>>>>> - * object.
> >>>>>>>>> - *
> >>>>>>>>> - * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
> >>>>>>>>> - * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
> >>>>>>>>> - * it can find space for a new object and by ttm_bo_force_list_clean() which is
> >>>>>>>>> - * used to clean out a memory space.
> >>>>>>>>> - */
> >>>>>>>>> -static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>> - const struct ttm_place *place)
> >>>>>>>>> +static bool amdgpu_ttm_mem_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>> + const struct ttm_place *place)
> >>>>>>>>> {
> >>>>>>>>> unsigned long num_pages = bo->resource->num_pages;
> >>>>>>>>> struct amdgpu_res_cursor cursor;
> >>>>>>>>> - struct dma_resv_list *flist;
> >>>>>>>>> - struct dma_fence *f;
> >>>>>>>>> - int i;
> >>>>>>>>> -
> >>>>>>>>> - /* Swapout? */
> >>>>>>>>> - if (bo->resource->mem_type == TTM_PL_SYSTEM)
> >>>>>>>>> - return true;
> >>>>>>>>> -
> >>>>>>>>> - if (bo->type == ttm_bo_type_kernel &&
> >>>>>>>>> - !amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo)))
> >>>>>>>>> - return false;
> >>>>>>>>> -
> >>>>>>>>> - /* If bo is a KFD BO, check if the bo belongs to the current process.
> >>>>>>>>> - * If true, then return false as any KFD process needs all its BOs to
> >>>>>>>>> - * be resident to run successfully
> >>>>>>>>> - */
> >>>>>>>>> - flist = dma_resv_shared_list(bo->base.resv);
> >>>>>>>>> - if (flist) {
> >>>>>>>>> - for (i = 0; i < flist->shared_count; ++i) {
> >>>>>>>>> - f = rcu_dereference_protected(flist->shared[i],
> >>>>>>>>> - dma_resv_held(bo->base.resv));
> >>>>>>>>> - if (amdkfd_fence_check_mm(f, current->mm))
> >>>>>>>>> - return false;
> >>>>>>>>> - }
> >>>>>>>>> - }
> >>>>>>>>>
> >>>>>>>>> switch (bo->resource->mem_type) {
> >>>>>>>>> case AMDGPU_PL_PREEMPT:
> >>>>>>>>> @@ -1377,10 +1343,53 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>> return false;
> >>>>>>>>>
> >>>>>>>>> default:
> >>>>>>>>> - break;
> >>>>>>>>> + return ttm_bo_eviction_valuable(bo, place);
> >>>>>>>>> }
> >>>>>>>>> +}
> >>>>>>>>>
> >>>>>>>>> - return ttm_bo_eviction_valuable(bo, place);
> >>>>>>>>> +/*
> >>>>>>>>> + * amdgpu_ttm_bo_eviction_valuable - Check to see if we can evict a buffer
> >>>>>>>>> + * object.
> >>>>>>>>> + *
> >>>>>>>>> + * Return true if eviction is sensible. Called by ttm_mem_evict_first() on
> >>>>>>>>> + * behalf of ttm_bo_mem_force_space() which tries to evict buffer objects until
> >>>>>>>>> + * it can find space for a new object and by ttm_bo_force_list_clean() which is
> >>>>>>>>> + * used to clean out a memory space.
> >>>>>>>>> + */
> >>>>>>>>> +static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
> >>>>>>>>> + const struct ttm_place *place)
> >>>>>>>>> +{
> >>>>>>>>> + struct dma_resv_list *flist;
> >>>>>>>>> + struct dma_fence *f;
> >>>>>>>>> + int i;
> >>>>>>>>> +
> >>>>>>>>> + /* Swapout? */
> >>>>>>>>> + if (bo->resource->mem_type == TTM_PL_SYSTEM)
> >>>>>>>>> + return true;
> >>>>>>>>> +
> >>>>>>>>> + /* If bo is a KFD BO, check if the bo belongs to the current process.
> >>>>>>>>> + * If true, then return false as any KFD process needs all its BOs to
> >>>>>>>>> + * be resident to run successfully
> >>>>>>>>> + */
> >>>>>>>>> + flist = dma_resv_shared_list(bo->base.resv);
> >>>>>>>>> + if (flist) {
> >>>>>>>>> + for (i = 0; i < flist->shared_count; ++i) {
> >>>>>>>>> + f = rcu_dereference_protected(flist->shared[i],
> >>>>>>>>> + dma_resv_held(bo->base.resv));
> >>>>>>>>> + if (amdkfd_fence_check_mm(f, current->mm))
> >>>>>>>>> + return false;
> >>>>>>>>> + }
> >>>>>>>>> + }
> >>>>>>>>> +
> >>>>>>>>> + /* Check by different mem type. */
> >>>>>>>>> + if (!amdgpu_ttm_mem_eviction_valuable(bo, place))
> >>>>>>>>> + return false;
> >>>>>>>>> +
> >>>>>>>>> + /* VM bo should be checked at last because it will mark VM evicting. */
> >>>>>>>>> + if (bo->type == ttm_bo_type_kernel)
> >>>>>>>>> + return amdgpu_vm_evictable(ttm_to_amdgpu_bo(bo));
> >>>>>>>>> +
> >>>>>>>>> + return true;
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> static void amdgpu_ttm_vram_mm_access(struct amdgpu_device *adev, loff_t pos,
>