Re: [REGRESSION] BUG: KFENCE: memory corruption in drm_gem_put_pages+0x186/0x250

From: Matthew Wilcox
Date: Thu Oct 05 2023 - 09:57:02 EST


On Thu, Oct 05, 2023 at 09:56:03AM +0200, Oleksandr Natalenko wrote:
> Hello.
>
> On čtvrtek 5. října 2023 9:44:42 CEST Thomas Zimmermann wrote:
> > Hi
> >
> > Am 02.10.23 um 17:38 schrieb Oleksandr Natalenko:
> > > On pondělí 2. října 2023 16:32:45 CEST Matthew Wilcox wrote:
> > >> On Mon, Oct 02, 2023 at 01:02:52PM +0200, Oleksandr Natalenko wrote:
> > >>>>>>> BUG: KFENCE: memory corruption in drm_gem_put_pages+0x186/0x250
> > >>>>>>>
> > >>>>>>> Corrupted memory at 0x00000000e173a294 [ ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ] (in kfence-#108):
> > >>>>>>> drm_gem_put_pages+0x186/0x250
> > >>>>>>> drm_gem_shmem_put_pages_locked+0x43/0xc0
> > >>>>>>> drm_gem_shmem_object_vunmap+0x83/0xe0
> > >>>>>>> drm_gem_vunmap_unlocked+0x46/0xb0
> > >>>>>>> drm_fbdev_generic_helper_fb_dirty+0x1dc/0x310
> > >>>>>>> drm_fb_helper_damage_work+0x96/0x170
> > >>>
> > >>> Matthew, before I start dancing around, do you think ^^ could have the same cause as 0b62af28f249b9c4036a05acfb053058dc02e2e2 which got fixed by 863a8eb3f27098b42772f668e3977ff4cae10b04?
> > >>
> > >> Yes, entirely plausible. I think you have two useful points to look at
> > >> before delving into a full bisect -- 863a8e and the parent of 0b62af.
> > >> If either of them work, I think you have no more work to do.
> > >
> > > OK, I've did this against v6.5.5:
> > >
> > > ```
> > > git log --oneline HEAD~3..
> > > 7c1e7695ca9b8 (HEAD -> test) Revert "mm: remove struct pagevec"
> > > 8f2ad53b6eac6 Revert "mm: remove check_move_unevictable_pages()"
> > > fa1e3c0b5453c Revert "drm: convert drm_gem_put_pages() to use a folio_batch"
> > > ```
> > >
> > > then rebooted the host multiple times, and the issue is not seen any more.
> > >
> > > So I guess 3291e09a463870610b8227f32b16b19a587edf33 is the culprit.
> >
> > Ignore my other email. It's apparently been fixed already. Thanks!
>
> Has it? I think I was able to identify offending commit, but I'm not aware of any fix to that.

I don't understand; you said reverting those DRM commits fixed the
problem, so 863a8eb3f270 is the solution. No?