Re: kernel BUG at mm/shmem.c:LINE!

From: Matthew Wilcox
Date: Mon Jul 23 2018 - 10:01:55 EST


On Sun, Jul 22, 2018 at 07:28:01PM -0700, Hugh Dickins wrote:
> Whether or not that fixed syzbot's kernel BUG at mm/shmem.c:815!
> I don't know, but I'm afraid it has not fixed linux-next breakage of
> huge tmpfs: I get a similar page_to_pgoff BUG at mm/filemap.c:1466!
>
> Please try something like
> mount -o remount,huge=always /dev/shm
> cp /dev/zero /dev/shm
>
> Writing soon crashes in find_lock_entry(), looking up offset 0x201
> but getting the page for offset 0x3c1 instead.

Hmm. I don't see a crash while running that command, but I do see an RCU
stall in find_get_entries() called from shmem_undo_range() when running
'cp' the second time -- i.e. while truncating the /dev/shm/zero file.
Maybe I'm seeing the same bug as you, and maybe I'm seeing a different
one. Do we have a shmem test suite somewhere?

> I've spent a while on it, but better turn over to you, Matthew:
> my guess is that xas_create_range() does not create the layout
> you expect from it.

I've dumped the XArray tree on my machine and it actually looks fine
*except* that the pages pointed to are free! That indicates to me I
screwed up somebody's reference count somewhere.