Re: [PATCH] mm/memory-failure: fix crash in split_huge_page_to_list from soft_offline_page

From: Matthew Wilcox
Date: Mon Jan 22 2024 - 09:37:11 EST


On Mon, Jan 22, 2024 at 08:57:06PM +0800, Miaohe Lin wrote:
> On 2024/1/21 10:00, Matthew Wilcox wrote:
> > On Sat, Jan 20, 2024 at 02:57:29PM +0800, Miaohe Lin wrote:
> >> {
> >> - /* Soft offline could migrate non-LRU movable pages */
> >> - if ((flags & MF_SOFT_OFFLINE) && __PageMovable(page))
> >> + /*
> >> + * Soft offline could migrate non-LRU movable pages.
> >> + * Note that page->mapping is overloaded with slab->slab_list or slabs
> >> + * fields which might make slab pages appear like non-LRU movable pages.
> >> + * So __PageMovable() has to be done after PageSlab() is checked.
> >> + */
> >> + if ((flags & MF_SOFT_OFFLINE) && !PageSlab(page) && __PageMovable(page))
> >> return true;
> >>
> >> return PageLRU(page) || is_free_buddy_page(page);
> >
> > I think would make more sense as
> >
> > + if (PageSlab(page))
> > + return false;
>
> Do you mean add PageSlab check above "if ((flags & MF_SOFT_OFFLINE) && __PageMovable(page))" block
> so we don't need to add more comment?

Yes, although not just that we don't need to add a comment.
Fundamentally, if you see PageSlab, you don't need to test anything
else, you know it's not migratable.

> I have a concern that __PageMovable() seems unreliable now if we access page from random context.
> This might introduce some potential problems. For example, offline_pages() might be stumped with
> such pages without any progress until signal occurs IIUC:
>
> offline_pages
> ..
> do {
> scan_movable_pages
> if (__PageMovable(page)) -- It might be slab page here. ret will also be set to 0.
> goto found;
> do_migrate_range -- Failed to isolate slab page and retry.
> } while (!ret) -- retry since ret is 0.
>
> There might be many similar scenes, but I haven't taken them more closely. Maybe these are
> just dumb problems...

Yep, lots of places are insufficiently careful about testing
PageMovable. This will get fixed with memdescs, but we're a fair way
from having memdescs ...