Re: amusing SLUB compaction bug when CC_OPTIMIZE_FOR_SIZE

From: Hyeonggon Yoo
Date: Tue Oct 04 2022 - 10:26:46 EST


On Mon, Oct 03, 2022 at 06:00:35PM +0100, Matthew Wilcox wrote:
> On Sun, Oct 02, 2022 at 02:48:02PM +0900, Hyeonggon Yoo wrote:
> > Just one more thing, rcu_leak_callback too. RCU seem to use it
> > internally to catch double call_rcu().
> >
> > And some suggestions:
> > - what about adding runtime WARN() on slab init code to catch
> > unexpected arch/toolchain issues?
> > - instead of 4, we may use macro definition? like (PAGE_MAPPING_FLAGS + 1)?
>
> I think the real problem here is that isolate_movable_page() is
> insufficiently paranoid. Looking at the gyrations that GUP and the
> page cache do to convince themselves that the page they got really is
> the page they wanted, there are a few missing pieces (eg checking that
> you actually got a refcount on _this_ page and not some random other
> page you were temporarily part of a compound page with).
>
> This patch does three things:
>
> - Turns one of the comments into English. There are some others
> which I'm still scratching my head over.
> - Uses a folio to help distinguish which operations are being done
> to the head vs the specific page (this is somewhat an abuse of the
> folio concept, but it's acceptable)
> - Add the aforementioned check that we're actually operating on the
> page that we think we want to be.
> - Add a check that the folio isn't secretly a slab.
>
> We could put the slab check in PageMapping and call it after taking
> the folio lock, but that seems pointless

I partially agree with this patch. I actually like it.

> It's the acquisition of
> the refcount which stabilises the slab flag, not holding the lock.

But can you please elaborate how this prevents race between
allocation & initialization of a slab and isolate_movable_page()?

Or maybe we can handle it with frozen folio as Vlastimil suggested? ;-)

> diff --git a/mm/migrate.c b/mm/migrate.c
> index 6a1597c92261..a65598308c83 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -59,6 +59,7 @@
>
> int isolate_movable_page(struct page *page, isolate_mode_t mode)
> {
> + struct folio *folio = page_folio(page);
> const struct movable_operations *mops;
>
> /*
> @@ -70,16 +71,23 @@ int isolate_movable_page(struct page *page, isolate_mode_t mode)
> * the put_page() at the end of this block will take care of
> * release this page, thus avoiding a nasty leakage.
> */
> - if (unlikely(!get_page_unless_zero(page)))
> + if (unlikely(!folio_try_get(folio)))
> goto out;
>
> + /* Recheck the page is still part of the folio we just got */
> + if (unlikely(page_folio(page) != folio))
> + goto out_put;
> +
> /*
> - * Check PageMovable before holding a PG_lock because page's owner
> - * assumes anybody doesn't touch PG_lock of newly allocated page
> - * so unconditionally grabbing the lock ruins page's owner side.
> + * Check movable flag before taking the folio lock because
> + * we use non-atomic bitops on newly allocated page flags so
> + * unconditionally grabbing the lock ruins page's owner side.
> */
> - if (unlikely(!__PageMovable(page)))
> - goto out_putpage;
> + if (unlikely(!__folio_test_movable(folio)))
> + goto out_put;
> + if (unlikely(folio_test_slab(folio)))
> + goto out_put;
> +
> /*
> * As movable pages are not isolated from LRU lists, concurrent
> * compaction threads can race against page migration functions
> @@ -91,8 +99,8 @@ int isolate_movable_page(struct page *page, isolate_mode_t mode)
> * lets be sure we have the page lock
> * before proceeding with the movable page isolation steps.
> */
> - if (unlikely(!trylock_page(page)))
> - goto out_putpage;
> + if (unlikely(!folio_trylock(folio)))
> + goto out_put;

I don't know much about callers that this is trying to avoid race aginst...

But for this to make sense, I think *every users* that doing their stuff with
sub-page of a compound page should acquire folio lock and not page lock
of sub-page, right?


> if (!PageMovable(page) || PageIsolated(page))
> goto out_no_isolated;
> @@ -106,14 +114,14 @@ int isolate_movable_page(struct page *page, isolate_mode_t mode)
> /* Driver shouldn't use PG_isolated bit of page->flags */
> WARN_ON_ONCE(PageIsolated(page));
> SetPageIsolated(page);
> - unlock_page(page);
> + folio_unlock(folio);
>
> return 0;
>
> out_no_isolated:
> - unlock_page(page);
> -out_putpage:
> - put_page(page);
> + folio_unlock(folio);
> +out_put:
> + folio_put(folio);
> out:
> return -EBUSY;
> }

--
Thanks,
Hyeonggon