Re: [RESEND PATCH] mm: page_alloc: validate buddy before check the migratetype

From: Guo Ren
Date: Mon Jun 13 2022 - 15:02:45 EST


On Mon, Jun 13, 2022 at 11:23 PM Zi Yan <ziy@xxxxxxxxxx> wrote:
>
> Hi Xianting,
>
> Thanks for your patch.
>
> On 13 Jun 2022, at 9:10, Xianting Tian wrote:
>
> > Commit 787af64d05cd ("mm: page_alloc: validate buddy before check its migratetype.")
> > added buddy check code. But unfortunately, this fix isn't backported to
> > linux-5.17.y and the former stable branches. The reason is it added wrong
> > fixes message:
> > Fixes: 1dd214b8f21c ("mm: page_alloc: avoid merging non-fallbackable
> > pageblocks with others")
>
> No, the Fixes tag is right. The commit above does need to validate buddy.
I think Xianting is right. The “Fixes:" tag is not accurate and the
page_is_buddy() is necessary here.

This patch could be applied to the early version of the stable tree
(eg: Linux-5.10.y, not the master tree)

>
> > Actually, this issue is involved by commit:
> > commit d9dddbf55667 ("mm/page_alloc: prevent merging between isolated and other pageblocks")
> >
> > For RISC-V arch, the first 2M is reserved for sbi, so the start PFN is 512,
> > but it got buddy PFN 0 for PFN 0x2000:
> > 0 = 0x2000 ^ (1 << 12)
> > With the illegal buddy PFN 0, it got an illegal buddy page, which caused
> > crash in __get_pfnblock_flags_mask().
>
> It seems that the RISC-V arch reveals a similar bug from d9dddbf55667.
> Basically, this bug will only happen when PFN=0x2000 is merging up and
> there are some isolated pageblocks.
Not PFN=0x2000, it's PFN=0x1000, I guess.

RISC-V's first 2MB RAM could reserve for opensbi, so it would have
riscv_pfn_base=512 and mem_map began with 512th PFN when
CONFIG_FLATMEM=y.
(Also, csky has the same issue: a non-zero pfn_base in some scenarios.)

But __find_buddy_pfn algorithm thinks the start address is 0, it could
get 0 pfn or less than the pfn_base value. We need another check to
prevent that.

>
> BTW, what does first reserved 2MB imply? All 4KB pages from first 2MB are
> set to PageReserved?
>
> >
> > With the patch, it can avoid the calling of get_pageblock_migratetype() if
> > it isn't buddy page.
>
> You might miss the __find_buddy_pfn() caller in unset_migratetype_isolate()
> from mm/page_isolation.c, if you are talking about linux-5.17.y and former
> version. There, page_is_buddy() is also not called and is_migrate_isolate_page()
> is called, which calls get_pageblock_migratetype() too.
>
> >
> > Fixes: d9dddbf55667 ("mm/page_alloc: prevent merging between isolated and other pageblocks")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Reported-by: zjb194813@xxxxxxxxxxxxxxx
> > Reported-by: tianhu.hh@xxxxxxxxxxxxxxx
> > Signed-off-by: Xianting Tian <xianting.tian@xxxxxxxxxxxxxxxxx>
> > ---
> > mm/page_alloc.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index b1caa1c6c887..5b423caa68fd 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1129,6 +1129,9 @@ static inline void __free_one_page(struct page *page,
> >
> > buddy_pfn = __find_buddy_pfn(pfn, order);
> > buddy = page + (buddy_pfn - pfn);
> > +
> > + if (!page_is_buddy(page, buddy, order))
> > + goto done_merging;
> > buddy_mt = get_pageblock_migratetype(buddy);
> >
> > if (migratetype != buddy_mt
> > --
> > 2.17.1
>
> --
> Best Regards,
> Yan, Zi



--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/