Re: Performance regressions in "boot_time" tests in Linux 5.8 Kernel

From: Rahul Gopakumar
Date: Tue Nov 24 2020 - 10:08:52 EST


Hi Baoquan,

We applied the new patch to 5.10-rc3 and tested it. We are still
observing the same page corruption issue that we saw with the
old patch. It causes a 3-second delay in boot time.

Attached are the dmesg logs from the kernel with the new patch and
from the vanilla 5.10-rc3 kernel.

There are multiple lines like the one below in the dmesg log from
the kernel with the new patch.

"BUG: Bad page state in process swapper pfn:ab08001"

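To tally these reports in the attached logs, something like the following could be used. It is only a rough sketch: parse_bad_page_pfn() is a hypothetical helper name, and it assumes each report carries the "pfn:" field in hexadecimal, as in the line quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: if `line` is a "Bad page state" report, store the
 * reported pfn in *pfn and return true. Otherwise leave *pfn untouched
 * and return false.
 */
static bool parse_bad_page_pfn(const char *line, unsigned long *pfn)
{
	const char *report, *field;
	char *end;
	unsigned long value;

	assert(line != NULL && pfn != NULL);

	/* Only consider lines that contain the bad-page banner. */
	report = strstr(line, "BUG: Bad page state");
	if (!report)
		return false;

	/* The pfn follows the "pfn:" tag and is printed in hex. */
	field = strstr(report, "pfn:");
	if (!field)
		return false;
	field += strlen("pfn:");

	value = strtoul(field, &end, 16);
	if (end == field)
		return false;	/* no hex digits after "pfn:" */

	*pfn = value;
	return true;
}
```

Calling this on every line of new-patch-dmesg.log and counting the true results gives the number of reports, and the collected pfns show whether they fall in one contiguous range.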
________________________________________
From: bhe@xxxxxxxxxx <bhe@xxxxxxxxxx>
Sent: 22 November 2020 6:38 AM
To: Rahul Gopakumar
Cc: linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx; natechancellor@xxxxxxxxx; ndesaulniers@xxxxxxxxxx; clang-built-linux@xxxxxxxxxxxxxxxx; rostedt@xxxxxxxxxxx; Rajender M; Yiu Cho Lau; Peter Jonasson; Venkatesh Rajaram
Subject: Re: Performance regressions in "boot_time" tests in Linux 5.8 Kernel

On 11/20/20 at 03:11am, Rahul Gopakumar wrote:
> Hi Baoquan,
>
> To which commit should we apply the draft patch? We tried applying
> the patch to the commit 3e4fb4346c781068610d03c12b16c0cfb0fd24a3
> (the one we used for applying the previous patch) but it fails.

I tested on 5.10-rc3+. You can append the change below to the old patch in
your testing kernel.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fa6076e1a840..5e5b74e88d69 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -448,6 +448,8 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
 	if (end_pfn < pgdat_end_pfn(NODE_DATA(nid)))
 		return false;

+	if (NODE_DATA(nid)->first_deferred_pfn != ULONG_MAX)
+		return true;
 	/*
 	 * We start only with one section of pages, more pages are added as
 	 * needed until the rest of deferred pages are initialized.

Attachment: new-patch-dmesg.log
Description: new-patch-dmesg.log

Attachment: vanilla-dmesg.log
Description: vanilla-dmesg.log