Re: Xen-unstable Linux-6.1.0-rc5 BUG: unable to handle page fault for address: ffff8880083374d0

From: Juergen Gross
Date: Mon Nov 21 2022 - 02:10:18 EST


On 19.11.22 09:28, Sander Eikelenboom wrote:
> Hi Yu / Juergen,
>
> This night I got a dom0 kernel crash on my new Ryzen box running Xen-unstable and a Linux-6.1.0-rc5 kernel.
> I did enable the new and shiny MGLRU, could this be related?

It might be related, but I think it could happen independently from it.

> Nov 19 06:30:11 serveerstertje kernel: [68959.647371] BUG: unable to handle page fault for address: ffff8880083374d0
> Nov 19 06:30:11 serveerstertje kernel: [68959.663555] #PF: supervisor write access in kernel mode
> Nov 19 06:30:11 serveerstertje kernel: [68959.677542] #PF: error_code(0x0003) - permissions violation
> Nov 19 06:30:11 serveerstertje kernel: [68959.691181] PGD 3026067 P4D 3026067 PUD 3027067 PMD 7fee5067 PTE 8010000008337065
> Nov 19 06:30:11 serveerstertje kernel: [68959.705084] Oops: 0003 [#1] PREEMPT SMP NOPTI
> Nov 19 06:30:11 serveerstertje kernel: [68959.718710] CPU: 7 PID: 158 Comm: kswapd0 Not tainted 6.1.0-rc5-20221118-doflr-mac80211debug+ #1
> Nov 19 06:30:11 serveerstertje kernel: [68959.732457] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Pro4 R2.0, BIOS P5.60 10/20/2022
> Nov 19 06:30:11 serveerstertje kernel: [68959.746391] RIP: e030:pmdp_test_and_clear_young+0x25/0x40

The kernel tried to clear the "accessed" bit in the pmd entry.
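
For reference, the numbers in the oops can be decoded by hand. The sketch below uses only the architectural x86 bit positions; the helper names are made up for illustration and are not kernel functions. Error code 0x0003 means a supervisor write to a present page, and the PTE value 8010000008337065 has the present bit set but the RW bit clear, i.e. the page holding the pmd entry is mapped read-only, which is what I would expect for a page table page of a PV guest.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural). */
#define PF_PROT  (1u << 0)  /* 1: protection violation, 0: page not present */
#define PF_WRITE (1u << 1)  /* 1: write access, 0: read access */
#define PF_USER  (1u << 2)  /* 1: user-mode access, 0: supervisor access */

/* x86-64 page table entry flag bits (architectural). */
#define PTE_PRESENT  (1ull << 0)
#define PTE_RW       (1ull << 1)

/*
 * Hypothetical helper: true if the error code describes a supervisor
 * write that hit a present page (a permissions violation).
 */
static bool is_supervisor_write_protection_fault(uint32_t error_code)
{
	return (error_code & PF_PROT) && (error_code & PF_WRITE) &&
	       !(error_code & PF_USER);
}

/* Hypothetical helper: true if the entry maps a present, read-only page. */
static bool maps_read_only_page(uint64_t entry)
{
	return (entry & PTE_PRESENT) && !(entry & PTE_RW);
}
```

Feeding in the values from the log, is_supervisor_write_protection_fault(0x0003) and maps_read_only_page(0x8010000008337065) are both true, while the PMD value 7fee5067 has RW set.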

It does so only since commit eed9a328aa1ae. Before that
pmdp_test_and_clear_young() could be called only for huge pages, which are
disabled in Xen PV guests.

pmdp_test_and_clear_young() does a test_and_clear_bit() on the pmd entry, which
fails because the hypervisor emulates direct writes to pte entries only (pmd
and pud entries can be set via hypercalls only).
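
To make the intended behavior concrete, here is a small user-space model of the patched logic. It is only a sketch: the function name and the xen_pv parameter are invented for illustration (xen_pv stands in for cpu_feature_enabled(X86_FEATURE_XENPV)), and C11 atomics stand in for the kernel's test_and_clear_bit().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit position as _PAGE_BIT_ACCESSED on x86. */
#define MODEL_PAGE_BIT_ACCESSED 5

/*
 * User-space model, not kernel code. Returns 1 if the entry was young,
 * 0 otherwise. When xen_pv is true the entry is never written.
 */
static int model_pmdp_test_and_clear_young(_Atomic uint64_t *pmdp, bool xen_pv)
{
	const uint64_t young = 1ull << MODEL_PAGE_BIT_ACCESSED;

	if (!(atomic_load(pmdp) & young))
		return 0;

	/* PV guest: report "young" but leave the read-only entry alone. */
	if (xen_pv)
		return 1;

	/* Atomically clear the bit and report its previous state. */
	return (atomic_fetch_and(pmdp, ~young) & young) != 0;
}
```

Note that in this model the accessed bit is never cleared in the PV case, so such an entry keeps being reported as young on every call.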

Could you please test whether the attached patch fixes the issue for you?


Juergen

From e89ea813cc09ca7c31af81a87b4856cd3eba3ab9 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@xxxxxxxx>
Date: Mon, 21 Nov 2022 07:41:14 +0100
Subject: [PATCH] x86/mm: fix pmdp_test_and_clear_young() for Xen PV guests

When running as a Xen PV guest, commit eed9a328aa1a ("mm: x86: add
CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") can cause a protection violation
in pmdp_test_and_clear_young():

BUG: unable to handle page fault for address: ffff8880083374d0
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
PGD 3026067 P4D 3026067 PUD 3027067 PMD 7fee5067 PTE 8010000008337065
Oops: 0003 [#1] PREEMPT SMP NOPTI
CPU: 7 PID: 158 Comm: kswapd0 Not tainted 6.1.0-rc5-20221118-doflr+ #1
RIP: e030:pmdp_test_and_clear_young+0x25/0x40

This happens because the Xen hypervisor can't emulate direct writes to
page table entries other than PTEs.

In order to fix that, clear the PMD accessed bit only when not
running as a Xen PV guest. Note that PUD entries are no issue, as those
won't be written directly by the kernel when running as a Xen PV guest
due to transparent huge pages being disabled in that case.

Fixes: eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG")
Reported-by: Sander Eikelenboom <linux@xxxxxxxxxxxxxx>
Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
---
 arch/x86/mm/pgtable.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 8525f2876fb4..076a99e77e28 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -556,9 +556,14 @@ int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 {
 	int ret = 0;

-	if (pmd_young(*pmdp))
-		ret = test_and_clear_bit(_PAGE_BIT_ACCESSED,
-					 (unsigned long *)pmdp);
+	if (pmd_young(*pmdp)) {
+		if (cpu_feature_enabled(X86_FEATURE_XENPV)) {
+			ret = 1;
+		} else {
+			ret = test_and_clear_bit(_PAGE_BIT_ACCESSED,
+						 (unsigned long *)pmdp);
+		}
+	}

 	return ret;
 }
--
2.35.3
