Re: [PATCH] mm, kmsan: fix infinite recursion due to RCU critical section

From: Marco Elver
Date: Thu Jan 18 2024 - 06:08:25 EST


On Thu, 18 Jan 2024 at 12:00, Marco Elver <elver@xxxxxxxxxx> wrote:
>
> Alexander Potapenko writes in [1]: "For every memory access in the code
> instrumented by KMSAN we call kmsan_get_metadata() to obtain the
> metadata for the memory being accessed. For virtual memory the metadata
> pointers are stored in the corresponding `struct page`, therefore we
> need to call virt_to_page() to get them.
>
> According to the comment in arch/x86/include/asm/page.h,
> virt_to_page(kaddr) returns a valid pointer iff virt_addr_valid(kaddr)
> is true, so KMSAN needs to call virt_addr_valid() as well.
>
> To avoid recursion, kmsan_get_metadata() must not call instrumented
> code, therefore ./arch/x86/include/asm/kmsan.h forks parts of
> arch/x86/mm/physaddr.c to check whether a virtual address is valid or
> not.
>
> But the introduction of rcu_read_lock() to pfn_valid() added
> instrumented RCU API calls to virt_to_page_or_null(), which is called by
> kmsan_get_metadata(), so there is an infinite recursion now. I do not
> think it is correct to stop that recursion by doing
> kmsan_enter_runtime()/kmsan_exit_runtime() in kmsan_get_metadata(): that
> would prevent instrumented functions called from within the runtime from
> tracking the shadow values, which might introduce false positives."
>
> Fix the issue by switching pfn_valid() to the _sched() variant of
> rcu_read_lock/unlock(), which does not require calling into RCU. Given
> the critical section in pfn_valid() is very small, this is a reasonable
> trade-off (with preemptible RCU).
>
> KMSAN further needs to be careful to suppress calls into the scheduler,
> which would be another source of recursion. This can be done by wrapping
> the call to pfn_valid() into preempt_disable/enable_no_resched(). The
> downside is that this sacrifices breaking scheduling guarantees;
> however, a kernel compiled with KMSAN has already given up any
> performance guarantees due to being heavily instrumented.
>
> Note, KMSAN code already disables tracing via Makefile, and since
> mmzone.h is included, it is not necessary to use the notrace variant,
> which is generally preferred in all other cases.
>
> Link: https://lkml.kernel.org/r/20240115184430.2710652-1-glider@xxxxxxxxxx [1]
> Reported-by: Alexander Potapenko <glider@xxxxxxxxxx>
> Reported-by: syzbot+93a9e8a3dea8d6085e12@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
> Cc: Charan Teja Kalla <quic_charante@xxxxxxxxxxx>

This might want a:

Fixes: 5ec8e8ea8b77 ("mm/sparsemem: fix race in accessing
memory_section->usage")

For reference which patch introduced the problem.

> ---
> arch/x86/include/asm/kmsan.h | 17 ++++++++++++++++-
> include/linux/mmzone.h | 6 +++---
> 2 files changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/kmsan.h b/arch/x86/include/asm/kmsan.h
> index 8fa6ac0e2d76..d91b37f5b4bb 100644
> --- a/arch/x86/include/asm/kmsan.h
> +++ b/arch/x86/include/asm/kmsan.h
> @@ -64,6 +64,7 @@ static inline bool kmsan_virt_addr_valid(void *addr)
> {
> unsigned long x = (unsigned long)addr;
> unsigned long y = x - __START_KERNEL_map;
> + bool ret;
>
> /* use the carry flag to determine if x was < __START_KERNEL_map */
> if (unlikely(x > y)) {
> @@ -79,7 +80,21 @@ static inline bool kmsan_virt_addr_valid(void *addr)
> return false;
> }
>
> - return pfn_valid(x >> PAGE_SHIFT);
> + /*
> + * pfn_valid() relies on RCU, and may call into the scheduler on exiting
> + * the critical section. However, this would result in recursion with
> + * KMSAN. Therefore, disable preemption here, and re-enable preemption
> + * below while suppressing reschedules to avoid recursion.
> + *
> + * Note, this sacrifices occasionally breaking scheduling guarantees.
> + * Although, a kernel compiled with KMSAN has already given up on any
> + * performance guarantees due to being heavily instrumented.
> + */
> + preempt_disable();
> + ret = pfn_valid(x >> PAGE_SHIFT);
> + preempt_enable_no_resched();
> +
> + return ret;
> }
>
> #endif /* !MODULE */
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 4ed33b127821..a497f189d988 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -2013,9 +2013,9 @@ static inline int pfn_valid(unsigned long pfn)
> if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
> return 0;
> ms = __pfn_to_section(pfn);
> - rcu_read_lock();
> + rcu_read_lock_sched();
> if (!valid_section(ms)) {
> - rcu_read_unlock();
> + rcu_read_unlock_sched();
> return 0;
> }
> /*
> @@ -2023,7 +2023,7 @@ static inline int pfn_valid(unsigned long pfn)
> * the entire section-sized span.
> */
> ret = early_section(ms) || pfn_section_valid(ms, pfn);
> - rcu_read_unlock();
> + rcu_read_unlock_sched();
>
> return ret;
> }
> --
> 2.43.0.381.gb435a96ce8-goog
>