Re: arm64: perf test 26 rpi4 oops

From: Will Deacon
Date: Mon Jul 31 2023 - 07:52:18 EST


On Mon, Jul 31, 2023 at 11:43:40AM +0100, Will Deacon wrote:
> [+Lorenzo, Kefeng and others]
>
> On Sun, Jul 30, 2023 at 06:09:15PM +0200, Mike Galbraith wrote:
> > On Fri, 2023-07-28 at 15:18 +0100, Will Deacon wrote:
> > >
> > > Looking at this quickly with Mark, the most likely explanation is that
> > > a bogus kernel address is being passed as the source pointer to
> > > copy_to_user().
> >
> > 'start' in read_kcore_iter() is bogus a LOT when running perf test 26,
> > and that goes back to at least 5.15. Seems removal of bogon-proofing gave a
> > toothless old bug teeth, but seemingly only to perf test 26. Rummaging
> > around with crash vmlinux /proc/kcore seems to be bogon free anyway.
> >
> > Someone should perhaps take a peek at perf. Bogons aside, it also
> > doesn't seem to care deeply about kernel response. Whether the kernel
> > oops or I bat 945 bogons aside, it says 'OK'. That seems a tad odd.
>
> Aha, so I think I triggered the issue you're seeing under QEMU (log
> below). perf (unhelpfully) doesn't have stable test numbers, so it's
> test 21 in my case. However, it only explodes if I run it as root, since
> /proc/kcore is 0400 on my system.
>
> The easiest way to trigger the problem is simply:
>
> # objdump -d /proc/kcore
>
> Looking at the history, I wonder whether this is because of a combination
> of:
>
> e025ab842ec3 ("mm: remove kern_addr_valid() completely")
>
> which removed the kern_addr_valid() check on the basis that kcore used
> copy_from_kernel_nofault() anyway, and:
>
> 2e1c0170771e ("fs/proc/kcore: avoid bounce buffer for ktext data")
>
> which replaced the copy_from_kernel_nofault() with _copy_to_user().
>
> So with both of those applied, we're missing the address check on arm64.

Digging into this a little more, the fault occurs because kcore is
treating everything from '_text' to '_end' as KCORE_TEXT and expects it
to be mapped linearly. However, there's plenty of stuff we _don't_ map
in that range on arm64 (e.g. .head.text, the pKVM hypervisor, the entry
trampoline) so kcore is broken.

One hack is to limit KCORE_TEXT so that it covers only the kernel text
(see the diff below), but this is a user-visible change in behaviour
for things like .data, which would no longer be part of the range, so I
think it would be better to restore the old behaviour of handling the
faults.
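
To make "handling the faults" concrete, here is a rough user-space model
of a read loop that zero-fills holes instead of faulting on them. It is
only a sketch: the model_* names, the tiny page size and the mapped[]
array are all invented for illustration and are not kernel code or
kernel APIs. model_copy_nofault() merely stands in for a copy that fails
rather than oopsing when the source is not mapped.

```c
/* User-space model only: none of these names exist in the kernel. */
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16	/* tiny "pages" keep the example readable */
#define MODEL_NPAGES    4

struct model_image {
	unsigned char data[MODEL_NPAGES * MODEL_PAGE_SIZE];
	int mapped[MODEL_NPAGES];	/* 0 = hole (nothing mapped there) */
};

/* Stand-in for a nofault copy: report failure instead of faulting. */
static int model_copy_nofault(void *dst, const struct model_image *img,
			      size_t off, size_t len)
{
	size_t pg;

	if (len == 0)
		return 0;
	if (off + len > sizeof(img->data))
		return -1;
	for (pg = off / MODEL_PAGE_SIZE; pg <= (off + len - 1) / MODEL_PAGE_SIZE; pg++)
		if (!img->mapped[pg])
			return -1;
	memcpy(dst, img->data + off, len);
	return 0;
}

/*
 * Read [off, off + len) one page-sized chunk at a time. Chunks that
 * cannot be copied are zero-filled rather than treated as fatal.
 * Returns the number of bytes that were zero-filled.
 * The caller must ensure 'out' has room for 'len' bytes.
 */
static size_t model_read(unsigned char *out, const struct model_image *img,
			 size_t off, size_t len)
{
	size_t zeroed = 0;

	while (len) {
		size_t tsz = MODEL_PAGE_SIZE - (off % MODEL_PAGE_SIZE);

		if (tsz > len)
			tsz = len;
		if (model_copy_nofault(out, img, off, tsz)) {
			memset(out, 0, tsz);
			zeroed += tsz;
		}
		out += tsz;
		off += tsz;
		len -= tsz;
	}
	return zeroed;
}
```

With one of the four model pages marked as a hole, reading the whole
image returns the mapped bytes unchanged and zeroes for the hole, so the
reader sees a gap in the data rather than a crash.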

Lorenzo?

Will

--->8

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 9cb32e1a78a0..3696a209c1ec 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -635,7 +635,7 @@ static struct kcore_list kcore_text;
  */
 static void __init proc_kcore_text_init(void)
 {
-	kclist_add(&kcore_text, _text, _end - _text, KCORE_TEXT);
+	kclist_add(&kcore_text, _stext, _etext - _stext, KCORE_TEXT);
 }
 #else
 static void __init proc_kcore_text_init(void)