Re: Linux 6.6-rc3 (DEBUG_VIRTUAL is unhappy on x86)

From: Sumit Garg
Date: Tue Oct 03 2023 - 08:06:39 EST


Hi Linus,

On 10/2/23 02:18, Linus Torvalds wrote:
On Sun, 1 Oct 2023 at 07:17, Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx> wrote:
Peter Zijlstra (1):
x86,static_call: Fix static-call vs return-thunk
Hello, the commit above caused a crash on an x86 kernel with
CONFIG_DEBUG_VIRTUAL=y.
OK, I looked into this a little bit, and it turns out that the problematic
address here is from cleanup_trusted() in
security/keys/trusted-keys/trusted_core.c.
(and it's builtin due to CONFIG_TRUSTED_KEYS=y)

The function is marked as __exit, so it does not fall within the
'core kernel text address range,' which is between _stext and _etext
(or between _sinittext and _einittext), and thus __text_poke() thinks that
it's in the vmalloc/module area.

I think __text_poke() should be taught that functions marked as __exit
also belong to kernel code just like __init.
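
A minimal user-space sketch of the range check discussed above. It is not
the patch from this thread and not the kernel's actual code: struct
text_range, struct image_layout, and both is_core_text_*() helpers are
hypothetical names for illustration, and the bounds stand in for linker
symbols such as _stext/_etext and _sinittext/_einittext. It only models
why an address in exit text misses a check that knows about text and
init text, and what teaching the check about exit text would change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open address range [start, end). */
struct text_range {
	uintptr_t start;
	uintptr_t end;
};

/* Hypothetical image layout; real bounds would come from linker symbols. */
struct image_layout {
	struct text_range text;		/* models _stext.._etext */
	struct text_range init_text;	/* models _sinittext.._einittext */
	struct text_range exit_text;	/* models the __exit section (assumed) */
};

static bool in_range(const struct text_range *r, uintptr_t addr)
{
	assert(r->start <= r->end);
	return addr >= r->start && addr < r->end;
}

/* The check as described: only text and init text count as kernel text. */
static bool is_core_text_before(const struct image_layout *l, uintptr_t addr)
{
	return in_range(&l->text, addr) || in_range(&l->init_text, addr);
}

/* The proposed idea: exit text is treated as kernel text as well. */
static bool is_core_text_after(const struct image_layout *l, uintptr_t addr)
{
	return is_core_text_before(l, addr) || in_range(&l->exit_text, addr);
}
```

With a layout whose exit_text covers the address of an __exit function,
the first helper rejects that address and the second accepts it, which
is the distinction the paragraph above describes.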
I think your patch is fine (well, whitespace-damaged, but conceptually good).

But I also wonder about that

static_call_cond(trusted_key_exit)();

in cleanup_trusted(). It seems all kinds of pointless to use static
calls for something that is done *once*. That's not an optimization,
that's honestly just _stupid_. It costs more to do the rewriting than
it does to just do the one dynamic indirect call.

That's true, there isn't any real performance benefit here. I raised the
same point when I was asked to incorporate static calls here [1]. On the
flip side, though, I think there are security benefits: we wouldn't want
an indirect branch speculation attack to leak the trusted key material.

[1] https://patchwork.kernel.org/project/keyrings/patch/1602065268-26017-2-git-send-email-sumit.garg@xxxxxxxxxx/#23683269

-Sumit


Side note: the same is true of the init-time call, which does

static_call_update(trusted_key_init,
trusted_key_sources[i].ops->init);
...
ret = static_call(trusted_key_init)();

which again is a *lot* more expensive than just doing the indirect
function call.

So while I don't think your patch is wrong, I do think that the cause
here is plain silly code, and that trusted key code simply should not
do the crazy thing it does (and that causes silly problems).

Linus
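
For illustration, a minimal user-space sketch of the alternative
described above: call init and exit once through an ordinary function
pointer instead of patching a static call. This is not the trusted-keys
code; struct key_ops, init_source(), cleanup_source(), and the demo_*
names are all hypothetical. The NULL check in cleanup_source() mirrors
the "call only if set" behaviour that static_call_cond() is used for in
the quoted snippet. The sketch does not address the indirect branch
speculation concern raised in the reply.

```c
#include <stddef.h>

/* Hypothetical ops table; stands in for a key source's init/exit hooks. */
struct key_ops {
	int (*init)(void);
	void (*exit)(void);
};

/* The source that initialised successfully, or NULL if none did. */
static const struct key_ops *active_ops;

/* Call init once through a plain indirect call and remember the source. */
static int init_source(const struct key_ops *ops)
{
	int ret;

	if (!ops || !ops->init)
		return -1;

	ret = ops->init();
	if (ret == 0)
		active_ops = ops;
	return ret;
}

/* Call exit once, only if a source is active and provides the hook. */
static void cleanup_source(void)
{
	if (active_ops && active_ops->exit)
		active_ops->exit();
	active_ops = NULL;
}

/* Demo hooks that just count how often they run. */
static int demo_init_calls;
static int demo_exit_calls;

static int demo_init(void)
{
	demo_init_calls++;
	return 0;
}

static void demo_exit(void)
{
	demo_exit_calls++;
}

static const struct key_ops demo_ops = {
	.init = demo_init,
	.exit = demo_exit,
};
```

Calling init_source(&demo_ops) and then cleanup_source() runs each hook
exactly once; a second cleanup_source() is a no-op because active_ops
has been cleared.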