Re: [PATCH v3 0/9] klp-convert livepatch build tooling

From: Miroslav Benes
Date: Tue Apr 16 2019 - 07:37:20 EST



[...]

> Current behavior
> ----------------
>
> Not good. The livepatch successfully builds but crashes on load:
>
> % insmod lib/livepatch/test_klp_static_keys_mod.ko
> % insmod lib/livepatch/test_klp_static_keys.ko
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> #PF error: [normal kernel read fault]
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP PTI
> CPU: 3 PID: 9367 Comm: insmod Tainted: G E K 5.1.0-rc4+ #4
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
> RIP: 0010:jump_label_apply_nops+0x3b/0x60
> Code: 02 00 00 48 c1 e5 04 48 01 dd 48 39 eb 74 3a 72 0b eb 36 48 83 c3 10 48 39 dd 76 2d 48 8b 43 08 48 89 c2 83 e0 01 48 83 e2 fc <48> 8b 54 13 10 83 e2 01 38 c2 75 dd 48 89 df 31 f6 48 83 c3 10 e8
> RSP: 0018:ffffa8874068fcf8 EFLAGS: 00010206
> RAX: 0000000000000000 RBX: ffffffffc07fd000 RCX: 000000000000000d
> RDX: 000000003f803000 RSI: ffffffffa5077be0 RDI: ffffffffc07fe540
> RBP: ffffffffc07fd0a0 R08: ffffa88740f43878 R09: ffffa88740eed000
> R10: 0000000000055a4b R11: ffffa88740f43878 R12: ffffa88740f430b8
> R13: 0000000000000000 R14: ffffa88740f42df8 R15: 0000000000042b01
> FS: 00007f4f1dafb740(0000) GS:ffff9a81fbb80000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000010 CR3: 00000000b5d8a000 CR4: 00000000000006e0
> Call Trace:
> module_finalize+0x184/0x1c0
> load_module+0x1400/0x1910
> ? kernel_read_file+0x18d/0x1c0
> ? __do_sys_finit_module+0xa8/0x110
> __do_sys_finit_module+0xa8/0x110
> do_syscall_64+0x55/0x1a0
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f4f1cae82bd
>
>
> Future work
> -----------
>
> At the very least, I think this call-chain ordering is wrong for
> livepatch static key symbols:
>
> load_module
>
>   apply_relocations
>
>   post_relocation
>     module_finalize
>       jump_label_apply_nops                              <<
>
>   ...
>
>   prepare_coming_module
>     blocking_notifier_call_chain(&module_notify_list, MODULE_STATE_COMING, mod);
>       jump_label_module_notify
>         case MODULE_STATE_COMING
>           jump_label_add_module                          <<
>
>   do_init_module
>
>     do_one_initcall(mod->init)
>       __init patch_init [kpatch-patch]
>         klp_register_patch
>           klp_init_patch
>             klp_for_each_object(patch, obj)
>               klp_init_object
>                 klp_init_object_loaded
>                   klp_write_object_relocations           <<
>
>     blocking_notifier_call_chain(&module_notify_list, MODULE_STATE_LIVE, mod);
>       jump_label_module_notify
>         case MODULE_STATE_LIVE
>           jump_label_invalidate_module_init
>
> where klp_write_object_relocations() is called way *after*
> jump_label_apply_nops() and jump_label_add_module().

I only took a quick look, but this seems quite similar to the problem we
had with apply_alternatives(). See arch/x86/kernel/livepatch.c and the
commit which introduced it.

I think we should do the same for jump labels: add a
jump_label_apply_nops() call, like the one in module_finalize(), to
arch_klp_init_object_loaded() and convert the __jump_table ELF section so
that its processing is delayed.
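
To make the ordering argument concrete, here is a minimal userspace sketch.
It is a toy model, not kernel code: every toy_* name and struct is
hypothetical, and the "nop" decision is heavily simplified. It only
illustrates that the jump label pass has to read through key pointers which
stay NULL until the klp relocations are written, so it has to run after
them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a jump table entry; NOT the kernel's struct jump_entry. */
struct toy_jump_entry {
	const int *key;		/* filled in by a klp relocation; NULL until then */
	bool nop_applied;
};

/* Toy stand-in for a patched object whose jump table is processed late. */
struct toy_klp_object {
	struct toy_jump_entry *entries;
	size_t nr_entries;
};

/* Models klp_write_object_relocations(): resolve every key pointer. */
void toy_write_relocations(struct toy_klp_object *obj, const int *key)
{
	assert(obj != NULL);
	for (size_t i = 0; i < obj->nr_entries; i++)
		obj->entries[i].key = key;
}

/*
 * Models jump_label_apply_nops(): it has to read through each key pointer.
 * Returns 0 on success, or -1 at the point where an unresolved (NULL) key
 * would be dereferenced.
 */
int toy_apply_nops(struct toy_klp_object *obj)
{
	assert(obj != NULL);
	for (size_t i = 0; i < obj->nr_entries; i++) {
		if (obj->entries[i].key == NULL)
			return -1;
		/* Simplification: a disabled key (value 0) gets a nop. */
		obj->entries[i].nop_applied = (*obj->entries[i].key == 0);
	}
	return 0;
}

/* Models the proposed ordering: relocations first, jump labels second. */
int toy_arch_klp_init_object_loaded(struct toy_klp_object *obj, const int *key)
{
	toy_write_relocations(obj, key);
	return toy_apply_nops(obj);
}
```

Calling toy_apply_nops() on an object whose relocations have not been
written yet fails, which mirrors the module_finalize() ordering in the
quoted call chain; going through toy_arch_klp_init_object_loaded()
succeeds.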

Which leads me to another TODO... klp-convert does not convert the
.altinstructions and .parainstructions sections either, so it has the same
problem there. If I remember correctly, this was on Josh's TODO list when
he first introduced klp-convert. See cover.1477578530.git.jpoimboe@xxxxxxxxxxx

A selftest for the alternatives would be appreciated too. One day.

And of course we should look at the other supported architectures and
their module_finalize() functions. I have it on my TODO list somewhere,
but you know how it works with those :/. I am sure there are more hidden
surprises there.


> Detection
> ---------
>
> I have been tinkering with some prototype code to defer
> jump_label_apply_nops() and jump_label_add_module(), but it has been
> slow going. I think the gist of it is that we're going to need to call
> these dynamically when individual klp_objects are patched, not when the
> livepatch module itself loads. If anyone with static key expertise
> wants to jump in here, let me know.
>
> In the meantime, I cooked up a potential followup commit to detect
> conversion of static key symbols and klp-convert failure. It basically
> runs through the output .ko's ELF symbols and verifies that none of the
> converted ones can be found as a .rela__jump_table relocated symbol. It
> accurately catches the problematic references in test_klp_static_keys.ko
> thus far.
>
> This was based on a similar issue reported as a bug against
> kpatch-build, in which Josh wrote code to detect this scenario:
>
> https://github.com/dynup/kpatch/issues/946
> https://github.com/jpoimboe/kpatch/commit/2cd2d27607566aee9590c367e615207ce1ce24c6
>
> I can post ("livepatch/klp-convert: abort on static key conversion")
> here as a follow-up commit if it looks reasonable and folks wish to review
> it... or we can try and tackle static keys before merging klp-convert.

Good idea. I'd rather fix it properly, but I think that could be a lot of
work, so something like this patch makes sense in the meantime.
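
For illustration, the check described in the quoted text could be modelled
roughly as follows. This is only a sketch under stated assumptions: it does
not parse ELF, struct toy_rela and the toy_* helpers are hypothetical, and
the actual follow-up patch may work quite differently. It assumes the
caller has already collected the names of the converted symbols and a flat
list of (relocation section, symbol) pairs from the output .ko.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one relocation in the output .ko. */
struct toy_rela {
	const char *rela_sec;	/* name of the relocation section */
	const char *sym_name;	/* name of the symbol it refers to */
};

/* Returns 1 if name appears in the converted[] list, 0 otherwise. */
int toy_is_converted(const char *name, const char *const *converted, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(name, converted[i]) == 0)
			return 1;
	return 0;
}

/*
 * Counts the relocations in .rela__jump_table that reference a converted
 * symbol. A non-zero result means the conversion should be aborted.
 */
size_t toy_count_static_key_conversions(const struct toy_rela *relas,
					size_t nr_relas,
					const char *const *converted,
					size_t nr_converted)
{
	size_t hits = 0;

	assert(relas != NULL || nr_relas == 0);
	for (size_t i = 0; i < nr_relas; i++) {
		/* Only jump table relocations are of interest here. */
		if (strcmp(relas[i].rela_sec, ".rela__jump_table") != 0)
			continue;
		if (toy_is_converted(relas[i].sym_name, converted, nr_converted))
			hits++;
	}
	return hits;
}
```

A converted symbol that is only referenced from, say, .rela.text is not
counted; only a reference from .rela__jump_table is, which matches the
"none of the converted ones can be found as a .rela__jump_table relocated
symbol" rule above.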

Thanks
Miroslav