Re: [PATCH v2 05/11] sched,livepatch: Use wake_up_if_idle()

From: Vasily Gorbik
Date: Thu Oct 07 2021 - 05:19:07 EST


On Wed, Oct 06, 2021 at 11:16:21AM +0200, Miroslav Benes wrote:
> On Wed, 29 Sep 2021, Peter Zijlstra wrote:
>
> > Make sure to prod idle CPUs so they call klp_update_patch_state().
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > ---
> > kernel/livepatch/transition.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > --- a/kernel/livepatch/transition.c
> > +++ b/kernel/livepatch/transition.c
> > @@ -413,8 +413,11 @@ void klp_try_complete_transition(void)
> >  	for_each_possible_cpu(cpu) {
> >  		task = idle_task(cpu);
> >  		if (cpu_online(cpu)) {
> > -			if (!klp_try_switch_task(task))
> > +			if (!klp_try_switch_task(task)) {
> >  				complete = false;
> > +				/* Make idle task go through the main loop. */
> > +				wake_up_if_idle(cpu);
> > +			}
>
> Right, it should be enough.
>
> Acked-by: Miroslav Benes <mbenes@xxxxxxx>
>
> It would be nice to get Vasily's Tested-by tag on this one.

I gave the patches a spin on s390 with the livepatch kselftests as well as with
https://github.com/lpechacek/qa_test_klp.git

BTW, commit 43c79fbad385 ("klp_tc_17: Avoid running the test on
s390x") is no longer required, because s390 has implemented
HAVE_KPROBES_ON_FTRACE since v5.6, so I just reverted the test disablement.

Patches 1-6 work nicely; for them:

Acked-by: Vasily Gorbik <gor@xxxxxxxxxxxxx>
Tested-by: Vasily Gorbik <gor@xxxxxxxxxxxxx> # on s390

Thanks a lot!

Starting with patch 8, I see this with my config:

Oct 07 10:46:00 kernel: Freeing unused kernel image (initmem) memory: 6524K
Oct 07 10:46:00 kernel: INFO: task swapper/0:1 blocked for more than 122 seconds.
Oct 07 10:46:00 kernel: Not tainted 5.15.0-rc4-69810-ga714851e1aad-dirty #74
Oct 07 10:46:00 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 07 10:46:00 kernel: task:swapper/0 state:D stack:10648 pid: 1 ppid: 0 flags:0x00000000
Oct 07 10:46:00 kernel: Call Trace:
Oct 07 10:46:00 kernel: [<0000000000e164b6>] __schedule+0x36e/0x8b0
Oct 07 10:46:00 kernel: [<0000000000e16a4e>] schedule+0x56/0x128
Oct 07 10:46:00 kernel: [<0000000000e1e426>] schedule_timeout+0x106/0x160
Oct 07 10:46:00 kernel: [<0000000000e18316>] wait_for_completion+0xc6/0x118
Oct 07 10:46:00 kernel: [<000000000020a15c>] rcu_barrier.part.0+0x17c/0x2c0
Oct 07 10:46:00 kernel: [<0000000000e0fcc0>] kernel_init+0x60/0x168
Oct 07 10:46:00 kernel: [<000000000010390c>] __ret_from_fork+0x3c/0x58
Oct 07 10:46:00 kernel: [<0000000000e2094a>] ret_from_fork+0xa/0x30
Oct 07 10:46:00 kernel: 1 lock held by swapper/0/1:
Oct 07 10:46:00 kernel: #0: 0000000001469600 (rcu_state.barrier_mutex){+.+.}-{3:3}, at: rcu_barrier+0x42/0x80
Oct 07 10:46:00 kernel:
Showing all locks held in the system:
Oct 07 10:46:00 kernel: 1 lock held by swapper/0/1:
Oct 07 10:46:00 kernel: #0: 0000000001469600 (rcu_state.barrier_mutex){+.+.}-{3:3}, at: rcu_barrier+0x42/0x80
Oct 07 10:46:00 kernel: 2 locks held by kworker/u680:0/8:
Oct 07 10:46:00 kernel: #0: 000000008013cd48 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x222/0x738
Oct 07 10:46:00 kernel: #1: 0000038000043dc8 ((kfence_timer).work){+.+.}-{0:0}, at: process_one_work+0x222/0x738
Oct 07 10:46:00 kernel: 1 lock held by khungtaskd/413:
Oct 07 10:46:00 kernel: #0: 000000000145c980 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire.constprop.0+0x0/0x50
Oct 07 10:46:00 kernel:
Oct 07 10:46:00 kernel: =============================================

So I will keep an eye on the rest of these patches and re-test in the future, thanks!