kdump always hangs in rcu_barrier() -> wait_for_completion()

From: Dexuan Cui
Date: Tue Nov 24 2020 - 23:57:52 EST


Hi,
I happened to hit a kdump hang issue in a Linux VM running on a
Hyper-V host. Please see the attached log: the kdump kernel always hangs,
even if I configure the VM with only 1 virtual CPU.

I first hit the issue in RHEL 8.3's 4.18.x kernel, but later I found that
the latest upstream v5.10-rc5 also has the same issue (at least the
symptom is exactly the same), so I dug into v5.10-rc5 and found that the
kdump kernel always hangs in kernel_init() -> mark_readonly() ->
rcu_barrier() -> wait_for_completion(&rcu_state.barrier_completion).

Let's take the 1-vCPU case for example (refer to the attached log): in the
code below, rcu_segcblist_n_cbs() somehow returns a nonzero value, so the call
smp_call_function_single(cpu, rcu_barrier_func, (void *)cpu, 1) runs
rcu_barrier_func(), which increases rcu_state.barrier_cpu_count from 2 to 3.
Hence the counter is still 1 after the atomic_sub_and_test(2, ...), and
complete() is not called.

static void rcu_barrier_func(void *cpu_in)
{
	...
	if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head)) {
		atomic_inc(&rcu_state.barrier_cpu_count);
	} else {
		...
	}

void rcu_barrier(void)
{
	...
	atomic_set(&rcu_state.barrier_cpu_count, 2);
	...
	for_each_possible_cpu(cpu) {
		rdp = per_cpu_ptr(&rcu_data, cpu);
		...
		if (rcu_segcblist_n_cbs(&rdp->cblist) && cpu_online(cpu)) {
			...
			smp_call_function_single(cpu, rcu_barrier_func, (void *)cpu, 1);
			...
		}
	}
	...
	if (atomic_sub_and_test(2, &rcu_state.barrier_cpu_count))
		complete(&rcu_state.barrier_completion);

	...
	wait_for_completion(&rcu_state.barrier_completion);

Sorry for my ignorance of RCU -- I'm not sure why rcu_segcblist_n_cbs()
returns 1 here. In the normal kernel it returns 0, so the normal kernel does
not hang.

Note: in the case of the kdump kernel, if I remove the kernel parameter
console=ttyS0 OR if I build the kernel with CONFIG_HZ=250, the issue no
longer reproduces. Currently my kernel uses CONFIG_HZ=1000 and I use
console=ttyS0, so I'm able to reproduce the issue every time.

Note: the same kernel binary does not reproduce the issue when the VM
runs on another Hyper-V host.

It looks like there is some kind of race condition?

Looking forward to your insights!

I'm happy to test any patch or enable more tracing, if necessary. Thanks!

Thanks,
-- Dexuan

Attachment: bad-hz-1000.log