Re: Commit 282d8998e997 (srcu: Prevent expedited GPs and blocking readers from consuming CPU) cause qemu boot slow

From: Paolo Bonzini
Date: Sun Jun 12 2022 - 15:24:02 EST


On 6/12/22 20:49, Paul E. McKenney wrote:

> > 1) kvm->irq_srcu is hardly relying on the "sleepable" part; it has readers
> > that are very very small, but it needs extremely fast detection of grace
> > periods; see commit 719d93cd5f5c ("kvm/irqchip: Speed up
> > KVM_SET_GSI_ROUTING", 2014-05-05) which split it off kvm->srcu. Readers are
> > not so frequent.
> >
> > 2) kvm->srcu is nastier because there are readers all the time. The
> > read-side critical sections are still short-ish, but they need the sleepable
> > part because they access user memory.
>
> Which one of these two is in play in this case?

The latter, kvm->srcu; though at boot time both are hammered on quite a bit (and then essentially not at all).

For the one involved it's still pretty rare for readers to sleep, but it cannot be excluded. Most critical sections are short, I'd guess in the thousands of clock cycles, but I can add some instrumentation tomorrow (or anyway before Tuesday).

> The problem was not internal to SRCU, but rather due to the fact
> that kernel live patching (KLP) had problems with the CPU-bound tasks
> resulting from repeated synchronize_rcu_expedited() invocations.

I see. Perhaps only add to the back-to-back counter if the synchronize_srcu_expedited() takes longer than a jiffy? This would indirectly check if synchronize_srcu_expedited() readers are actually blocking. KVM uses synchronize_srcu_expedited() because it expects it to take very little time (again I'll get hard numbers asap).
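
To make the idea concrete, here is a rough user-space sketch of the heuristic, not kernel code. All names (struct gp_state, gp_record, MAX_BACK_TO_BACK) and the threshold value are made up for illustration, and when the counter gets reset is left out on purpose.

```c
#include <stdbool.h>

/* Hypothetical threshold, chosen only for illustration. */
#define MAX_BACK_TO_BACK 10

/* Hypothetical per-srcu_struct state; not the kernel's layout. */
struct gp_state {
	unsigned int n_back_to_back;	/* expedited GPs that were slow */
};

/*
 * Record one expedited grace period that started at jiffy 'start' and
 * finished at jiffy 'end'.  Only a grace period spanning more than one
 * jiffy bumps the counter; a fast one leaves it untouched, on the
 * assumption that its readers were not blocking.
 *
 * Returns true while expediting should still be allowed.
 */
static bool gp_record(struct gp_state *st, unsigned long start,
		      unsigned long end)
{
	/* Unsigned subtraction stays correct across counter wraparound. */
	if (end - start > 1)
		st->n_back_to_back++;

	return st->n_back_to_back < MAX_BACK_TO_BACK;
}
```

With this, a burst of fast grace periods like the one KVM produces at boot would never reach the threshold, while repeated slow ones still would.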

Paolo