Re: CONFIG_PROVE_RAW_LOCK_NESTING false positive?

From: David Woodhouse
Date: Thu Nov 23 2023 - 10:15:32 EST


On 23 November 2023 15:13:45 GMT, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>On Thu, Nov 23, 2023 at 03:05:15PM +0000, David Woodhouse wrote:
>> On 23 November 2023 15:01:19 GMT, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> >On Thu, Nov 23, 2023 at 09:00:41AM +0000, David Woodhouse wrote:
>> >> Is this telling me that I'm no longer allowed to take a read_lock() in
>> >> a callback for an HRTIMER_MODE_ABS_HARD timer? Is that intentional?
>> >>
>> >> If I must, I can probably cope with this by using read_trylock()
>> >> instead. The object being locked is a cache, and we opportunistically
>> >> try to use it from the fast path but fall back to a slow path in
>> >> process context which will revalidate and try again. So if someone
>> >> *has* taken the write lock, it's a fairly safe bet that the cache is
>> >> going to be invalidated and we were going to take the slow path anyway.
>> >>
>> >> [ 62.336965] =============================
>> >> [ 62.336973] [ BUG: Invalid wait context ]
>> >> [ 62.336992] 6.7.0-rc1+ #1437 Tainted: G I
>> >> [ 62.337001] -----------------------------
>> >> [ 62.337008] qemu-system-x86/1935 is trying to lock:
>> >> [ 62.337017] ffffc900018fecc0 (&gpc->lock){....}-{3:3}, at: kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
>> >> [ 62.337133] other info that might help us debug this:
>> >> [ 62.337142] context-{2:2}
>> >> [ 62.337148] 2 locks held by qemu-system-x86/1935:
>> >> [ 62.337156] #0: ffff888108f780b0 (&vcpu->mutex){+.+.}-{4:4}, at: kvm_vcpu_ioctl+0x7f/0x730 [kvm]
>> >> [ 62.337239] #1: ffffc900018ff2d8 (&kvm->srcu){.?.+}-{0:0}, at: kvm_xen_set_evtchn_fast+0xcd/0x460 [kvm]
>> >> [ 62.337339] stack backtrace:
>> >> [ 62.337346] CPU: 7 PID: 1935 Comm: qemu-system-x86 Tainted: G I 6.7.0-rc1+ #1437
>> >> [ 62.337370] Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
>> >> [ 62.337384] Call Trace:
>> >> [ 62.337390] <IRQ>
>> >> [ 62.337395] dump_stack_lvl+0x57/0x90
>> >> [ 62.337407] __lock_acquire+0x7bb/0xbb0
>> >> [ 62.337416] ? __lock_acquire+0x4f0/0xbb0
>> >> [ 62.337425] lock_acquire.part.0+0xad/0x240
>> >> [ 62.337433] ? kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
>> >> [ 62.337512] ? rcu_is_watching+0xd/0x40
>> >> [ 62.337520] ? lock_acquire+0xf2/0x110
>> >> [ 62.337529] __raw_read_lock_irqsave+0x4e/0xa0
>> >> [ 62.337538] ? kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
>> >> [ 62.337604] kvm_xen_set_evtchn_fast+0xe7/0x460 [kvm]
>> >> [ 62.337669] ? kvm_xen_set_evtchn_fast+0xcd/0x460 [kvm]
>> >> [ 62.337734] xen_timer_callback+0x86/0xc0 [kvm]
>> >
>> >xen_timer_callback uses HRTIMER_MODE_ABS_HARD, which means it will
>> >still run in hard IRQ context even under PREEMPT_RT.
>> >
>> >OTOH read_lock_irqsave() is not a raw spinlock and will be turned into a
>> >blocking lock.
>> >
>> >This then gives scheduling from IRQ context, which is somewhat frowned
>> >upon.
>> >
>> >Warning is real and valid.
>>
>>
>> ... or at least will be when PREEMPT_RT turns the read_lock into a mutex?
>
>Right, this check specifically validates the RT lock nesting rules.
>
>> But there is no raw version of read_lock(). Can we have one please?
>
>Should be possible, but it is somewhat non-trivial; it is very easy to
>create significant latencies with RW locks. Definitely not something I'm
>going to be able to do in a hurry.
>
>Also, I suspect Thomas is going to strongly suggest not going down that
>road, and instead looking at whether this can be solved differently.

That's the "If I must…" paragraph above. I'll hack it up. Thanks.