RE: [PATCH RFC 0/2] hyperv: Use raw_spinlock_t when not sleepable

From: Michael Kelley (LINUX)
Date: Tue Aug 15 2023 - 14:46:35 EST


From: Mitchell Levy via B4 Relay <devnull+levymitchell0.gmail.com@xxxxxxxxxx> Sent: Wednesday, August 9, 2023 3:37 PM
>
> When compiled with PREEMPT_RT, spinlock_t is sleepable. While I did not
> observe this causing any lockups on my system, it did cause warnings to
> be emitted when compiled with lock debugging. This series fixes some
> instances where spinlock_t is used in non-sleepable contexts, though it
> almost certainly does not find every such instance.
>
> An example of the warning raised by lockdep:
> =============================
> [BUG: Invalid wait context ]
> 6.5.0-rc1+ #16 Not tainted
> -----------------------------
> swapper/0/1 is trying to lock:
> ffffa05a014d64c0 (&channel→sched_lock) {...}-{3:3}, at: vmbus_isr+0x179/0x320
> other info that might help us debug this:
> context-{2:2}
> 2 locks held by swapper/0/1:
> #0: ffffffff909a9c70 (misc_mtx){+.+.}-{4:4}, at: misc_register+0x34/0x180
> #1: ffffffff9079b4c8 (rcu_read_lock) {...}-{1:3}, at: rcu_lock_acquire+0x0/0x40
> stack backtrace:
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc1+ #16
> Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V
> UEFI Release v4.1 05/09/2022
>
> Signed-off-by: Mitchell Levy <levymitchell0@xxxxxxxxx>

Getting the Hyper-V specific Linux guest code being able to work correctly
with PREEMPT_RT is an interesting requirement that we obviously haven't
dealt with yet. Unfortunately, converting the channel->sched_lock to
a raw spin lock doesn't fully solve the problem. Once that conversion is
done, there are multiple places where kmalloc() or a relative could be
called with GFP_ATOMIC while holding the sched_lock. While such
allocations are allowed in a normal kernel, they are not allowed in a
PREEMPT_RT kernel.

One such place is in hv_compose_msi_msg(), where
hv_pci_onchannelcallback() is invoked while holding the sched_lock.
hv_pci_onchannelcallback() does memory allocations.

Another place is in vmbus_chan_sched() where the onchannel_callback
function is directly invoked while holding the lock if HV_CALL_ISR is set.
HV_CALL_ISR is set for the uio_hv_generic.c driver, and for the netvsc driver.
I didn't follow all the code paths in the netvsc driver, but I suspect there
are places where the netvsc driver callback function does memory
allocations. I did look at the hv_uio_generic.c driver, and I'm pretty
sure a memory allocation could be done by uio_event_notify() when
sending a signal to the user space process.

Unfortunately, none of these places have easy fixes for the memory
allocation requirement. Getting the Hyper-V specific code to work
with PREEMPT_RT will likely require some non-trivial redesign.

Given these limitations, I'm not sure if making this change is
worthwhile. I'm not 100% clear on your goals. If it is simply to
eliminate the lockdep warnings, then perhaps there's an
argument to be made in favor of the changes. I'm open to
further discussion on the topic.

Michael

> ---
> Mitchell Levy (2):
> Use raw_spinlock_t for vmbus_channel.sched_lock
> Use raw_spinlock_t in vmbus_requestor
>
> drivers/hv/channel.c | 6 +++---
> drivers/hv/channel_mgmt.c | 2 +-
> drivers/hv/vmbus_drv.c | 4 ++--
> drivers/pci/controller/pci-hyperv.c | 6 +++---
> include/linux/hyperv.h | 8 ++++----
> 5 files changed, 13 insertions(+), 13 deletions(-)
> ---
> base-commit: 14f9643dc90adea074a0ffb7a17d337eafc6a5cc
> change-id: 20230807-b4-rt_preempt-fix-35a65c90c6c9
>
> Best regards,
> --
> Mitchell Levy <levymitchell0@xxxxxxxxx>