Re: Lockups due to "locking/rwsem: Make handoff bit handling more consistent"

From: Waiman Long
Date: Fri Jun 17 2022 - 10:29:31 EST


On 6/17/22 09:43, Mel Gorman wrote:
Hi Waiman,

I've received reports of lockups happening in kernels including
commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
consistent"). The exact symptoms vary, but usually it's one of: a soft lockup
(older kernel with a backport), the task hanging and never exiting, or the
machine becoming generally unresponsive with ssh broken. The problem
started in 5.16 and was reliably bisected to commit d257cc8cb8d5. With the
patch reverted, 5.16, 5.17 and 5.18 finish the test successfully, but I didn't
test a revert on 5.19-rc2 because of other changes layered on top.

The reproducer is simple -- start pairs of CPU hogs pinned to a CPU with
different SCHED_RR priorities that run for a few seconds. It does not
hit every time but usually happens within 10 attempts. On 5.16 at least,
the tasks failed to exit and kept retrying to exit using the following path

[<0>] rwsem_down_write_slowpath+0x2ad/0x580
[<0>] unlink_file_vma+0x2c/0x50
[<0>] free_pgtables+0xbe/0x110
[<0>] exit_mmap+0xc1/0x220
[<0>] mmput+0x52/0x110
[<0>] do_exit+0x2ec/0xb00
[<0>] do_group_exit+0x2d/0x90
[<0>] get_signal+0xb6/0x920
[<0>] arch_do_signal_or_restart+0xba/0x700
[<0>] exit_to_user_mode_prepare+0xb7/0x230
[<0>] irqentry_exit_to_user_mode+0x5/0x20
[<0>] asm_sysvec_apic_timer_interrupt+0x12/0x20
[<0>] preempt_schedule_thunk+0x16/0x18
[<0>] rwsem_down_write_slowpath+0x2ad/0x580
[<0>] unlink_file_vma+0x2c/0x50
[<0>] free_pgtables+0xbe/0x110
[<0>] exit_mmap+0xc1/0x220
[<0>] mmput+0x52/0x110
[<0>] do_exit+0x2ec/0xb00
[<0>] do_group_exit+0x2d/0x90
[<0>] get_signal+0xb6/0x920
[<0>] arch_do_signal_or_restart+0xba/0x700
[<0>] exit_to_user_mode_prepare+0xb7/0x230
[<0>] irqentry_exit_to_user_mode+0x5/0x20
[<0>] asm_sysvec_apic_timer_interrupt+0x12/0x20

The C file and shell script to run it are attached.
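
For readers without the attachments, the reproducer described above can be sketched roughly as follows. This is not the attached C file: the function names (hog, run_pair), the priority values and the busy-wait loop are all assumptions made for illustration, and there is no guarantee this sketch triggers the lockup.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: pin the caller to `cpu`, request SCHED_RR at
 * priority `prio`, then busy-loop for roughly `secs` seconds. */
static void hog(int cpu, int prio, int secs)
{
    cpu_set_t set;
    struct sched_param sp = { .sched_priority = prio };
    struct timespec start, now;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        fprintf(stderr, "sched_setaffinity: %s\n", strerror(errno));

    /* Needs CAP_SYS_NICE (typically root). On failure the hog keeps
     * running under its current scheduling policy. */
    if (sched_setscheduler(0, SCHED_RR, &sp) != 0)
        fprintf(stderr, "sched_setscheduler: %s\n", strerror(errno));

    clock_gettime(CLOCK_MONOTONIC, &start);
    do {
        clock_gettime(CLOCK_MONOTONIC, &now);
    } while (now.tv_sec - start.tv_sec < secs);
}

/* Hypothetical driver: fork one pair of hogs on the same CPU with two
 * different SCHED_RR priorities and wait for both to exit.
 * Returns 0 if both children exited cleanly, -1 otherwise. */
int run_pair(int cpu, int prio_a, int prio_b, int secs)
{
    int prios[2] = { prio_a, prio_b };
    pid_t pids[2];
    int failed = 0;

    for (int i = 0; i < 2; i++) {
        pids[i] = fork();
        if (pids[i] < 0) {
            failed = 1;
            continue;
        }
        if (pids[i] == 0) {
            hog(cpu, prios[i], secs);
            _exit(0);
        }
    }

    for (int i = 0; i < 2; i++) {
        int status;

        if (pids[i] < 0)
            continue;
        if (waitpid(pids[i], &status, 0) < 0 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }
    return failed ? -1 : 0;
}
```

A caller would invoke something like run_pair(0, 10, 20, 3) in a loop, as root, since the report says the hang usually shows up within 10 attempts. On an affected kernel the waitpid() calls may simply never return.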

Thanks for the reproducer and I will try to reproduce it locally.

This is a known issue; I have received a similar report from an Oracle engineer. That is the reason I posted commit 1ee326196c66 ("locking/rwsem: Always try to wake waiters in out_nolock path"), which was merged in v5.19. I believe it helps, but it may not eliminate all possible race conditions. To make rwsem behave more like it did before commit d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent"), I posted a follow-up patch:

https://lore.kernel.org/lkml/20220427173124.1428050-1-longman@xxxxxxxxxx/

It has not been reviewed yet.

I will try your reproducer to see if these patches are able to address the lockup problem.

Thanks,
Longman
