Re: [PATCH] mutex: Report recursive ww_mutex locking early

From: Maarten Lankhorst
Date: Mon May 30 2016 - 05:43:38 EST


Op 30-05-16 om 11:11 schreef Peter Zijlstra:
> On Mon, May 30, 2016 at 09:43:53AM +0200, Maarten Lankhorst wrote:
>> Op 26-05-16 om 22:08 schreef Chris Wilson:
>>> Recursive locking for ww_mutexes was originally conceived as an
>>> exception. However, it is heavily used by the DRM atomic modesetting
>>> code. Currently, the recursive deadlock is checked after we have queued
>>> up for a busy-spin and as we never release the lock, we spin until
>>> kicked, whereupon the deadlock is discovered and reported.
>>>
>>> A simple solution for the now common problem is to move the recursive
>>> deadlock discovery to the first action when taking the ww_mutex.
>>>
>>> Testcase: igt/kms_cursor_legacy
> I've no idea what this tag is or where to find the actual testcase, so
> I've killed it.
https://cgit.freedesktop.org/xorg/app/intel-gpu-tools/

tests/kms_cursor_legacy tries to do as many updates as possible with SCHED_RR..

Patch not applied, SCHED_RR:
# ./kms_cursor_legacy
IGT-Version: 1.14-g9579e5447aa3 (x86_64) (Linux: 4.6.0-patser+ x86_64)
[3] count=86
[2] count=91
[1] count=78
[0] count=104
Subtest stress-bo: SUCCESS (22,372s)

Patch not applied, SCHED_NORMAL:
# ./kms_cursor_legacy
IGT-Version: 1.14-g9579e5447aa3 (x86_64) (Linux: 4.6.0-patser+ x86_64)
[2] count=4713
[0] count=4288
[3] count=4776
[1] count=4521
Subtest stress-bo: SUCCESS (21,492s)

Patch applied, NORMAL + RR give roughly same results:
# nfs/intel-gpu-tools/tests/kms_cursor_legacy
IGT-Version: 1.14-g9579e5447aa3 (x86_64) (Linux: 4.6.0-patser+ x86_64)
[0] count=77631
[1] count=77740
[3] count=77612
[2] count=77666
Subtest stress-bo: SUCCESS (21,487s)

>>> Suggested-by: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
>>> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
>>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>>> Cc: Christian König <christian.koenig@xxxxxxx>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
>>> Cc: linux-kernel@xxxxxxxxxxxxxxx
>>> ---
>>>
>>> Maarten suggested this as a simpler fix to the immediate problem. Imo,
>>> we still want to perform deadlock detection within the spin in order to
>>> catch more complicated deadlocks without osq_lock() forcing fairness!
>> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
>>
>> Should this be Cc: stable@xxxxxxxxxxxxxxx ?
> Can do; how far back?
>