Re: [PATCH v3] sched/debug: Use sched_debug_lock to serialize use of cgroup_path[] only

From: Waiman Long
Date: Sun Apr 04 2021 - 21:27:28 EST


On 4/4/21 12:02 PM, Steven Rostedt wrote:
On Fri, 2 Apr 2021 23:09:09 -0400
Waiman Long <longman@xxxxxxxxxx> wrote:

The main problem with sched_debug_lock is that under certain
circumstances, a lock waiter may wait a long time to acquire the lock
(in seconds). We can't insert touch_nmi_watchdog() while the cpu is
waiting for the spinlock.
The problem I have with the patch is that it seems to be a hack (as it
doesn't fix the issue in all cases). Since sched_debug_lock is
"special", perhaps we can add wrappers to take it, and instead of doing
the spin_lock_irqsave(), do a trylock loop. Add lockdep annotation to
tell lockdep that this is not a try lock (so that it can still detect
deadlocks).

Then have the strategically placed touch_nmi_watchdog() also increment
a counter. Then in that trylock loop, if it sees the counter get
incremented, it knows that forward progress is being made by the lock
holder, and it too can call touch_nmi_watchdog().

Thanks for the suggestion, but it also sound complicated.

I think we can fix this lockup problem if we are willing to lose some information in case of contention. As you have suggested, a trylock will be used to acquire sched_debug_lock. If succeeded, all is good. Otherwise, a shorter stack buffer will be used for cgroup path. The path may be truncated in this case. If we detect that the full length of the buffer is used, we assume truncation and add, e.g. "...", to indicate there is more to the actual path.

Do you think this is an acceptable comprise?

Cheers,
Longman