Re: [PATCH] kernel: Introduce a write lock/unlock wrapper for tasklist_lock

From: Aiqun Yu (Maria)
Date: Tue Dec 26 2023 - 20:41:58 EST

On 12/26/2023 6:46 PM, Hillf Danton wrote:
On Wed, 13 Dec 2023 12:27:05 -0600 Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
Matthew Wilcox <willy@xxxxxxxxxxxxx> writes:
On Wed, Dec 13, 2023 at 06:17:45PM +0800, Maria Yu wrote:
+static inline void write_lock_tasklist_lock(void)
+{
+	while (1) {
+		local_irq_disable();
+		if (write_trylock(&tasklist_lock))
+			break;
+		local_irq_enable();
+		cpu_relax();

This is a bad implementation though. You don't set the _QW_WAITING flag
so readers don't know that there's a pending writer. Also, I've seen
cpu_relax() pessimise CPU behaviour; putting it into a low-power mode
that takes a while to wake up from.

I think the right way to fix this is to pass a boolean flag to
queued_write_lock_slowpath() to let it know whether it can re-enable
interrupts while checking whether _QW_WAITING is set.

lock(&lock->wait_lock)
enable irq
int
lock(&lock->wait_lock)

You are adding chance for recursive locking.
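
The sequence above can be modelled in user space as a rough sketch. It is not kernel code: the atomic_flag stands in for lock->wait_lock, the "interrupt" is an ordinary function call, and every model_* name is hypothetical. The handler uses a trylock so the model can report the would-be deadlock instead of spinning forever.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a non-recursive spinlock such as lock->wait_lock. */
static bool model_trylock(atomic_flag *l)
{
	/* test_and_set returns the previous value: false means we took the lock. */
	return !atomic_flag_test_and_set_explicit(l, memory_order_acquire);
}

static void model_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/*
 * Models an interrupt handler that wants the same wait_lock.
 * Returns true when the lock is already held by the interrupted context,
 * i.e. when a real spinning lock() would never return.
 */
static bool model_irq_handler_would_deadlock(atomic_flag *l)
{
	if (model_trylock(l)) {
		model_unlock(l);
		return false;
	}
	return true;
}

/*
 * Models the writer slowpath: take wait_lock, then optionally let an
 * "interrupt" fire while the lock is still held.
 */
static bool model_writer_slowpath(bool irqs_enabled_while_held)
{
	atomic_flag wait_lock = ATOMIC_FLAG_INIT;
	bool deadlock = false;
	bool got = model_trylock(&wait_lock);

	assert(got);	/* the lock starts out free */
	(void)got;

	if (irqs_enabled_while_held)
		deadlock = model_irq_handler_would_deadlock(&wait_lock);

	model_unlock(&wait_lock);
	return deadlock;
}
```

In this model, model_writer_slowpath(true) reports the recursive acquisition and model_writer_slowpath(false) does not. Whether an interrupt handler can actually reach lock->wait_lock for a given lock is a separate question.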

Thanks for raising the deadlock possibility. I think it can be broken down into the two scenarios below:
1. queued_write_lock_slowpath() being triggered in interrupt context.
tasklist_lock is never taken with write_lock_irq(save) in interrupt context.
For rwlocks in general, though, write_lock_irq(save) in interrupt context may be possible,
so we could introduce a state in which lock->wait_lock is released while the _QW_WAITING flag stays set.
Suggestions and comments on the design are welcome.

2. queued_read_lock_slowpath() can be triggered in interrupt context, and it already handles that case to avoid a possible deadlock.
queued_read_lock_slowpath() checks whether the current context is an interrupt and, if so, takes the lock directly when a writer is only waiting (not yet holding the lock).

Please see [1]:
	/*
	 * Readers come here when they cannot get the lock without waiting
	 */
	if (unlikely(in_interrupt())) {
		/*
		 * Readers in interrupt context will get the lock immediately
		 * if the writer is just waiting (not holding the lock yet),
		 * so spin with ACQUIRE semantics until the lock is available
		 * without waiting in the queue.
		 */
		atomic_cond_read_acquire(&lock->cnts, !(VAL & _QW_LOCKED));
		return;
	}

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/locking/qrwlock.c
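
To illustrate the check quoted above, here is a small user-space sketch of how the writer bits in lock->cnts decide whether a reader may proceed. The MODEL_* constants are meant to mirror _QW_WAITING, _QW_LOCKED, _QW_WMASK and _QR_BIAS from include/asm-generic/qrwlock.h (please verify against your tree), and the model_* helpers are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the writer/reader fields of the qrwlock count word. */
#define MODEL_QW_WAITING 0x100u		/* a writer is waiting */
#define MODEL_QW_LOCKED  0x0ffu		/* a writer holds the lock */
#define MODEL_QW_WMASK   0x1ffu		/* any writer bit */
#define MODEL_QR_BIAS    (1u << 9)	/* one reader */

/* Task-context readers must queue as soon as any writer bit is set. */
static bool model_task_reader_may_proceed(unsigned int cnts)
{
	return !(cnts & MODEL_QW_WMASK);
}

/* Interrupt-context readers only wait for a writer that holds the lock. */
static bool model_irq_reader_may_proceed(unsigned int cnts)
{
	return !(cnts & MODEL_QW_LOCKED);
}

/* Walks through the three writer states with one reader already counted. */
static void model_self_check(void)
{
	unsigned int no_writer = MODEL_QR_BIAS;
	unsigned int waiting   = MODEL_QR_BIAS | MODEL_QW_WAITING;
	unsigned int locked    = MODEL_QW_LOCKED;

	assert(model_task_reader_may_proceed(no_writer));
	assert(model_irq_reader_may_proceed(no_writer));

	/* Writer only waiting: the interrupt reader gets in, the task reader queues. */
	assert(!model_task_reader_may_proceed(waiting));
	assert(model_irq_reader_may_proceed(waiting));

	/* Writer holds the lock: both kinds of reader must wait. */
	assert(!model_task_reader_may_proceed(locked));
	assert(!model_irq_reader_may_proceed(locked));
}
```

The middle case is the one that matters for scenario 2: with only _QW_WAITING set, an interrupt-context reader does not wait behind the pending writer.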

Yes. It seems to make sense to distinguish between write_lock_irq and
write_lock_irqsave and fix this for all of write_lock_irq.

Either that or someone can put in the work to start making the
tasklist_lock go away.

Eric

--
Thx and BRs,
Aiqun(Maria) Yu