Re: [PATCH] counter: drop chrdev_lock

From: David Lechner
Date: Mon Oct 18 2021 - 12:14:19 EST


On 10/18/21 4:51 AM, William Breathitt Gray wrote:
On Mon, Oct 18, 2021 at 11:13:16AM +0200, Greg KH wrote:
On Mon, Oct 18, 2021 at 05:58:37PM +0900, William Breathitt Gray wrote:
On Mon, Oct 18, 2021 at 08:08:21AM +0200, Greg KH wrote:
On Sun, Oct 17, 2021 at 01:55:21PM -0500, David Lechner wrote:
This removes the chrdev_lock from the counter subsystem. The lock was
intended to prevent opening the chrdev more than once. However, this
doesn't work in practice since userspace can duplicate file descriptors
and pass file descriptors to other processes. Since this protection
can't be relied on, it is best to just remove it.
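
To illustrate the point about duplicated descriptors, here is a minimal
userspace sketch (not part of the patch). It assumes a scratch file under
/tmp rather than a real counter chrdev, and the function name is
hypothetical. It performs a single open, duplicates the descriptor with
dup(), and checks that both descriptors share one open file description,
so any "single open" check done at open time never sees the second
descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Open a scratch file once, duplicate the descriptor, and report whether
 * both descriptors share one open file description (i.e. one file offset).
 * Returns 1 if they do, 0 if they do not, -1 on error.
 */
int dup_shares_open_file(void)
{
	char path[] = "/tmp/dup-demo-XXXXXX";	/* hypothetical scratch file */
	int fd = mkstemp(path);			/* the only open in this program */
	int fd2;
	off_t pos;

	if (fd < 0)
		return -1;
	unlink(path);				/* file disappears once closed */

	fd2 = dup(fd);				/* second descriptor, no new open */
	if (fd2 < 0) {
		close(fd);
		return -1;
	}
	assert(fd2 != fd);			/* two distinct descriptor numbers */

	if (write(fd, "abcd", 4) != 4) {
		close(fd2);
		close(fd);
		return -1;
	}

	/* fd2 never wrote anything, yet its offset has advanced as well. */
	pos = lseek(fd2, 0, SEEK_CUR);

	close(fd2);
	close(fd);
	return pos == 4;
}
```

Because the duplicate refers to the same open file description, a driver
that counts opens would still see exactly one, while two descriptors (and
potentially two processes) can issue reads and ioctls.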

Much better, thanks!

One remaining question:

--- a/include/linux/counter.h
+++ b/include/linux/counter.h
@@ -297,7 +297,6 @@ struct counter_ops {
  * @events: queue of detected Counter events
  * @events_wait: wait queue to allow blocking reads of Counter events
  * @events_lock: lock to protect Counter events queue read operations
- * @chrdev_lock: lock to limit chrdev to a single open at a time
  * @ops_exist_lock: lock to prevent use during removal

Why do you still need 2 locks for the same structure?

thanks,

greg k-h

Originally there was only the events_lock mutex. Initially I tried using
it to also limit the chrdev to a single open, but then came across a
"lock held when returning to user space" warning:
https://lore.kernel.org/linux-arm-kernel/YOq19zTsOzKA8v7c@shinobu/T/#m6072133d418d598a5f368bb942c945e46cfab9a5

Instead of losing the benefits of a mutex lock for protecting the
events, I ultimately implemented the chrdev_lock separately as an
atomic_t. If the chrdev_lock is removed, then we'll use events_lock
solely from now on for this structure.

chrdev_lock should be removed; it doesn't really do what you think it
does, as per the thread yesterday :)

So does this mean you can also drop the ops_exist_lock?

thanks,

greg k-h

When counter_unregister is called, the ops member is set to NULL to
indicate that the driver will be removed and that no new device
operations should occur (because the ops callbacks will no longer be
valid). The ops_exist_lock is used to allow existing ops callback
dereferences to complete before the driver is removed so that we do not
suffer a page fault.
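
The pattern described above can be sketched in userspace roughly as
follows. This is only an illustration under stated assumptions: a pthread
mutex stands in for the kernel mutex, and every demo_* name is
hypothetical rather than taken from the counter subsystem. Callers take
the lock, check that ops is still set, and only then call through it.
Unregistering takes the same lock before clearing ops, so it waits for
any callback already in progress.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver callbacks. */
struct demo_ops {
	int (*count_read)(void);
};

/* Hypothetical stand-in for the counter device. */
struct demo_counter {
	const struct demo_ops *ops;	/* NULL once the driver is going away */
	pthread_mutex_t ops_exist_lock;	/* serialises callers against removal */
};

static int demo_count_read(void)
{
	return 42;			/* arbitrary value for the sketch */
}

static const struct demo_ops demo_driver_ops = {
	.count_read = demo_count_read,
};

void demo_register(struct demo_counter *c)
{
	pthread_mutex_init(&c->ops_exist_lock, NULL);
	c->ops = &demo_driver_ops;
}

/* Call into the driver only while holding the lock and after checking ops. */
int demo_read(struct demo_counter *c, int *val)
{
	int ret = 0;

	assert(val != NULL);

	pthread_mutex_lock(&c->ops_exist_lock);
	if (!c->ops)
		ret = -ENODEV;		/* driver already unregistered */
	else
		*val = c->ops->count_read();
	pthread_mutex_unlock(&c->ops_exist_lock);

	return ret;
}

/* Waits for any in-flight callback to finish, then forbids new ones. */
void demo_unregister(struct demo_counter *c)
{
	pthread_mutex_lock(&c->ops_exist_lock);
	c->ops = NULL;
	pthread_mutex_unlock(&c->ops_exist_lock);
}
```

After demo_unregister() returns, no caller can still be inside a callback,
and later calls to demo_read() fail with -ENODEV instead of dereferencing
a stale pointer.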

I don't believe we can remove this protection (or can we?), but perhaps
we can merge the three mutex locks (n_events_list_lock, events_lock, and
ops_exist_lock) into a single "counter_lock" that handles all mutex
locking for this structure.


The different mutexes protect individual parts of the counter struct
rather than the struct as a whole (a linked list, kfifo reads, and
callback ops), so I think the code is clearer with an individual mutex
for each part than with one global mutex covering unrelated actions.
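
As a rough userspace illustration of that layout (a sketch only: pthread
mutexes stand in for kernel mutexes, plain integers stand in for the
linked list and the kfifo, and all split_counter names are hypothetical),
each accessor takes only the lock for the part it touches:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical miniature of a device struct with one lock per concern. */
struct split_counter {
	pthread_mutex_t n_events_list_lock;	/* guards the list only */
	int n_list_entries;			/* stand-in for the linked list */

	pthread_mutex_t events_lock;		/* guards queue reads only */
	int n_queued;				/* stand-in for the kfifo */
};

void split_counter_init(struct split_counter *c)
{
	assert(c != NULL);
	pthread_mutex_init(&c->n_events_list_lock, NULL);
	pthread_mutex_init(&c->events_lock, NULL);
	c->n_list_entries = 0;
	c->n_queued = 0;
}

/* Adding a list entry touches only the list, so only its lock is taken. */
void split_counter_add_entry(struct split_counter *c)
{
	pthread_mutex_lock(&c->n_events_list_lock);
	c->n_list_entries++;
	pthread_mutex_unlock(&c->n_events_list_lock);
}

/* Draining the queue takes only events_lock; returns the number drained. */
int split_counter_drain(struct split_counter *c)
{
	int n;

	pthread_mutex_lock(&c->events_lock);
	n = c->n_queued;
	c->n_queued = 0;
	pthread_mutex_unlock(&c->events_lock);

	return n;
}
```

With this split, a reader draining events never contends with a thread
that is editing the list, and the lock name at each call site says which
data is being protected.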