Re: [PATCH] rxrpc_find_service_conn_rcu: use read_seqbegin() rather than read_seqbegin_or_lock()

From: David Howells
Date: Wed Nov 01 2023 - 17:23:33 EST


Oleg Nesterov <oleg@xxxxxxxxxx> wrote:

> > 1 0 need_seqretry() [seq=even; sequence!=seq: retry]
>
> Yes, if CPU_1 races with write_seqlock() need_seqretry() returns true,
>
> > 1 1 read_seqbegin_or_lock() [exclusive]
>
> No. "seq" is still even, so read_seqbegin_or_lock() won't do read_seqlock_excl(),
> it will do

Yeah, you're right. I missed the fact that I got in the second example that
read_seqbegin_or_lock() spins until it sees a positive seq.

However, I think just changing all of these to always-lockless isn't
necessarily the most optimal way. Yes, it will work... eventually. But the
point is to limit the number of iterations.

So I have the following:

(1) rxrpc_find_service_conn_rcu().

I want to move the first part of the reaper to the I/O thread at some
point, then the locking here can go away entirely. However, this is
drivable by external events, so I would prefer to limit the number of
passes to just two and take a lock on the second pass. Holding up the
reaper thread for the moment is fine; holding up the I/O thread is not.

(2) afs_lookup_volume_rcu().

There can be a lot of volumes known by a system. A thousand would
require a 10-step walk and this is drivable by remote operation, so I
think this should probably take a lock on the second pass too.

(3) afs_check_validity().
(4) afs_get_attr().

These are both pretty short, so your solution is probably good for them.
That said, afs_vnode_commit_status() can spend a long time under the
write lock - and pretty much every file RPC op returns a status update.

(5) afs_find_server().

There could be a lot of servers in the list and each server can have
multiple addresses, so I think this would be better with an exclusive
second pass.

The server list isn't likely to change all that often, but when it does
change, there's a good chance several servers are going to be
added/removed one after the other. Further, this is only going to be
used for incoming cache management/callback requests from the server,
which hopefully aren't going to happen too often - but it is remotely
drivable.

(6) afs_find_server_by_uuid().

Similarly to (5), there could be a lot of servers to search through, but
they are in a tree not a flat list, so it should be faster to process.
Again, it's not likely to change that often and, again, when it does
change it's likely to involve multiple changes. This can be driven
remotely by an incoming cache management request but is mostly going to
be driven by setting up or reconfiguring a volume's server list -
something that also isn't likely to happen often.

I wonder if struct seqlock would make more sense with an rwlock rather than a
spinlock. As it is, it does an exclusive spinlock for the readpath which is
kind of overkill.

David