Re: Read/write locks

Colin Plumb (colin@nyx.net)
Sun, 21 Sep 1997 02:23:37 -0600 (MDT)


Just wondering... whose words are quoted with ":"? I haven't
seen tiwr writing directly...

> Almost all of the sleeping rwlock implementations I've ever seen allow
> any reader to make upgrade attempts to become a writer, and also
> conversely they allow any writer to downgrade into a reader.

> Code sequences using these look perhaps something like:

> sleep_read_lock(&list->rwlock);
> obj = futz_around(list);
> if(obj == it) {
> sleep_upgrade_lock(&list->rwlock);
> remove(list, obj);
> sleep_write_unlock(&list->rwlock);
> return obj;
> }
> sleep_read_unlock(&list->rwlock);
> return NULL;

> or whatever... So the claims of the upgrade facility are:

This absolutely DOES NOT WORK. This is, in fact, the canonical example
of why you need the upgrader state.

Suppose two peole enter the above section simultaneously. Both futz
around and find the same object, then try to upgrade their locks.
Neither will succeed, because the other has a read lock which conflicts.
The only possibility is that sleep_upgrade_lock temporarily drops
the read lock, which allows the other writer to modify the list while
the read lock is held, which invalidates the precondition required for
remove() that futz_around() ensured.

> 3) Upgrader facility allows you to verify prerequisites for an op,
> then upgrade to write and do the operation.

> : Again, same is true in my scheme.

This is the part that is *false*. If it's just an ordinary read lock,
then you can't upgrade to a write lock without
a) Dropping the lock temporarily, invalidating the prerequisites, or
b) deadlocking

4) An upgrader conflicts with writers and other upgraders, but not
with read locks.

: When a reader expresses desire for an upgrade, disallow any
: new readers to enter, so that existing readers can pile out
: and let the upgrader in. This is sort of the heuristic used
: when a writer walks in, hold out new readers.

Um, but what do you do when one of those readers, instead of leaving,
*also* expresses a desire to upgrade?

> Adding this special "upgrader" state, and the new locking that goes
> with it, is unnecessary and in fact bloat if you think about the
> approach I am presenting.

You don't have to use it. As I showed in code, it just consists
of splitting the current write_lock() code into two parts and
possibly executing them at different points in the code. If you
do both parts back-to-back, you get the current write_lock code
with no extra bloat at all. (In fact, mine, by not letting
readers in while a writer is waiting, is slightly smaller.)

> Consider the case where multiple code streams want to become readers
> with the "intension of writing" (updaters), but all of them will find
> what they want to update, not even there, and just drop the lock and
> leave.

You can do it optimisitcally, as the Linux kernel does already,
finding the object, acquiring the lock (or disabling interrupts),
verifying that the object is still there (very important - you'll
find this in the kernel lots of places), and then doing the
operation. This is an optimistic locking strategy, and not bad,
but you do have to reverify.

> In the upgrader approach, here everyone would be stymied and walk into
> the critical section in lock-step. In my suggested scheme they'd all
> walk in in parallel and be on with it.

This is useful only if it's expected that most people would find out that
they don't want to modify. If that is the case, it may pay to try
it once without the possibility of modification, and if that's not possible,
try again with an upgrade lock.

I.e.
read_lock(obj);
okay = obj_is_okay();
read_unlock(obj);
if (!okay) {
update_lock(obj);
if (!obj_is_okay()) { /* Reverify */
upgrade_to_write_lock(obj);
make_obj_okay(obj);
}
write_unlock(obj);
}

> Another issue, if a lock is mostly used for "find and remove", "find
> and update that" type activities (and this is what upgrading can be
> used for, however implemented), then you might just want a sleeping
> spinlock in there anyways. It's cheaper and much easier to verify.

Yes, indeed, this is the lock granularity question.
(For those unfamiliar with the problem, finer-grain locks allow more
parallelism, which is faster, but they also add more lock overhead,
which is slower. Deciding when to split locks is a big issue.
As Linux 2.0 shows, you can do pretty well with just one big lock
aroudn the whole kernel if you don't have many processors.)

> However for lookup entensive objects (buffer cache, page cache,
> dcache, etc.) rwlocks are definately what you want, because these are
> the cases where you want all the parallelism. (Just from casual scans
> of header files of various commercial unixes, and TPC benchmark
> results, I will tell you that it seems that an invariant
> characteristic of those whole perform incredibly on SMP TPC runs use a
> seperate rwlock for every hash chain in their buffer cache....)

Somebody should probably add some contention profiling to the kernel
locks, seeing how ofthen there's a conflict and how long it lasts.

-- 
	-Colin