Re: [rfc][patch] remove racy sync_page?

From: Linus Torvalds
Date: Wed May 31 2006 - 12:11:40 EST




On Wed, 31 May 2006, Linus Torvalds wrote:
>
> In other words, we actually want to unplug at lock_page(), BUT ONLY IF IT
> NEEDS OT WAIT, which is by no means all of them. So it's more than just
> "add an unplug at lock_page()" it's literally "add an unplug at any
> lock-page that blocks".

Btw, don't get me wrong. In the read case, every time we do a lock_page(),
we might as well unplug unconditionally, because we effectively know we're
going to wait (we just checked that it's not up-to-date). Sure, there's a
race window, but we don't care - the window is very small, and in the end,
unplugging isn't a correctness issue as long as you do it at least as
often as required (ie unplugging too much is ok and at worst just makes
for bad performance - so a very unlikely race that causes _extra_
unplugging is fine as long as it's unlikely. forgetting to unplug is bad).

In the write case, lock_page() may or may not need an unplug. In those
cases, it needs unplugging if it was locked before, but obviously not if
it wasn't.

In the "random case" where we use the page lock not because we want to
start IO on it, but because we need to make sure that page->mapping
doesn't change, we don't really care about the IO, but we do need to
unplug just to make sure that the IO will complete.

And I suspect your objection to unplugging is not really about unplugging
itself. It's literally about the fact that we use the same page lock for
IO and for the ->mapping thing, isn't it?

IOW, you don't actually dislike plugging itself, you dislike it due to the
effects of a totally unrelated locking issue, namely that we use the same
lock for two totally independent things. If the ->mapping thing were to
use a PG_map_lock that didn't affect plugging one way or the other, you
wouldn't have any issues with unplugging, would you?

And I think _that_ is what really gets us to the problem.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/