Re: The INN/mmap bug

From: Alexander Viro (viro@math.psu.edu)
Date: Mon Sep 18 2000 - 15:51:19 EST


On Mon, 18 Sep 2000, Linus Torvalds wrote:

> No, but that's just because you _think_ about the problem the wrong way
> around.
>
> If you think about it like the above, it's not a "stage" at all. It's just
> part of getting the buffer to be up-to-date.

Umm... OK. Let me put it that way:
        * uptodate pages should never become non-uptodate.
        * we do multiple read requests on the same buffer.
        * the only thing saving us from data loss is that non-uptodate
pages never lose buffers when they have data newer than on disk (uptodate
pages do, but they never have data _older_ than on disk). Thus we can
afford rereading a buffer on non-uptodate page if bh was lost and never
have to reread it on uptodate page.
        * look at the explanation above and see if it looks brittle. It
really needs to be documented - deducing it from the code takes some time
and code responsible for that is spread over many functions. If you have
shorter/simpler version - I'll be glad to see it. Especially if it doesn't
start with "we'll consider the cases of uptodate and non-uptodate pages
separately".

        That's what makes me unhappy about the current situation + obvious
fixes. It works, but the proof is... well, not pretty. One of simpler
variants would be
        * every buffer is read at most once (bitmap in struct page takes
care of that).
        * no buffer is read at all once the page becomes uptodate.
        * page doesn't become mmapable until it gets uptodate.
but if you have simple explanation why the current scheme (+ obvious
patch) works correctly - I'll happily shut up. Just put it in mm.h or
buffer.c, OK?

> Very logical, very simple, no "fscking mess" at all.
>
> That's my argument. You're trying to make it a problem. It's not. It's
> something new with the page cache, yes - the fact that the page cache
> drives the logic means that the buffer "uptodate" logic is different, but
> if you think about it some, you'll realize that that's actually just a
> natural outgrowth of the fact that the buffer cache is slaved to the page
> cache these days.
>
> It's not a "stage 2". It's a basic fact of life - we don't care _how_ the
> page got to be up-to-date, because for all we know there are other ways to
> get it to be up-to-date than your "stage 1".
>
> For example, a recvfile() implementation could mark the page up-to-date by
> virtue of getting the information off the network. No "stage 1" or "stage
> 2" there at all: the make_buffer_uptodate() logic does _not_ mean that we
> lost the buffers from "stage 1", it might equally well mean that we got
> the page up-to-date through some other means.
>
> Just change how you think about the problem, and suddenly it's not a mess,
> it's a design.

That's fine, but I would still like to see a simple proof of correctness
that would not fork into these two branches (page uptodate and not
uptodate resp.) in the very beginning.

        That's where "stage 1" and "stage 2" came from (word 'stage' was
used simply because I've spent some time trying to convince myself that
generic_file_write() was OK until I've realized that SetPageUptodate()
_must_ be irrevertible and there is no way around that)

        Linus, I believe that it works. No problem with that. I'm
just uncomfortable with the general look of the proof. That's it. I would
certainly prefer to avoid rereading the blocks at all, but that's nowhere
near a serious performance problem. My only beef with the current code is
in the look of mechanism responsible for avoiding the data loss on
read requests.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Sep 23 2000 - 21:00:18 EST