Re: The INN/mmap bug

From: Alexander Viro (viro@math.psu.edu)
Date: Tue Sep 19 2000 - 09:11:51 EST


On Tue, 19 Sep 2000, Daniel Phillips wrote:

> The more I think about it the less clear and ambiguous I find it.
> When you add the dirty bit into the pot you get:
>
> Mapped, Uptodate, Dirty: not possible

Sure, it is possible - that's how the write happens

> !Mapped, Uptodate, Dirty: not possible
> Mapped, !Uptodate, Dirty: pending write

s/pending write/obvious bug/, damnit. Daniel, just think for a second: we
have the buffer read to be picked by bdflush and written to disk, while
the _contents_ _is_ _not_ _uptodate_. Just what can you expect when write
succeeds? Junk on disk, right? You know, GIGO queue - garbage in, garbage
out...

> !Mapped, !Uptodate, Dirty: pending map and write

Wrong. It's an instant BUG at line 711 in ll_rw_blk.c - remember these
reports?

> Mapped, Uptodate, !Dirty: regular block
> !Mapped, Uptodate, !Dirty: hole of zeroes
> Mapped, !Uptodate, !Dirty: unread
> !Mapped, !Uptodate, !Dirty: pending map
 
> Those two 'not possible' states bother me, they show that the three
> bits are not orthogonal. We can make arbitrary assignments of meaning
> to them as you did with !Mapped, !Uptodate (pending map) but in that
> case why not just enumerate all the states we need and lose the bit
> assignments? I think the deep problem here is trying to look at this
> as a bit-flipping problem instead of a state-transition problem.

Sure, the dirty bit is not orthogonal to the rest. You don't need to do
any complex analysis - it's as simple as
        * if I don't know _where_ to write the data - I'ld better not feed
the request to ll_rw_block(), or it may get PO'd
        * if I know that data is junk - I don't want it hitting the disk.

Dirty bit == request fed into the funnel and can be on disk any moment now.
Locked == already in IO subsystem.

That's it - completely independent from the rest, except that you don't
want the whole write mechanism applied to non-uptodate or non-mapped
pieces.

Think about IO as a memory bus - cache controller deals with writing the
cache line to RAM, but you don't want it to try that on lines that don't
have PA already calculated or have invalid contents. "mark dirty" == point
the write-behind mechanism to it and let it decide when the thing must be
written. Think how to make the cache indexed by virtual address (not by
physical, as in case of x86) work correctly. That's what pagecache is -
software MMU with cache-by-VA architecture.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Sep 23 2000 - 21:00:20 EST