Re: [Kiobuf-io-devel] RFC: Kernel mechanism: Compound event wait /notify + callback chains

From: Christoph Hellwig (
Date: Thu Feb 01 2001 - 12:49:50 EST

On Thu, Feb 01, 2001 at 05:34:49PM +0000, Alan Cox wrote:
> > > I'm in the middle of some parts of it, and am actively soliciting
> > > feedback on what cleanups are required.
> >
> > The real issue is that Linus dislikes the current kiobuf scheme.
> > I do not like everything he proposes, but lots of things makes sense.
> Linus basically designed the original kiobuf scheme of course so I guess
> he's allowed to dislike it. Linus disliking something however doesn't mean
> its wrong. Its not a technically valid basis for argument.

Sure. But Linus saing that he doesn't want more of that (shit, crap,
I don't rember what he said exactly) in the kernel is a very good reason
for thinking a little more aboyt it.

Espescially if most arguments look right to one after thinking more about

> Linus list of reasons like the amount of state are more interesting

True. The arument that they are to heaviweight also.
That they should allow scatter gather without an array of structs also.

> > > So, what are the benefits in the disk IO stack of adding length/offset
> > > pairs to each page of the kiobuf?
> >
> > I don't see any real advantage for disk IO. The real advantage is that
> > we can have a generic structure that is also usefull in e.g. networking
> > and can lead to a unified IO buffering scheme (a little like IO-Lite).
> Networking wants something lighter rather than heavier.

Right. That's what the new design was about, besides adding a offset and
length to every page instead of the page array, something also wanted by
the networking in the first place.
Look at the skb_frag struct in the zero-copy patch for what networking
thinks it needs for physical page based buffers.

> Adding tons of base/limit pairs to kiobufs makes it worse not better

>From looking at the networking code and listening to Dave and Ingo it looks
like it makes the thing better for networking, although I can not verify
this due to the lack of familarity with the networking code.

For disk I/O it makes the handling a little easier for the cost of the
additional offset/length fields.


P.S. the tuple things is also what Larry had in his inital slice paper.

Of course it doesn't work. We've performed a software upgrade.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
Please read the FAQ at

This archive was generated by hypermail 2b29 : Wed Feb 07 2001 - 21:00:13 EST