Re: bio_chain: proposed solution for bio_alloc failure and large IO simplification

From: Andrew Morton
Date: Fri Jun 14 2002 - 18:38:32 EST

William Lee Irwin III wrote:
> On Fri, Jun 14, 2002 at 04:00:52PM -0700, Andrew Morton wrote:
> > Everything is pretty much in place to do this now. The main piece
> > which is missing is the gang page allocator (Hi, Bill).
> > It'll be damn fast, and nicely scalable. It's all about reducing the
> > L1 cache footprint. Making best use of data when it is in cache.
> > Making best use of locks once they have been acquired. If it is
> > done right, it'll be almost as fast as 64k PAGE_CACHE_SIZE, with
> > none of its disadvantages.
> > In this context, bio_chain() is a regression, because we're back
> > into doing stuff once-per-page, and longer per-page call graphs.
> > I'd rather not have to do it if it can be avoided.
> gang_cpu is not quite ready to post, but work is happening on it
> and it's happening today -- I have a suitable target in hand and
> am preparing it for testing. The bits written thus far consist of
> a transparent per-cpu pool layer refilled using the gang transfer
> mechanism, and I'm in the process of refining that to non-prototypical
> code and extending it with appropriate deadlock avoidance so explicit
> gang allocation requests can be satisfied.

Great, thanks.

Performing gang allocation within generic_file_write may not
be practical, especially if the application is being good and
is issuing 8k writes. So there will still be pressure on the
single-page allocator.

Certainly, reads can perform gang allocation.

Which tends to point us in the direction of using the lockless
per-cpu page allocation for writes, and explicit gang allocation
for reads. So possibly, gang allocation should go straight to
the main page list and not drain the per-cpu pools. Leave them
reserved for the single-page allocators - write(2) and anon pages.
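
A rough userspace sketch of that split, purely to illustrate the idea. Nothing here is kernel code: main_list, cpu_pool, POOL_BATCH, alloc_single() and alloc_gang() are made-up names, pages are just counters, and the "lock" is only a tally of how often the global list would have been locked. Single-page requests are served from the per-cpu pool, which refills in batches; gang requests go straight to the main list and leave the pool alone.

```c
#include <assert.h>

#define MAIN_PAGES  64  /* size of the toy global free list (assumption) */
#define POOL_BATCH   8  /* pages moved per per-cpu refill (assumption) */

/* Hypothetical stand-in for the main page list. */
struct main_list {
    int free;           /* pages left on the global list */
    int lock_acquires;  /* times the global lock would have been taken */
};

/* Hypothetical stand-in for one CPU's private pool. */
struct cpu_pool {
    int free;           /* pages cached on this CPU, no lock needed */
};

/* Take up to n pages from the main list under one lock acquisition. */
static int main_list_take(struct main_list *ml, int n)
{
    ml->lock_acquires++;
    if (n > ml->free)
        n = ml->free;
    ml->free -= n;
    return n;
}

/*
 * Single-page path (write(2), anon pages): served from the per-cpu
 * pool, which is refilled in batches when it runs dry.
 * Returns 1 on success, 0 if no page is available.
 */
static int alloc_single(struct main_list *ml, struct cpu_pool *pool)
{
    if (pool->free == 0)
        pool->free = main_list_take(ml, POOL_BATCH);
    if (pool->free == 0)
        return 0;
    pool->free--;
    return 1;
}

/*
 * Gang path (reads): go straight to the main list and do not touch
 * the per-cpu pool. Returns the number of pages actually obtained.
 */
static int alloc_gang(struct main_list *ml, int n)
{
    return main_list_take(ml, n);
}
```

In this toy model, sixteen single-page allocations cost two trips to the main list, and a sixteen-page gang request costs one without draining whatever the per-cpu pool is holding for writers.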

But it's early days yet...

