Re: large page patch (fwd) (fwd)

From: Andrew Morton (
Date: Sun Aug 04 2002 - 14:23:16 EST

Linus Torvalds wrote:
> On Sun, 4 Aug 2002, Hubertus Franke wrote:
> >
> > As of the page coloring !
> > Can we tweak the buddy allocator to give us this additional functionality?
> I would really prefer to avoid this, and get "95% coloring" by just doing
> read-ahead with higher-order allocations instead of the current "loop
> allocation of one block".
> I bet that you will get _practically_ perfect coloring with just two small
> changes:
> - do_anonymous_page() looks to see if the page tables are empty around
> the faulting address (and check vma ranges too, of course), and
> optimistically does a non-blocking order-X allocation.
> If the order-X allocation fails, we're likely low on memory (this is
> _especially_ true since the very fact that we do lots of order-X
> allocations will probably actually help keep fragementation down
> normally), and we just allocate one page (with a regular GFP_USER this
> time).
> Map in all pages.

This would be a problem for short-lived processes. Because "map in
all pages" also means "zero them out". And I think that performing
a 4k clear_user_highpage() immediately before returning to userspace
is optimal. It's essentialy a cache preload for userspace.

If we instead clear out 4 or 8 pages, we trash a ton of cache and
the chances of userspace _using_ pages 1-7 in the short-term are
lower. We could clear the pages with 7,6,5,4,3,2,1,0 ordering,
but the cache implications of faultahead are still there.

Could we establish the eight pte's but still arrange for pages 1-7
to trap, so the kernel can zero the out at the latest possible time?

> - do the same for page_cache_readahead() (this, btw, is where radix trees
> will kick some serious ass - we'd have had a hard time doing the "is
> this range of order-X pages populated" efficiently with the old hashes.

On the nopage path, yes. That memory is cache-cold anyway.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

This archive was generated by hypermail 2b29 : Wed Aug 07 2002 - 22:00:25 EST