Re: kmalloc optimization

From: Andi Kleen (
Date: Sun Aug 27 2000 - 11:18:16 EST

On Sun, Aug 27, 2000 at 05:08:37PM +0100, Mark Hemment wrote:
> It is possible to add the necessary data capture abilities to
> kmalloc() and friends, with a build time option. This would allow users
> to gather the stats for their own work loads.

surprise ;)

> My latest version of the Slab allocator (still haven't got round to
> finishing the thing off), allows new sizes to be added dynamically without
> needing to have a lock guarding the linkages between the different general
> cache size nodes (gnodes).

I implemented it in the past and tried it for networking (tuning slab sizes
to MTUs, some remnants are still skbuff.c). I just used a stop the world
scheme, so no locking was needed. It unfortunately performed badly because
of the large page orders needed so I dropped it.

> The standard (power-of-2) general caches' gnodes are from contigious
> memory, so they can be indexed into after a find-highes-set-bit operation.

It is just not done, and I think it is not worth it.

> Also, slabs can be shared between caches where the objects are of
> similar size. This allows a specific cache to also be exported as a
> general cache (still giving correct object usage counts for each sharing
> cache). The only disadvantage there is increased lock contention inside
> my slab_info structures, but heavily used caches can insure they don't
> share via a SLAB_PERFORMANCE creation flag.

Wouldn't that destroy one of the most important advantages of a zone allocator?
(clustering objects with similar livetime to avoid fragmentation)
> > Also the most heavy users are probably better converted to direct calls
> > of kmem_cache_alloc (looking at my /proc/slabinfo there must be some heavy
> > user who doesn't do that for a size <=32bytes)
> Many of the remaining, general size, allocations are not performance
> critical. Of the ones which are, networking (the data buffers for
> skbufs) are by far the most important one to tackle.

I'm worrying less about performance than about memory usage and fragmentation
(slab pages tied although they only have a few active objects left)

> > 1500bytes slab unfortunately is not too useful, because it does not fit well
> > in 4K pages (you would need 8K or 16K page allocations, which the mm system
> > does not like much due to fragmentation)
> Doesn't matter (too much).

On 32MB machines it was a big problem in my experiments.


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
Please read the FAQ at

This archive was generated by hypermail 2b29 : Thu Aug 31 2000 - 21:00:19 EST