Re: [PATCH 0/5] make slab gfp fair

From: Peter Zijlstra
Date: Mon May 21 2007 - 16:00:37 EST


On Mon, 2007-05-21 at 09:45 -0700, Christoph Lameter wrote:
> On Sun, 20 May 2007, Peter Zijlstra wrote:
>
> > I care about kernel allocations only. In particular about those that
> > have PF_MEMALLOC semantics.
>
> Hmmmm.. I wish I was more familiar with PF_MEMALLOC. ccing Nick.
>
> > - set page->reserve nonzero for each page allocated with
> > ALLOC_NO_WATERMARKS; which by the previous point implies that all
> > available zones are below ALLOC_MIN|ALLOC_HIGH|ALLOC_HARDER
>
> Ok that adds a new field to the page struct. I suggested a page flag in
> slub before.

No, it doesn't; it overloads page->index. It's just used as an extra
return value and need not be persistent. Definitely not worth a page flag.
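
Concretely, something like this (a sketch only, not the actual patch;
allocate_slab() stands in for whatever the slab layer uses to get its
pages):

	/*
	 * Page allocator side: after an allocation that ignored the
	 * watermarks succeeds, tag the page.  ->reserve overloads
	 * ->index, so it is only valid as an extra return value and
	 * gets clobbered once the slab starts using ->index itself.
	 */
	page = get_page_from_freelist(gfp_mask, order, zonelist,
				      ALLOC_NO_WATERMARKS);
	if (page)
		page->reserve = 1;

	/*
	 * Slab side: read the extra return value right away, before
	 * ->index is reused for anything else.
	 */
	page = allocate_slab(s, flags, node);
	if (page && page->reserve)
		reserved = 1;	/* this slab came out of the reserves */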

> > - when a page->reserve slab is allocated store it in s->reserve_slab
> > and do not update the ->cpu_slab[] (this forces subsequent allocs to
> > retry the allocation).
>
> Right that should work.
>
> > All ALLOC_NO_WATERMARKS enabled slab allocations are served from
> > ->reserve_slab, up until the point where a !page->reserve slab alloc
> > succeeds, at which point the ->reserve_slab is pushed into the partial
> > lists and ->reserve_slab set to NULL.
>
> So the original issue is still not fixed. A slab alloc may succeed without
> watermarks if that particular allocation is restricted to a different set
> of nodes. Then the reserve slab is dropped despite the memory scarcity on
> another set of nodes?

I can't see how. This extra ALLOC_MIN|ALLOC_HIGH|ALLOC_HARDER alloc will
first deplete all other zones. Once that starts failing, no node should
still have pages accessible to any allocation context other than
PF_MEMALLOC.
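
For reference, the ->reserve_slab handling I have in mind looks roughly
like this (a sketch only; the helper names are illustrative, not the
actual SLUB functions, and locking plus the fast path are omitted):

	static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node)
	{
		struct page *page = new_slab(s, gfpflags, node);

		if (!page)
			return NULL;

		if (page->reserve) {
			/*
			 * The slab came from the emergency reserves: don't
			 * make it the cpu slab, park it in s->reserve_slab
			 * so that only ALLOC_NO_WATERMARKS (PF_MEMALLOC)
			 * contexts are served from it and everybody else
			 * keeps retrying the page allocator.
			 */
			s->reserve_slab = page;
			return alloc_from_reserve(s, page);
		}

		/*
		 * A watermark-obeying allocation succeeded again, so the
		 * memory pressure is over: push the reserve slab onto the
		 * partial list and forget about it.
		 */
		if (s->reserve_slab) {
			add_partial(s, s->reserve_slab);
			s->reserve_slab = NULL;
		}

		return install_cpu_slab(s, page);	/* then allocate from it */
	}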

> > Since only the allocation of a new slab uses the gfp zone flags, and
> > other allocations placement hints they have to be uniform over all slab
> > allocs for a given kmem_cache. Thus the s->reserve_slab/page->reserve
> > status is kmem_cache wide.
>
> No the gfp zone flags are not uniform and placement of page allocator
> allocs through SLUB do not always have the same allocation constraints.

They have to be; an allocation can be served from a pre-existing slab,
hence any page allocation must be valid for all other users of that
kmem_cache.

> SLUB will check the node of the page that was allocated when the page
> allocator returns and put the page into that nodes slab list. This varies
> depending on the allocation context.

Yes, it keeps slabs on per node lists. I'm just not seeing how this puts
hard constraints on the allocations.

As far as I can see there cannot be a hard constraint here, because
allocations from interrupt context are at best node-local, and node-affine
zonelists still contain all zones, just ordered by locality.

> Allocations can be particular to uses of a slab in particular situations.
> A kmalloc cache can be used to allocate from various sets of nodes in
> different circumstances. kmalloc will allow serving a limited number of
> objects from the wrong nodes for performance reasons but the next
> allocation from the page allocator (or from the partial lists) will occur
> using the current set of allowed nodes in order to ensure a rough
> obedience to the memory policies and cpusets. kmalloc_node behaves
> differently and will enforce using memory from a particular node.

From what I can see, it takes pretty much any page it can get once you
hit it with PF_MEMALLOC. If the page allocation doesn't use ALLOC_CPUSET,
the page can come from pretty much anywhere.
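
For the record, the slow path does roughly this (paraphrased from
mm/page_alloc.c of that era, not verbatim); note there is no ALLOC_CPUSET
anywhere near it:

	if (((p->flags & PF_MEMALLOC) || unlikely(test_thread_flag(TIF_MEMDIE)))
			&& !in_interrupt()) {
		if (!(gfp_mask & __GFP_NOMEMALLOC)) {
			/* walk the zonelist again, ignoring watermarks
			 * and cpuset constraints */
			page = get_page_from_freelist(gfp_mask, order,
					zonelist, ALLOC_NO_WATERMARKS);
			if (page)
				goto got_pg;
		}
	}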

