Re: [PATCH 00/10] foundations for reserve-based allocation

From: Daniel Phillips
Date: Mon Aug 06 2007 - 13:35:40 EST


On Monday 06 August 2007 03:29, Peter Zijlstra wrote:
> In the interest of getting swap over network working and posting in
> smaller series, here is the first series.
>
> This series lays the foundations needed to do reserve based
> allocation. Traditionally we have used mempools (and others like
> radix_tree_preload) to handle the problem.
>
> However this does not fit the network stack. It is built around
> variable sized allocations using kmalloc().
>
> This calls for a different approach.
>
> We want a guarantee for N bytes from kmalloc(), this translates to a
> demand on the slab allocator for 2*N+m (due to the power-of-two
> nature of kmalloc slabs), where m is the meta-data needed by the
> allocator itself.

Where does the 2* come from? Isn't it exp2(ceil(log2(N + m)))?
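
As a rough userspace sketch of that rounding (not kernel code): the helper
names roundup_pow2() and slab_demand() are made up for illustration, and the
sketch assumes strictly power-of-two size classes with the metadata m counted
inside the object. For x >= 1 the rounded value is always less than 2*x, so
2*(N + m) is an upper bound rather than the demand itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next power of two, i.e. exp2(ceil(log2(x))) for x >= 1. */
static size_t roundup_pow2(size_t x)
{
	size_t p = 1;

	/* Reject 0 and values whose rounding would overflow size_t. */
	assert(x >= 1 && x <= SIZE_MAX / 2 + 1);
	while (p < x)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper: bytes demanded from a power-of-two size class
 * for an n-byte request carrying m bytes of allocator metadata.
 */
static size_t slab_demand(size_t n, size_t m)
{
	return roundup_pow2(n + m);
}
```

For example, slab_demand(100, 8) gives 128, well short of 2*100 + 8.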

> The slab allocator then puts a demand of P pages on the page
> allocator.
>
> So we need functions translating our demanded kmalloc space into a
> page reserve limit, and then need to provide a reserve of pages.
>
> And we need to ensure that once we hit the reserve, the slab
> allocator honours the reserve's access. That is, a regular allocation
> may not get objects from a slab allocated from the reserves.

Patch [3/10] adds a new field to struct page. I do not think this is
necessary. Allocating a page from reserve does not make it special.
All we care about is that the total number of pages taken out of
reserve is balanced by the total pages freed by a user of the reserve.
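
The balance described above could be tracked with a single counter per
reserve user. This is only a sketch under assumptions: struct reserve_account,
reserve_charge() and reserve_uncharge() are hypothetical names, not anything
from the patch set, and locking is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-user accounting; not a structure from the patch set. */
struct reserve_account {
	unsigned long limit;	/* pages this user may hold from the reserve */
	unsigned long in_use;	/* pages taken and not yet freed */
};

/* Charge pages against the user's limit; refuse if it would be exceeded. */
static bool reserve_charge(struct reserve_account *acct, unsigned long pages)
{
	/* in_use never exceeds limit, so the subtraction cannot wrap. */
	if (pages > acct->limit - acct->in_use)
		return false;
	acct->in_use += pages;
	return true;
}

/* Return pages to the reserve; the user cannot free more than it took. */
static void reserve_uncharge(struct reserve_account *acct, unsigned long pages)
{
	assert(pages <= acct->in_use);
	acct->in_use -= pages;
}
```

Nothing here records which pages came from the reserve, only how many are
outstanding, which is the point being made.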

We do care about slab fragmentation in the sense that a slab page may be
pinned in the slab by an unprivileged allocation and so that page may
never be returned to the global page reserve. One way to solve this is
to have a per-slab-page flag indicating the page came from reserve, and
prevent mixing of privileged and unprivileged allocations on such a
page.
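
A minimal sketch of that check, in userspace C with invented names
(struct slab_page, from_reserve and may_alloc_from() are assumptions for
illustration, not fields or functions from the patches):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-slab-page state. */
struct slab_page {
	bool from_reserve;		/* page was taken from the reserve */
	unsigned int free_objects;	/* objects still available on it */
};

/*
 * Decide whether an allocation may be served from this slab page.
 * Unprivileged allocations are kept off reserve pages so they cannot
 * pin such a page; privileged allocations may use any page.
 */
static bool may_alloc_from(const struct slab_page *sp, bool privileged)
{
	assert(sp != NULL);
	if (sp->free_objects == 0)
		return false;
	return privileged || !sp->from_reserve;
}
```

With this rule a reserve page only ever holds privileged objects, so it goes
back to the page allocator once the reserve user frees them.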

> There is already a page reserve, but it does not fully comply with
> our needs. For example, it does not guarantee a strict level (due to
> the relative nature of ALLOC_HIGH and ALLOC_HARDER). Hence we augment
> this reserve with a strict limit.
>
> Furthermore a new __GFP flag is added to allow easy access to the
> reserves alongside the existing PF_MEMALLOC.
>
> Users of this infrastructure will need to do the necessary bean
> counting to ensure they stay within the requested limits.

This patch set is _way_ less intimidating than its predecessor.
However, I see we have entered the era of sets of patch sets, since it
is impossible to understand the need for this allocation infrastructure
without reading the dependent network patch set. Waiting with
breathless anticipation.

Regards,

Daniel