Re: [PATCH 09/10] percpu: replace area map allocator with bitmap allocator

From: Dennis Zhou
Date: Wed Jul 19 2017 - 18:13:36 EST


Hi Josef,

Thanks for taking a look at my code.

On Wed, Jul 19, 2017 at 07:16:35PM +0000, Josef Bacik wrote:
>
> Actually I decided I do want to complain about this. Have you considered making
> chunks statically sized, like slab does? We could avoid this whole bound_map
> thing completely and save quite a few cycles trying to figure out how big our
> allocation was. Thanks,

I did consider something along the lines of a slab allocator, but
ultimately utilization and fragmentation were why I decided against it.

Percpu memory is handled by giving each cpu its own copy of the object
to use. This means cpus can avoid cache coherence traffic when accessing
and manipulating the object. To do this, the percpu allocator creates
chunks to serve each allocation out of. Because each cpu has its own
copy, every chunk kept lying around (and percpu memory in general) costs
memory multiplied by the number of cpus.
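
As a rough userspace sketch of that layout (this is not the actual
percpu code; every name and constant below is made up for illustration,
and real chunks are not required to lay units out contiguously), an
allocation is just an offset into a chunk, and each cpu finds its own
copy at that offset inside its own unit:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS_SKETCH   4     /* hypothetical number of possible cpus */
#define UNIT_SIZE_SKETCH 4096  /* hypothetical size of one cpu's unit */

/* One chunk: NR_CPUS_SKETCH units, one unit per cpu (assumed contiguous). */
struct chunk_sketch {
	unsigned char *base;
};

/* Allocate zeroed backing memory for every cpu's unit; 0 on success. */
int chunk_init_sketch(struct chunk_sketch *c)
{
	c->base = calloc(NR_CPUS_SKETCH, UNIT_SIZE_SKETCH);
	return c->base ? 0 : -1;
}

/* The same offset names a different copy of the object on each cpu. */
unsigned char *cpu_ptr_sketch(struct chunk_sketch *c, unsigned int cpu,
			      size_t off)
{
	assert(cpu < NR_CPUS_SKETCH && off < UNIT_SIZE_SKETCH);
	return c->base + (size_t)cpu * UNIT_SIZE_SKETCH + off;
}

/* Memory pinned by a single chunk, whether or not it is fully used. */
size_t chunk_footprint_sketch(void)
{
	return (size_t)NR_CPUS_SKETCH * UNIT_SIZE_SKETCH;
}
```

With these made-up numbers, one mostly empty chunk still pins
4 * 4096 bytes, which is why idle chunks are expensive.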

A slab allocator takes the liberty of caching frequently used sizes and
accepts internal fragmentation in exchange for performance.
Unfortunately, the percpu memory allocator does not necessarily know
what is going to get allocated. It would need to keep many slabs around
to serve each allocation, which can be quite expensive. In the worst
case, long-lived percpu allocations can keep entire slabs alive, as
there is no way to perform consolidation once addresses are given out.
Additionally, any internal fragmentation caused by ill-fitting objects
is amplified by the number of possible cpus.
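
To put a number on that last point, here is a back-of-the-envelope
sketch. It assumes power-of-two size classes starting at 8 bytes, which
is only an illustrative choice and not how slab or the percpu allocator
actually picks sizes; the function names are hypothetical too.

```c
#include <assert.h>
#include <stddef.h>

/* Round a request up to a hypothetical power-of-two size class (min 8). */
size_t size_class_sketch(size_t size)
{
	size_t cls = 8;

	assert(size > 0);
	while (cls < size)
		cls <<= 1;
	return cls;
}

/*
 * Bytes lost to internal fragmentation for one allocation: the per-copy
 * slack is paid once for every possible cpu.
 */
size_t total_waste_sketch(size_t size, unsigned int nr_possible_cpus)
{
	return (size_class_sketch(size) - size) * nr_possible_cpus;
}
```

Under those assumptions, a 40 byte object lands in the 64 byte class
and loses 24 bytes per copy, so with 64 possible cpus a single
allocation wastes 24 * 64 = 1536 bytes.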

Thanks,
Dennis