Re: [PATCH 10/12] mm, slub: remove percpu slabs with CONFIG_SLUB_TINY

From: Hyeonggon Yoo
Date: Tue Dec 13 2022 - 09:03:02 EST


On Tue, Dec 13, 2022 at 11:04:33AM +0800, Baoquan He wrote:
> On 12/12/22 at 05:11am, Dennis Zhou wrote:
> > Hello,
> >
> > On Mon, Dec 12, 2022 at 11:54:28AM +0100, Vlastimil Babka wrote:
> > > On 11/27/22 12:05, Hyeonggon Yoo wrote:
> > > > On Mon, Nov 21, 2022 at 06:12:00PM +0100, Vlastimil Babka wrote:
> > > >> SLUB gets most of its scalability by percpu slabs. However for
> > > >> CONFIG_SLUB_TINY the goal is minimal memory overhead, not scalability.
> > > >> Thus, #ifdef out the whole kmem_cache_cpu percpu structure and
> > > >> associated code. Additionally to the slab page savings, this reduces
> > > >> percpu allocator usage, and code size.
> > > >
> > > > [+Cc Dennis]
> > >
> > > +To: Baoquan also.
>
> Thanks for adding me.
>
> > >
> > > > Wondering if we can reduce (or zero) early reservation of percpu area
> > > > when #if !defined(CONFIG_SLUB) || defined(CONFIG_SLUB_TINY)?
> > >
> > > Good point. I've sent a PR as it was [1], but (if merged) we can still
> > > improve that during RC series, if it means more memory saved thanks to less
> > > percpu usage with CONFIG_SLUB_TINY.
> > >
> > > [1]
> > > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/tag/?h=slab-for-6.2-rc1
> >
> > The early reservation area not used at boot is then used to serve normal
> > percpu allocations. Percpu allocates additional chunks based on a free
> > page float count and is backed page by page, not all at once. I get
> > slabs is the main motivator of early reservation, but if there are other
> > users of percpu, then shrinking the early reservation area is a bit
> > moot.
>
> Agree. Before kmem_cache_init() is done, anyone calling alloc_percpu()
> can only get the allocation from the early reservation of the percpu area.
> So we can't shrink it unless we can make sure nobody needs to call
> alloc_percpu() before kmem_cache_init(), now and in the future.

Thank you both for the explanation.
I just googled and found a random /proc/meminfo output from a K210 board
(6MB RAM, dual-core).

Given that even the K210 board uses around 100kB of percpu area,
shrinking the early reservation might not be worth doing :(

https://gist.github.com/pdp7/0fd86d39e07ad7084f430c85a7a567f4?permalink_comment_id=3179983#gistcomment-3179983

> The only drawback of early reservation is that it's not so flexible. We can
> only dynamically create chunks to increase percpu areas when the early
> reservation has run out, but can't shrink the early reservation if the
> system doesn't need that much.
>
> So we may need to weigh the two ideas:
> - Not allowing alloc_percpu() before kmem_cache_init();
> - Keep early reservation, and think of an economical value for
> CONFIG_SLUB_TINY.
>
> start_kernel()
> ->setup_per_cpu_areas();
> ......
> ->mm_init();
> ......
> -->kmem_cache_init();
>
>
> __alloc_percpu()
> -->pcpu_alloc()
> --> succeed to allocate from early reservation
> or
> -->pcpu_create_chunk()
> -->pcpu_alloc_chunk()
> -->pcpu_mem_zalloc()
>
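
To make the ordering constraint in the call chain above concrete, here is a
rough userspace model in C. It is not kernel code: every name in it
(model_alloc_percpu(), model_create_chunk(), EARLY_RESERVE_BYTES, ...) and the
1024-byte size are made up for illustration. It ignores alignment and per-CPU
replication, and only assumes what the call chain says, namely that creating
a new chunk needs the slab allocator to be up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical size of the early reserved area; NOT the kernel's value. */
#define EARLY_RESERVE_BYTES 1024

unsigned char early_area[EARLY_RESERVE_BYTES];
size_t early_used;      /* bytes already handed out from early_area */
bool slab_ready;        /* models "kmem_cache_init() has completed" */
int chunks_created;     /* how many dynamic chunks the model created */

/* Models kmem_cache_init(): from now on new chunks can be created. */
void model_kmem_cache_init(void)
{
	slab_ready = true;
}

/*
 * Models pcpu_create_chunk(): chunk metadata comes from the slab
 * allocator, so this fails while the slab allocator is not ready.
 */
void *model_create_chunk(size_t size)
{
	if (!slab_ready)
		return NULL;
	chunks_created++;
	return calloc(1, size);
}

/*
 * Models __alloc_percpu(): serve the request from the early reservation
 * if it still fits, otherwise fall back to creating a new chunk.
 */
void *model_alloc_percpu(size_t size)
{
	assert(size > 0);

	if (size <= EARLY_RESERVE_BYTES - early_used) {
		void *p = &early_area[early_used];

		early_used += size;
		return p;
	}
	return model_create_chunk(size);
}
```

In this model, a request that no longer fits in the early area fails if it
comes before model_kmem_cache_init() and succeeds (via a new chunk) after it.
So a smaller EARLY_RESERVE_BYTES is only safe if all early callers together
still fit in it.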

--
Thanks,
Hyeonggon