Re: [PATCH v3 31/35] lib: add memory allocations report in show_mem()

From: Kent Overstreet
Date: Thu Feb 15 2024 - 18:53:16 EST


On Thu, Feb 15, 2024 at 06:07:42PM -0500, Steven Rostedt wrote:
> On Thu, 15 Feb 2024 15:33:30 -0500
> Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
>
> > > Well, I think without __GFP_NOWARN it will cause a warning and thus
> > > recursion into __show_mem(), potentially infinite? Which is of course
> > > trivial to fix, but I'd myself rather sacrifice a bit of memory to get
> > > this potentially very useful output, if I enabled the profiling. The
> > > necessary memory overhead of page_ext and slabobj_ext makes the
> > > printing buffer overhead negligible in comparison?
> >
> > __GFP_NOWARN is a good point, we should have that.
> >
> > But - and correct me if I'm wrong here - doesn't an OOM kick in well
> > before GFP_ATOMIC 4k allocations are failing? I'd expect the system to
> > be well and truly hosed at that point.
> >
> > If we want this report to be 100% reliable, then yes the preallocated
> > buffer makes sense - but I don't think 100% makes sense here; I think we
> > can accept ~99% and give back that 4k.
>
> I just compiled v6.8-rc4 vanilla (with a fedora localmodconfig build) and
> saved it off (vmlinux.orig), then I compiled with the following:
>
> Applied the patches but did not enable anything: vmlinux.memtag-off
> Enabled MEM_ALLOC_PROFILING: vmlinux.memtag
> Enabled MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT: vmlinux.memtag-default-on
> Enabled MEM_ALLOC_PROFILING_DEBUG: vmlinux.memtag-debug
>
> And here's what I got:
>
>     text     data     bss      dec     hex filename
> 29161847 18352730 5619716 53134293 32ac3d5 vmlinux.orig
> 29162286 18382638 5595140 53140064 32ada60 vmlinux.memtag-off (+5771)
> 29230868 18887662 5275652 53394182 32ebb06 vmlinux.memtag (+259889)
> 29230746 18887662 5275652 53394060 32eba8c vmlinux.memtag-default-on (+259767) dropped?
> 29276214 18946374 5177348 53399936 32ed180 vmlinux.memtag-debug (+265643)
>
> Just adding the patches increases the size by 5k. But the rest shows an
> increase of 259k, and you are worried about 4k (and possibly less?)???

Most of that is data (505024 bytes), not text (68582 bytes, or 66k),
going from vmlinux.memtag-off to vmlinux.memtag.
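
As a quick check on those numbers, here is a small Python sketch (not
part of the patch set) that recomputes the per-section deltas from the
size output quoted above. The sizes are copied from that table; the
helper name delta() is made up for illustration.

```python
# Sizes copied from the `size` output quoted above: (text, data, bss, dec).
SIZES = {
    "vmlinux.orig":       (29161847, 18352730, 5619716, 53134293),
    "vmlinux.memtag-off": (29162286, 18382638, 5595140, 53140064),
    "vmlinux.memtag":     (29230868, 18887662, 5275652, 53394182),
}


def delta(new, old):
    """Return the growth of `new` over `old` as (text, data, bss, dec)."""
    return tuple(n - o for n, o in zip(SIZES[new], SIZES[old]))


# Compare the profiling-enabled build against the patched-but-disabled build.
text, data, bss, dec = delta("vmlinux.memtag", "vmlinux.memtag-off")
print(f"text {text:+d} ({text // 1024}k), data {data:+d}, "
      f"bss {bss:+d}, total {dec:+d}")
```

The text and data deltas come out as 68582 and 505024 for that
comparison. bss shrinks over the same comparison, which is why the
total grows by less than the data delta alone.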

The data is mostly the alloc tags themselves (one per allocation
callsite, and you compiled the entire kernel), so that's expected.

Of the text, a lot of that is going to be slowpath stuff - module load
and unload hooks, formatting and printing the output, other assorted bits.

Then there's allocating and deallocating the obj extension vectors - not
slowpath, but not super fast path either, since it doesn't happen on
every allocation.

The fastpath instruction count overhead is pretty small:
 - actually doing the accounting - the core of slub.c, page_alloc.c,
   percpu.c
 - setting/restoring the alloc tag: this is overhead we add to every
   allocation callsite, so it's the most relevant - but it's just a few
   instructions.
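
To make the set/restore step concrete, here is a toy Python model, not
kernel code. It assumes one tag object per callsite and a single
"current tag" slot that the callsite wrapper saves, sets, and restores
around the allocation; AllocTag, tagged() and toy_alloc() are
hypothetical names invented for this sketch.

```python
from contextlib import contextmanager


class AllocTag:
    """Toy stand-in for a per-callsite tag with its counters."""

    def __init__(self, site):
        self.site = site
        self.bytes = 0
        self.calls = 0


current_tag = None  # models a per-task "current alloc tag" slot


@contextmanager
def tagged(tag):
    """Save the current tag, install `tag`, and restore the old one on exit."""
    global current_tag
    saved = current_tag
    current_tag = tag
    try:
        yield
    finally:
        current_tag = saved


def toy_alloc(size):
    """Pretend allocator: charge the request to whichever tag is current."""
    if current_tag is not None:
        current_tag.bytes += size
        current_tag.calls += 1
    return bytearray(size)


# Two hypothetical callsites; the inner one is nested inside the outer one.
site_a = AllocTag("site_a")
site_b = AllocTag("site_b")
with tagged(site_a):
    toy_alloc(64)
    with tagged(site_b):
        toy_alloc(4096)
    toy_alloc(32)  # charged to site_a again once site_b's tag is restored

for tag in (site_a, site_b):
    print(tag.site, tag.bytes, tag.calls)
```

The per-callsite cost in this model is the save, the store, and the
restore in tagged(); the accounting itself lives in the allocator.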

So that's the breakdown. Definitely not zero overhead, but that fixed
memory overhead (and additionally, the percpu counters) is the price we
pay for very low runtime CPU overhead.