Re: [PATCH v5 1/1] mm: report per-page metadata information

From: Pasha Tatashin
Date: Thu Nov 02 2023 - 12:44:28 EST


On Thu, Nov 2, 2023 at 12:09 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 02.11.23 17:02, Pasha Tatashin wrote:
> > On Thu, Nov 2, 2023 at 11:53 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> >>
> >> On 02.11.23 16:50, Pasha Tatashin wrote:
> >>>>> Adding reserved memory to MemTotal is a cleaner approach IMO as well.
> >>>>> But it changes the semantics of MemTotal, which may have compatibility
> >>>>> issues.
> >>>>
> >>>> I object.
> >>>
> >>> Could you please elaborate what you object (and why): you object that
> >>> it will have compatibility issues, or you object to include memblock
> >>> reserves into MemTotal?
> >>
> >> Sorry, I object to changing the semantics of MemTotal. MemTotal is
> >> traditionally the memory managed by the buddy, not all memory in the
> >> system. I know people/scripts that are relying on that [although it's
> >> been source of confusion a couple of times].
> >
> > What if one day we change so that struct pages are allocated from
> > buddy allocator (i.e. allocate deferred struct pages from buddy) will
>
> It does on memory hotplug. But for things like crashkernel size
> detection doesn't really care about that.

"Crash kernel" is a different case: it is kernel external memory,
similar to limiting the amount of physical memory via mem=/memmap=. It
sets aside memory that cannot be used by this kernel, but only by the
crash kernel. Also, the crash kernel reserve is exposed in /proc/iomem
via the "Crash kernel" range.

Page metadata memory, on the other hand, is used by this kernel, and
can also be changed by this kernel depending on how the memory is
used: memdesc, hotplug, THP, emulated pmem, etc.
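
For illustration only (not part of the patch): a minimal Python sketch
of how a script could read the crash kernel reserve from /proc/iomem
next to MemTotal from /proc/meminfo. The helper names
crash_kernel_bytes() and mem_total_kib() are made up for this example.
Note that unprivileged readers of /proc/iomem see zeroed addresses, in
which case the sketch reports 0.

```python
import re

# Matches /proc/iomem lines such as "  03000000-0affffff : Crash kernel".
_RANGE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.+?)\s*$")


def crash_kernel_bytes(iomem_text):
    """Sum the sizes of all "Crash kernel" ranges in /proc/iomem text."""
    total = 0
    for line in iomem_text.splitlines():
        m = _RANGE.match(line)
        if m and m.group(3) == "Crash kernel":
            start, end = int(m.group(1), 16), int(m.group(2), 16)
            # Zeroed addresses (unprivileged read) contribute nothing.
            if end > start:
                total += end - start + 1  # ranges are inclusive
    return total


def mem_total_kib(meminfo_text):
    """Return the MemTotal value (in kB) from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


if __name__ == "__main__":
    with open("/proc/iomem") as f:
        print("Crash kernel bytes:", crash_kernel_bytes(f.read()))
    with open("/proc/meminfo") as f:
        print("MemTotal kB:", mem_total_kib(f.read()))
```

The parsers take text rather than file paths so they can be tried on
saved copies of both files.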

> > it break those MemTotal scripts? What if the size of struct pages
> > changes significantly, but the overhead will come from other metadata
> > (i.e. memdesc) will that break those scripts? I feel like struct page
>
> Probably; but ideally the metadata overhead will be smaller with
> memdesc. And we'll talk about that once it gets real ;)

The size and allocation of struct pages change MemTotal today, at
runtime, even without memdesc. I just brought it up to emphasize that
this is something we should resolve now, before it gets worse.
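
To give a sense of scale, here is a rough back-of-the-envelope sketch.
It assumes 4 KiB base pages and a 64-byte struct page, which is common
on x86_64 but depends on the kernel configuration. The function name is
hypothetical and only serves this example.

```python
def struct_page_overhead_bytes(mem_bytes, page_size=4096, struct_page_size=64):
    """Estimate the memory consumed by struct pages for mem_bytes of RAM.

    Assumes one struct page per base page. Both sizes are assumptions,
    not values taken from the patch.
    """
    return (mem_bytes // page_size) * struct_page_size


if __name__ == "__main__":
    one_tib = 1 << 40
    overhead = struct_page_overhead_bytes(one_tib)
    print(overhead // (1 << 30), "GiB of struct pages per TiB of RAM")
```

Under these assumptions, 1 TiB of memory carries 16 GiB of struct pages,
about 1.56% of the total, so whether that memory is counted in MemTotal
makes a visible difference.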

> > memory should really be included into MemTotal, otherwise we will have
> > this struggle in the future when we try to optimize struct page
> > memory.
> How far do we want to go, do we want to include crashkernel reserved
> memory in MemTotal because it is system memory? Only metadata? what else
> allocated using memblock?
>
> Again, right now it's simple: MemTotal is memory managed by the buddy.
>
> The spirit of this patch set is good, modifying existing counters needs
> good justification.

Wei noticed that all other fields in /proc/meminfo are part of
MemTotal, but this new field may not be (depending on where struct
pages are allocated). So what would be the best way to export page
metadata without redefining MemTotal? Keep the new field in
/proc/meminfo and accept that it is not part of MemTotal, or use two
counters? If we use two counters, would we still keep the one for
metadata allocated from the buddy allocator in /proc/meminfo, and put
the other one somewhere outside?

Pasha