Re: [PATCH v4 1/2] x86/setup: always add the beginning of RAM as memblock.memory

From: Mike Rapoport
Date: Mon Feb 01 2021 - 09:07:26 EST


On Sun, Jan 31, 2021 at 01:49:27PM -0800, Linus Torvalds wrote:
> On Sun, Jan 31, 2021 at 12:04 AM Mike Rapoport <rppt@xxxxxxxxxx> wrote:
> >
> > >
> > > That's *particularly* true when the very line above it did a
> > > "memblock_reserve()" of the exact same range that the memblock_add()
> > > "adds".
> >
> > The most correct thing to do would have been to
> >
> > memblock_add(0, end_of_first_memory_bank);
> >
> > Somewhere at e820__memblock_setup().
>
> You miss my complaint.
>
> Why does the memblock code care about this magical "memblock_add()",
> when we just told it that the SAME REGION is reserved by doing a
> "memblock_reserve()"?
>
> IOW, I'm not interested in "the correct thing to do would have been
> [another memblock_add()]". I'm saying that the memblock code itself is
> being confused, and no additional thing should have been required at
> all, because we already *did* that memblock_reserve().
>
> See?

There is nothing magical about memblock_add().

Memblock presumes that arch code uses memblock_add() to register populated
physical memory ranges and memblock_reserve() to protect memory ranges that
should not be touched. These ranges do not necessarily overlap, so there
maybe reserved ranges that do not have the corresponding registered memory.

This lets architectures to say "here are the memory banks I have" and "this
memory is in use" (or even "this memory _might_ be in use" ) independently
of each other.

The downside is that if there is a reserved range there is no way to tell
whether it is backed by populated memory.

We could change this semantics and enforce the overlap, e.g. by
implicitly adding all the reserved ranges to the registered memory.
I've already tried that and I've found out that there are systems that rely
on memblock's ability to track reserved and available ranges independently.
For example, arm systems I've mentioned in the previous mail always have a
reserved chunk at 0xfe000000 in their DTS, but they may have only 2G of
memory actually populated.

Now, on x86 there is a gap between e820 and memblock since 2.6 times. As of
now, only E820_TYPE_RAM is added to memblock as memory, some of the
E820_*_RESERVED are reserved and on top there are reservations of the
memory that's known to be used by BIOS or kernel.

I'm trying to close this gap with small steps and with changes that I
believe will not break too many things at once so it'll become
unmanageable.

> Honestly, I'm not seeing it being a good thing to move further towards
> memblock code as the primary model for memory initialization, when the
> memblock code is so confused.

I'm not sure I follow you here.
If I'm not mistaken, memblock is used as the primary model for memmap and
page allocator initialization for almost a decade now...

> Linus

--
Sincerely yours,
Mike.