Re: [PATCH] mm: memblock: avoid to create memmap for memblock nomap regions

From: Dmitry Baryshkov
Date: Wed Feb 14 2024 - 03:12:04 EST


On Wed, 14 Feb 2024 at 09:44, Mike Rapoport <rppt@xxxxxxxxxx> wrote:
>
> On Thu, Feb 08, 2024 at 02:37:25PM +0800, Aiqun Yu (Maria) wrote:
> >
> > On 8/6/2022 3:22 AM, Mike Rapoport wrote:
> > > Hi Vijay,
> > >
> > > On Wed, Aug 03, 2022 at 04:27:33PM +0530, Vijayanand Jitta wrote:
> > > >
> > > > On 5/9/2022 5:12 PM, Mike Rapoport wrote:
> > > > > On Mon, May 09, 2022 at 04:37:30PM +0530, Faiyaz Mohammed wrote:
> > > > > >
> > > > > > On 5/5/2022 10:24 PM, Mike Rapoport wrote:
> > > > > > > On Thu, May 05, 2022 at 08:46:15PM +0530, Faiyaz Mohammed wrote:
> > > > > > > > On 4/12/2022 10:56 PM, Mike Rapoport wrote:
> > > > > > > > > On Tue, Apr 12, 2022 at 12:39:32AM +0530, Faiyaz Mohammed wrote:
> > > > > > > > > > Commit 86588296acbf ("fdt: Properly handle "no-map" field in the
> > > > > > > > > > memory region") keeps the no-map regions in memblock.memory with the
> > > > > > > > > > MEMBLOCK_NOMAP flag set, so that no-map memory can be used for EFI via
> > > > > > > > > > the memblock APIs. During initialization, however, sparse_init marks
> > > > > > > > > > all of memblock.memory as present using for_each_mem_pfn_range, which
> > > > > > > > > > creates the memmap for no-map memblock regions as well. To avoid that,
> > > > > > > > > > skip the memblock.memory regions that have MEMBLOCK_NOMAP set. With
> > > > > > > > > > this change we save ~11MB of memory for a ~612MB carve out.
> > > > > > > > > MEMBLOCK_NOMAP is very fragile and has caused a lot of issues already. I
> > > > > > > > > really don't like the idea of adding more implicit assumptions about how
> > > > > > > > > NOMAP memory may or may not be used in a generic iterator function.
> > > > > > > > Sorry for the delayed response.
> > > > > > > > Yes, an implicit assumption could cause misunderstanding. How about
> > > > > > > > adding a command line option to control the no-map region in the fdt.c
> > > > > > > > driver, to decide whether to keep the "no-map" region with the NOMAP
> > > > > > > > flag or remove it? Something like below
> > > > > > > I really don't like memblock_remove() for such cases.
> > > > > > > Pretending there is a hole when there is actual DRAM makes things really
> > > > > > > hairy when it comes to memory map and page allocator initialization.
> > > > > > > You wouldn't want to give up system stability and risk random memory
> > > > > > > corruption for 11M of "saved" memory.
> > > > > >
> > > > > > Creating the memory map for hole memory adds 11MB of overhead, which is
> > > > > > huge on a low-memory target. Saving 11MB matters a lot on such a
> > > > > > target.
> > > > > >
> > > > > > Or could we have a separate list for NOMAP regions, like the reserved one?
> > > > > >
> > > > > > Any other suggestions to address this issue?
> > > > >
> > > > > Make your firmware report the memory that Linux cannot use as a hole,
> > > > > i.e. _not_ report it as memory.
> > > >
> > > > Thanks, Mike, for the comments.
> > > >
> > > > A few concerns with this approach.
> > > >
> > > > 1) Even if firmware doesn't report these regions as memory, their
> > > > addresses would still need to be part of the device tree so that the
> > > > clients can get them. Otherwise there is no way for a client to know
> > > > these addresses.
> > > >
> > > > 2) This would also add a dependency on firmware to not pass these
> > > > regions as memory, even though we know that these regions would be used
> > > > by the clients. Isn't it better to have such control within the kernel?
> > >
> > > If it is memory that is used by the kernel, it should be reported as memory
> > > and have the memory map.
> > > If this is a hole in the memory layout from the kernel perspective, then
> > > the kernel should not bother with this memory.
> > Hi Mike,
> >
> > We've put effort on the bootloader side into implementing a similar
> > suggestion, where the OS bootloader conveys the reserved memory by omitting
> > the hole from /memory@0{reg=[]} directly.
> > However, there is a concern from the device tree spec perspective, see
> > link [1]: "A memory device node is required for all devicetrees and
> > describes the physical memory layout for the system. "
> > Do you have any thoughts on this?
>
> I'm not sure I understand your concern. Isn't there a /memory node that
> describes the memory available to Linux in your devicetree?

That was the question. It looks like your opinion on /memory is that
it describes "memory available to Linux", while the device tree spec
defines it as "physical memory layout".

>
> > [1] https://github.com/devicetree-org/devicetree-specification/blob/main/source/chapter3-devicenodes.rst
> > >
> > > And I'm not buying the "low memory target" argument if you have enough memory
> > > to carve out ~600M for some mysterious clients.
> >
> > Just for your information, on a low-memory target the carve out can be more
> > than ~60M out of 128M in total.
>
> If saving ~1M of memory map is important, hide the carve out from Linux
> entirely.
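
For reference, the memmap figures quoted in this thread can be roughly
reproduced with a back-of-the-envelope estimate. The sketch below is only an
illustration, not kernel code: it assumes 4 KiB base pages and a 64-byte
struct page (neither is stated in the thread), and the helper name
memmap_overhead_bytes is made up for this example.

```python
# Rough estimate of memmap (struct page array) overhead for a memory region.
# Assumptions (not from the thread): 4 KiB base pages, 64-byte struct page.
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
MIB = 1024 * 1024


def memmap_overhead_bytes(region_bytes, page_size=PAGE_SIZE,
                          struct_page_size=STRUCT_PAGE_SIZE):
    """Return the bytes of struct page needed to cover region_bytes."""
    pages = -(-region_bytes // page_size)  # ceiling division
    return pages * struct_page_size


# The two carve-out sizes mentioned in the discussion.
for carveout_mib in (60, 612):
    overhead = memmap_overhead_bytes(carveout_mib * MIB)
    print(f"{carveout_mib} MiB carve out -> {overhead / MIB:.2f} MiB of memmap")
```

Under these assumptions a 60 MiB carve out costs about 0.94 MiB of memmap,
in line with the ~1M mentioned above, and a 612 MiB carve out costs about
9.56 MiB. The estimate ignores sparsemem section alignment, so the real
overhead can be somewhat higher.
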
>
> > > > Let me know your comments on these.
> > > >
> > > > Thanks,
> > > > Vijay
> >
> > --
> > Thx and BRs,
> > Aiqun(Maria) Yu
>
> --
> Sincerely yours,
> Mike.



--
With best wishes
Dmitry