Re: [PATCH v3 5/6] mm: Add mirror flag back on initrd memory

From: mawupeng
Date: Thu Jun 09 2022 - 04:16:06 EST




在 2022/6/8 18:12, Ard Biesheuvel 写道:
On Wed, 8 Jun 2022 at 12:08, David Hildenbrand <david@xxxxxxxxxx> wrote:

On 08.06.22 12:02, Mike Rapoport wrote:
On Wed, Jun 08, 2022 at 03:27:09PM +0800, mawupeng wrote:

在 2022/6/7 22:49, Ard Biesheuvel 写道:
On Tue, 7 Jun 2022 at 14:22, David Hildenbrand <david@xxxxxxxxxx> wrote:

On 07.06.22 11:38, Wupeng Ma wrote:
From: Ma Wupeng <mawupeng1@xxxxxxxxxx>

Initrd memory will be removed and then added in arm64_memblock_init() and this
will cause it to lose all of its memblock flags. The lost of MEMBLOCK_MIRROR
flag will lead to error log printed by find_zone_movable_pfns_for_nodes if
the lower 4G range has some non-mirrored memory.

In order to solve this problem, the lost MEMBLOCK_MIRROR flag will be
reinstalled if the origin memblock has this flag.

Signed-off-by: Ma Wupeng <mawupeng1@xxxxxxxxxx>
---
arch/arm64/mm/init.c | 9 +++++++++
include/linux/memblock.h | 1 +
mm/memblock.c | 20 ++++++++++++++++++++
3 files changed, 30 insertions(+)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 339ee84e5a61..11641f924d08 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -350,9 +350,18 @@ void __init arm64_memblock_init(void)
"initrd not fully accessible via the linear mapping -- please check your bootloader ...\n")) {
phys_initrd_size = 0;
} else {
+ int flags, ret;
+
+ ret = memblock_get_flags(base, &flags);
+ if (ret)
+ flags = 0;
+
memblock_remove(base, size); /* clear MEMBLOCK_ flags */
memblock_add(base, size);
memblock_reserve(base, size);

Can you explain why we're removing+re-adding here exactly? Is it just to
clear flags as the comment indicates?


This should only happen if the placement of the initrd conflicts with
a mem= command line parameter or it is not covered by memblock for
some other reason.

IOW, this should never happen, and if re-memblock_add'ing this memory
unconditionally is causing problems, we should fix that instead of
working around it.

This will happen if we use initrdmem=3G,100M to reserve initrd memory below
the 4G limit to test this scenario(just for testing, I have trouble to boot
qemu with initrd enabled and memory below 4G are all mirror memory).

Re-memblock_add'ing this memory unconditionally seems fine but clear all
flags(especially MEMBLOCK_MIRROR) may lead to some error log.


If it's really just about clearing flags, I wonder if we rather want to
have an interface that does exactly that, and hides the way this is
actually implemented (obtain flags, remove, re-add ...), internally.

But most probably there is more magic in the code and clearing flags
isn't all it ends up doing.


I don't remember exactly why we needed to clear the flags, but I think
it had to do with some corner case we hit when the initrd was
partially covered.
If "mem=" is set in command line, memblock_mem_limit_remove_map() will
remove all memory block without MEMBLOCK_NOMAP. Maybe this will bring the
memory back if this initrd mem has the MEMBLOCK_NOMAP flag?

The rfc version [1] introduce and use memblock_clear_nomap() to clear the
MEMBLOCK_NOMAP of this initrd memblock.
So maybe the usage of memblock_remove() is just to avoid introducing new
function(memblock_clear_nomap)?

Since commit 4c546b8a3469 ("memblock: add memblock_clear_nomap()") already
introduced memblock_clear_nomap(). Can we use this to remove flag MEMBLOCK_NOMAP
to solve this problem rather than bring flag MEMBLOCK_MIRROR back?

AFAICT, there are two corner cases that re-adding initrd memory covers:
* initrd memory is not a part of the memory reported to memblock, either
because of firmware weirdness or because it was cut out with mem=
* initrd memory overlaps a NOMAP region

So to make sure initrd memory is mapped properly and retains
MEMBLOCK_MIRROR I think the best we can do is

memblock_add();
memblock_clear_nomap();
memblock_reserve();

Would simply detect+rejecting to boot on such setups be an option? The
replies so far indicate to me that this is rather a corner case than a
reasonable use case.


The sad reality is that mem= is known to be used in production for
limiting the amount of memory that the kernel takes control of, in
order to allow the remainder to be used in platform specific ways.

Of course, there are much better ways to achieve that, but given that
we currently support it, I don't think we can easily back that out.

I do think that there is no need to go out of our way to make this
case work seamlessly with mirrored memory, though. So I'd prefer to
make the remove+re-add conditional on there actually being a need to
do so. That way, we don't break the old use case or mirrored memory,
and whatever happens when the two are combined is DONTCARE.

Does that mean that we don't need to care about this scenario with
mirror memory?

Thanks for reviewing.

.