Re: [PATCH v5] arm64: kdump: simplify the reservation behaviour of crashkernel=,high

From: Baoquan He
Date: Thu Apr 13 2023 - 03:47:04 EST


On 04/12/23 at 12:51pm, Catalin Marinas wrote:
> On Fri, Apr 07, 2023 at 10:32:38AM +0800, Baoquan He wrote:
> > On 04/07/23 at 10:24am, Baoquan He wrote:
> > ......
> > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > > index 66e70ca47680..307263c01292 100644
> > > --- a/arch/arm64/mm/init.c
> > > +++ b/arch/arm64/mm/init.c
> > > @@ -69,6 +69,7 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
> > >
> > > #define CRASH_ADDR_LOW_MAX arm64_dma_phys_limit
> > > #define CRASH_ADDR_HIGH_MAX (PHYS_MASK + 1)
> > > +#define CRASH_HIGH_SEARCH_BASE SZ_4G
> > >
> > > #define DEFAULT_CRASH_KERNEL_LOW_SIZE (128UL << 20)
> > >
> > > @@ -101,12 +102,13 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
> > > */
> > > static void __init reserve_crashkernel(void)
> > > {
> > > - unsigned long long crash_base, crash_size;
> > > - unsigned long long crash_low_size = 0;
> > > + unsigned long long crash_base, crash_size, search_base;
> > > unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> > > + unsigned long long crash_low_size = 0;
> > > char *cmdline = boot_command_line;
> > > - int ret;
> > > bool fixed_base = false;
> > > + bool high = false;
> > > + int ret;
> > >
> > > if (!IS_ENABLED(CONFIG_KEXEC_CORE))
> > > return;
> > > @@ -129,7 +131,9 @@ static void __init reserve_crashkernel(void)
> > > else if (ret)
> > > return;
> > >
> > > + search_base = CRASH_HIGH_SEARCH_BASE;
> >
> > Here, I am hesitant if a conditional check is needed as below. On
> > special system where both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32
> > are disabled, there's only low memory, means its arm64_dma_phys_limit
> > equals to (PHYS_MASK + 1). In this case, whatever the crashkernel= is,
> > it can search the whole system memory for available crashkernel region.
> > Maybe it's fine since it's not big deal, the memory regoin can be found
> > anyway.
> >
> > crash_max = CRASH_ADDR_HIGH_MAX;
> > if (crash_max != CRASH_ADDR_LOW_MAX)
> > search_base = CRASH_HIGH_SEARCH_BASE;
>
> Does x86 do anything different here or they just can't disable
> ZONE_DMA32? I'd be tempted to instead define CRASH_ADDR_LOW_MAX as
> min(SZ_4G, arm64_dma_phys_limit) so that the crashkernel=,high semantics
> are still preserved irrespective of how the kernel was built.

x86 enables both ZONE_DMA and ZONE_DMA32 by default and hardcodes the
upper limit of each zone. I don't think it's easy to disable ZONE_DMA32
there. If it were disabled, devices could only get DMA memory from the
DMA zone, which is only 16MB, and the rest of memory would go into the
normal zone. Please see the code snippet below.

arch/x86/include/asm/dma.h:
/* 16MB ISA DMA zone */
#define MAX_DMA_PFN ((16UL * 1024 * 1024) >> PAGE_SHIFT)

/* 4GB broken PCI/AGP hardware bus master zone */
#define MAX_DMA32_PFN (1UL << (32 - PAGE_SHIFT))

arch/x86/mm/init.c
zone_sizes_init()
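
To make the two hardcoded limits concrete, here is a small userspace
sketch (not kernel code) that converts them back to physical addresses.
It assumes 4K pages (PAGE_SHIFT = 12); pfn_to_phys() is a helper made up
for this illustration.

```c
#include <stdint.h>

/* Assumption for this sketch: 4K pages, as in the common x86 config. */
#define PAGE_SHIFT 12

/* Same arithmetic as arch/x86/include/asm/dma.h, widened to 64 bits. */
#define MAX_DMA_PFN   ((16ULL * 1024 * 1024) >> PAGE_SHIFT)
#define MAX_DMA32_PFN (1ULL << (32 - PAGE_SHIFT))

/* Hypothetical helper: page frame number -> physical address. */
static uint64_t pfn_to_phys(uint64_t pfn)
{
	return pfn << PAGE_SHIFT;
}
```

With these assumptions, pfn_to_phys(MAX_DMA_PFN) is the 16MB boundary and
pfn_to_phys(MAX_DMA32_PFN) is the 4G boundary, regardless of how much
memory the machine has.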

As for the CRASH_ADDR_LOW_MAX definition, defining it as min(SZ_4G,
arm64_dma_phys_limit) is almost the same as defining it as
arm64_dma_phys_limit directly.

That is because arm64_dma_phys_limit can take one of three values:
1) 1G on RPi4
2) 4G on normal system
3) (PHYS_MASK + 1) on special system w/o zone DMA and DMA32

For the first two types, min(SZ_4G, arm64_dma_phys_limit) equals
arm64_dma_phys_limit. For the 3rd one, CRASH_ADDR_LOW_MAX becomes 4G,
which prevents a type 3) system from reserving a region that crosses 4G.
However, on a type 3) system all memory is low memory, so 4G should
not be a boundary. I tried the code change with min(SZ_4G,
arm64_dma_phys_limit). If we can accept this, the v4 patch looks more
appropriate, except that RPi4 has inconsistent behaviour when
crashkernel=,high is specified.
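
The effect of the min() on the three kinds of system can be sketched in
userspace as below. This is only an illustration, not the kernel code:
PHYS_LIMIT assumes 48-bit physical addresses standing in for
(PHYS_MASK + 1), and crash_addr_low_max() and min_u64() are helper names
made up for this example.

```c
#include <stdint.h>

#define SZ_1G (1ULL << 30)
#define SZ_4G (1ULL << 32)
/* Assumption: 48-bit physical addresses, i.e. PHYS_MASK + 1 == 1 << 48. */
#define PHYS_LIMIT (1ULL << 48)

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical stand-in for the proposed definition
 *   CRASH_ADDR_LOW_MAX = min(SZ_4G, arm64_dma_phys_limit)
 * taking arm64_dma_phys_limit as a parameter.
 */
static uint64_t crash_addr_low_max(uint64_t dma_phys_limit)
{
	return min_u64(SZ_4G, dma_phys_limit);
}
```

Under these assumptions, crash_addr_low_max(SZ_1G) stays at 1G (RPi4) and
crash_addr_low_max(SZ_4G) stays at 4G (normal system), while
crash_addr_low_max(PHYS_LIMIT) is clamped to 4G, which is the only case
where the result differs from arm64_dma_phys_limit.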

>
> There's also the difference between what the current kernel vs the kdump
> kernel. I don't think there's a strong requirement that they have the
> same config options, in which case it may be safer to just honour the 4G
> boundary.

Yes. In principle, the kdump kernel doesn't have to be the same as the
current kernel. In reality, however, we usually use the same kernel for
both. Our distros do that by default, although users can choose a
different kernel as the kdump kernel.

However, we would not recommend using a kernel with different config
options as the kdump kernel. E.g. on RPi4, the current kernel has
ZONE_DMA and ZONE_DMA32 enabled, so it has the DMA zone under 1G and the
DMA32 zone under 4G. If we use a kernel without ZONE_DMA and ZONE_DMA32
as the kdump kernel, it very likely won't work, because a PCI device
could be given memory above 1G, or even above 4G.

>
> Otherwise the patch looks fine. Whether you want to add the min limit
> above:

I am OK with this version, with the version using min(SZ_4G,
arm64_dma_phys_limit), or with v4. Please point out whether I have
understood your idea correctly. Thanks a lot.