Re: [PATCH v6 RESEND 1/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high

From: Leizhen (ThunderTown)
Date: Tue Jun 06 2023 - 07:53:56 EST

On 2023/5/15 17:54, Baoquan He wrote:
> On arm64, the reservation for 'crashkernel=xM,high' is made by searching
> top down for a suitable memory region. If the 'xM' of crashkernel high
> memory is reserved from high memory successfully, it will try to reserve
> crashkernel low memory later accordingly. Otherwise, it will try to search
> the low memory area for a suitable 'xM' region. Please see the details in
> Documentation/admin-guide/kernel-parameters.txt.
>
> However, we observed an unexpected case where a reserved region crosses
> the high and low memory boundary. E.g. on a system with 4G as the low
> memory end, if the user adds a kernel parameter like
> 'crashkernel=512M,high', the running kernel could end up with the regions
> [4G-126M, 4G+386M] and [1G, 1G+128M]. A crashkernel high region crossing
> the low and high memory boundary causes the following issues:
>
> 1) For crashkernel=x,high, if the crashkernel high region crosses the
> low and high memory boundary, the user will see two memory regions in
> low memory and one memory region in high memory. The two crashkernel
> low memory regions are confusing, as shown in the above example.
>
> 2) If people explicitly specify "crashkernel=x,high crashkernel=y,low"
> and y <= 128M, when the crashkernel high region crosses the low and high
> memory boundary and the part of the crashkernel high reservation below
> the boundary is bigger than y, the expected crashkernel low reservation
> will be skipped. But the expected crashkernel high reservation is shrunk
> and could not satisfy the user space requirement.
>
> 3) The boundary-crossing behaviour of the crashkernel high reservation is
> different from the x86 arch. On x86_64, the low memory end is fixed at 4G,
> and the memory near 4G is reserved by the system, e.g. for mapping firmware
> and pci mapping, so a crashkernel reservation crossing the boundary never
> happens. From the distros' point of view, this brings inconsistency and
> confusion. Users need to dig into x86 and arm64 system details to find
> out why.
>
> For the kernel itself, the impact of issue 3) could be slight, while
> issues 1) and 2) have actual impact because they bring obscure semantics
> and behaviour to the crashkernel=,high reservation.
>
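
The boundary-crossing case in the example can be checked with a small
user-space sketch. This is only an illustration, not kernel code: the helper
names region_crosses() and bytes_below() are hypothetical, and the numbers are
the ones from the example (a 512M region starting at 4G-126M with the low
memory end at 4G).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_1M (UINT64_C(1) << 20)
#define SZ_4G (UINT64_C(4) << 30)

/* True if [base, base + size) has bytes on both sides of boundary. */
static bool region_crosses(uint64_t base, uint64_t size, uint64_t boundary)
{
	assert(size > 0);
	return base < boundary && base + size > boundary;
}

/* Number of bytes of [base, base + size) that lie below boundary. */
static uint64_t bytes_below(uint64_t base, uint64_t size, uint64_t boundary)
{
	if (base >= boundary)
		return 0;
	return (base + size <= boundary) ? size : boundary - base;
}
```

For the example region, region_crosses(SZ_4G - 126 * SZ_1M, 512 * SZ_1M, SZ_4G)
is true and 126M of the "high" reservation sits in low memory, which is the
situation issues 1) and 2) describe.
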
> Here, for crashkernel=xM,high, search for the suitable region only in
> high memory. If that fails, try reserving the suitable region only in
> low memory. This way, the crashkernel high region will only exist in
> high memory, and the crashkernel low region will only exist in low
> memory. The reservation behaviour for crashkernel=,high becomes clearer
> and simpler.
>
> Note: RPi4 has different zone ranges than normal memory. Its DMA zone is
> 0~1G, and DMA32 zone is 1G~4G if CONFIG_ZONE_DMA|DMA32 are enabled by
> default. The low memory end is 1G in order to validate all devices, high
> memory starts at 1G memory. However, for being consistent with normla

normla --> normal

> arm64 system, its low memory end is still 1G, while crashkernel high
> memory is reserved from 4G if crashkernel=size,high is specified. This
> removes the confusion.

Reviewed-by: Zhen Lei <thunder.leizhen@xxxxxxxxxx>

>
> With the above change applied, summary of arm64 crashkernel reservation ranges:
> 1)
> RPi4(zone DMA:0~1G; DMA32:1G~4G):
> crashkernel=size
> 0~1G: low memory | 1G~top: high memory
>
> crashkernel=size,high
> 0~1G: low memory | 4G~top: high memory
>
> 2)
> Other normal systems:
> crashkernel=size
> crashkernel=size,high
> 0~4G: low memory | 4G~top: high memory
>
> 3)
> Systems w/o zone DMA|DMA32
> crashkernel=size
> crashkernel=size,high
> 0~top: low memory
>
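
The summary can also be written down as a lookup. The sketch below is a
literal user-space encoding of the three cases listed above and nothing more:
the enum platform, struct split and crash_split() names are hypothetical, and
"top" stands for the end of physical memory as in the summary.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_1G (UINT64_C(1) << 30)
#define SZ_4G (UINT64_C(4) << 30)

enum platform { PLAT_RPI4, PLAT_NORMAL, PLAT_NO_DMA_ZONES };

struct split {
	uint64_t low_end;	/* low memory is [0, low_end) */
	uint64_t high_start;	/* high memory is [high_start, top) */
};

/* Return the low/high split from the summary for one platform and syntax. */
static struct split crash_split(enum platform p, bool high_option, uint64_t top)
{
	struct split s;

	switch (p) {
	case PLAT_RPI4:
		/* DMA zone ends at 1G; ",high" searches from 4G instead. */
		s.low_end = SZ_1G;
		s.high_start = high_option ? SZ_4G : SZ_1G;
		break;
	case PLAT_NORMAL:
		s.low_end = SZ_4G;
		s.high_start = SZ_4G;
		break;
	default: /* PLAT_NO_DMA_ZONES: everything counts as low memory */
		s.low_end = top;
		s.high_start = top;
		break;
	}
	return s;
}
```

Only the RPi4 case depends on whether ",high" was given; the other two return
the same split for both syntaxes, matching the summary.
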
> Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>
> Reviewed-by: Catalin Marinas <catalin.marinas@xxxxxxx>
> ---
> v6-RESEND:
> - Remove the relic of local patch merging at the end of patch log.
> - Add Catalin's Reviewed-by tag.
>
> arch/arm64/mm/init.c | 44 ++++++++++++++++++++++++++++++++++----------
> 1 file changed, 34 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 66e70ca47680..c28c2c8483cc 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -69,6 +69,7 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
>
> #define CRASH_ADDR_LOW_MAX arm64_dma_phys_limit
> #define CRASH_ADDR_HIGH_MAX (PHYS_MASK + 1)
> +#define CRASH_HIGH_SEARCH_BASE SZ_4G
>
> #define DEFAULT_CRASH_KERNEL_LOW_SIZE (128UL << 20)
>
> @@ -101,12 +102,13 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
> */
> static void __init reserve_crashkernel(void)
> {
> - unsigned long long crash_base, crash_size;
> - unsigned long long crash_low_size = 0;
> + unsigned long long crash_low_size = 0, search_base = 0;
> unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> + unsigned long long crash_base, crash_size;
> char *cmdline = boot_command_line;
> - int ret;
> bool fixed_base = false;
> + bool high = false;
> + int ret;
>
> if (!IS_ENABLED(CONFIG_KEXEC_CORE))
> return;
> @@ -129,7 +131,9 @@ static void __init reserve_crashkernel(void)
> else if (ret)
> return;
>
> + search_base = CRASH_HIGH_SEARCH_BASE;
> crash_max = CRASH_ADDR_HIGH_MAX;
> + high = true;
> } else if (ret || !crash_size) {
> /* The specified value is invalid */
> return;
> @@ -140,31 +144,51 @@ static void __init reserve_crashkernel(void)
> /* User specifies base address explicitly. */
> if (crash_base) {
> fixed_base = true;
> + search_base = crash_base;
> crash_max = crash_base + crash_size;
> }
>
> retry:
> crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> - crash_base, crash_max);
> + search_base, crash_max);
> if (!crash_base) {
> /*
> - * If the first attempt was for low memory, fall back to
> - * high memory, the minimum required low memory will be
> - * reserved later.
> + * For crashkernel=size[KMG]@offset[KMG], print out failure
> + * message if can't reserve the specified region.
> */
> - if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> + if (fixed_base) {
> + pr_warn("crashkernel reservation failed - memory is in use.\n");
> + return;
> + }
> +
> + /*
> + * For crashkernel=size[KMG], if the first attempt was for
> + * low memory, fall back to high memory, the minimum required
> + * low memory will be reserved later.
> + */
> + if (!high && crash_max == CRASH_ADDR_LOW_MAX) {
> crash_max = CRASH_ADDR_HIGH_MAX;
> + search_base = CRASH_ADDR_LOW_MAX;
> crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> goto retry;
> }
>
> + /*
> + * For crashkernel=size[KMG],high, if the first attempt was
> + * for high memory, fall back to low memory.
> + */
> + if (high && crash_max == CRASH_ADDR_HIGH_MAX) {
> + crash_max = CRASH_ADDR_LOW_MAX;
> + search_base = 0;
> + goto retry;
> + }
> pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
> crash_size);
> return;
> }
>
> - if ((crash_base > CRASH_ADDR_LOW_MAX - crash_low_size) &&
> - crash_low_size && reserve_crashkernel_low(crash_low_size)) {
> + if ((crash_base >= CRASH_ADDR_LOW_MAX) && crash_low_size &&
> + reserve_crashkernel_low(crash_low_size)) {
> memblock_phys_free(crash_base, crash_size);
> return;
> }
>
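
For readers following the retry logic, here is a rough user-space model of the
order in which the patched reserve_crashkernel() tries its search windows. It
is a sketch under assumptions, not the kernel code: crash_search_order() and
needs_low_reservation() are hypothetical names, no memory is actually
allocated, and the model assumes low_max < high_max (DMA zones enabled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4G (UINT64_C(4) << 30)	/* stands in for CRASH_HIGH_SEARCH_BASE */

struct window {
	uint64_t start;	/* models search_base */
	uint64_t end;	/* models crash_max */
};

/*
 * Fill out[] with the windows in the order they would be tried and return
 * how many there are. low_max models CRASH_ADDR_LOW_MAX and high_max models
 * CRASH_ADDR_HIGH_MAX. A non-zero fixed_base models crashkernel=size@offset.
 */
static size_t crash_search_order(bool high, uint64_t fixed_base, uint64_t size,
				 uint64_t low_max, uint64_t high_max,
				 struct window out[2])
{
	/* Assumption of this model only: low and high memory are distinct. */
	assert(low_max < high_max);

	if (fixed_base) {
		/* Fixed base: one attempt, failure is reported, no fallback. */
		out[0] = (struct window){ fixed_base, fixed_base + size };
		return 1;
	}
	if (high) {
		/* crashkernel=size,high: high memory first, then low memory. */
		out[0] = (struct window){ SZ_4G, high_max };
		out[1] = (struct window){ 0, low_max };
		return 2;
	}
	/* crashkernel=size: low memory first, then high memory. */
	out[0] = (struct window){ 0, low_max };
	out[1] = (struct window){ low_max, high_max };
	return 2;
}

/* Models the new check: low memory is reserved only for a region in high memory. */
static bool needs_low_reservation(uint64_t crash_base, uint64_t low_max,
				  uint64_t crash_low_size)
{
	return crash_base >= low_max && crash_low_size != 0;
}
```

In this model no window spans low_max, so a region found in any window is
either entirely below or entirely above the boundary, and the
crash_base >= CRASH_ADDR_LOW_MAX test is enough to decide whether the extra
low memory is needed.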

--
Regards,
Zhen Lei