Re: mainline/master bisection: baseline.login on meson-sm1-khadas-vim3l

From: Guillaume Tucker
Date: Tue Feb 23 2021 - 16:06:14 EST


On 23/02/2021 14:18, Marc Zyngier wrote:
> Hi Guillaume,
>
> On Tue, 23 Feb 2021 09:46:30 +0000,
> Guillaume Tucker <guillaume.tucker@xxxxxxxxxxxxx> wrote:
>>
>> Hello Marc,
>>
>> Please see the bisection report below about a boot failure on
>> meson-sm1-khadas-vim3l on mainline. It seems to only be
>> affecting kernels built with CONFIG_ARM64_64K_PAGES=y.
>>
>> Reports aren't automatically sent to the public while we're
>> trialing new bisection features on kernelci.org but this one
>> looks valid.
>>
>> There's no output in the log, so the kernel is most likely
>> crashing early. Some more details can be found here:
>>
>> https://kernelci.org/test/case/id/6034bed3b344e2860daddcc8/
>>
>> Please let us know if you need any help to debug the issue or try
>> a fix on this platform.
>
> Thanks for the heads up.
>
> There is actually a fundamental problem with the patch you bisected
> to: it provides no guarantee that the point where we enable the EL2
> MMU is in the idmap and, as it turns out, the code we're running from
> disappears from under our feet, leading to a translation fault we're
> not prepared to handle.
>
> How does it work with 4kB pages? Luck.

There may be a fascinating explanation for it, but luck works
too. It really seems to be booting happily with 4k pages:

https://kernelci.org/test/plan/id/60347b358de339d1b7addcc5/
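
For anyone following the thread, the failure mode Marc describes can be
pictured with a toy model. This is only a sketch and not kernel code: the
ToyCPU class, the addresses and the ranges are all made up, and it assumes
that once the MMU is on, only identity-mapped addresses still resolve at
the address the CPU was already executing from.

```python
class TranslationFault(Exception):
    """Raised when a fetch hits an address with no usable mapping."""


class ToyCPU:
    def __init__(self, idmap_ranges):
        # Hypothetical [start, end) ranges where virtual == physical.
        self.idmap_ranges = idmap_ranges
        self.mmu_on = False

    def in_idmap(self, addr):
        return any(start <= addr < end for start, end in self.idmap_ranges)

    def fetch(self, pc):
        # MMU off: the address is used as-is.
        # MMU on: in this toy model only identity-mapped addresses resolve.
        if self.mmu_on and not self.in_idmap(pc):
            raise TranslationFault(hex(pc))
        return pc

    def enable_mmu_at(self, pc):
        self.fetch(pc)             # the instruction that turns the MMU on
        self.mmu_on = True
        return self.fetch(pc + 4)  # the very next fetch is translated


def survives_mmu_enable(pc, idmap_ranges):
    """Return True if execution continues after enabling the MMU at pc."""
    try:
        ToyCPU(idmap_ranges).enable_mmu_at(pc)
        return True
    except TranslationFault:
        return False
```

In this model, code placed inside the identity-mapped range keeps running
after the switch, while code placed anywhere else faults on the next fetch.
That is the situation the patch avoids by moving the MMU enable into
.idmap.text.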

> Do you mind giving the patch below a go? It does work on my vim3l and
> on a FVP, so odds are that it will solve it for you too.

Sure, and that worked here as well:

http://lava.baylibre.com:10080/scheduler/job/752416

and here's the test branch where I applied your fix, for
completeness:

https://gitlab.collabora.com/gtucker/linux/-/commits/v5.11-vim3l-vhe/

As always, if you do send a patch with the fix, please give some
credit to the bot:

Reported-by: "kernelci.org bot" <bot@xxxxxxxxxxxx>

Thanks,
Guillaume


> Thanks,
>
> M.
>
> diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
> index 678cd2c618ee..fbd2543b8f7d 100644
> --- a/arch/arm64/kernel/hyp-stub.S
> +++ b/arch/arm64/kernel/hyp-stub.S
> @@ -96,8 +96,10 @@ SYM_CODE_START_LOCAL(mutate_to_vhe)
>  	cmp	x1, xzr
>  	and	x2, x2, x1
>  	csinv	x2, x2, xzr, ne
> -	cbz	x2, 1f
> +	cbnz	x2, 2f
>
> +1:	eret
> +2:
>  	// Engage the VHE magic!
>  	mov_q	x0, HCR_HOST_VHE_FLAGS
>  	msr	hcr_el2, x0
> @@ -131,11 +133,29 @@ SYM_CODE_START_LOCAL(mutate_to_vhe)
>  	msr	mair_el1, x0
>  	isb
>
> +	// Hack the exception return to stay at EL2
> +	mrs	x0, spsr_el1
> +	and	x0, x0, #~PSR_MODE_MASK
> +	mov	x1, #PSR_MODE_EL2h
> +	orr	x0, x0, x1
> +	msr	spsr_el1, x0
> +
> +	b	enter_vhe
> +SYM_CODE_END(mutate_to_vhe)
> +
> +	// At the point where we reach enter_vhe(), we run with
> +	// the MMU off (which is enforced by mutate_to_vhe()).
> +	// We thus need to be in the idmap, or everything will
> +	// explode when enabling the MMU.
> +
> +	.pushsection	.idmap.text, "ax"
> +
> +SYM_CODE_START_LOCAL(enter_vhe)
> +	// Enable the EL2 S1 MMU, as set up from EL1
>  	// Invalidate TLBs before enabling the MMU
>  	tlbi	vmalle1
>  	dsb	nsh
>
> -	// Enable the EL2 S1 MMU, as set up from EL1
>  	mrs_s	x0, SYS_SCTLR_EL12
>  	set_sctlr_el1	x0
>
> @@ -143,17 +163,12 @@ SYM_CODE_START_LOCAL(mutate_to_vhe)
>  	mov_q	x0, INIT_SCTLR_EL1_MMU_OFF
>  	msr_s	SYS_SCTLR_EL12, x0
>
> -	// Hack the exception return to stay at EL2
> -	mrs	x0, spsr_el1
> -	and	x0, x0, #~PSR_MODE_MASK
> -	mov	x1, #PSR_MODE_EL2h
> -	orr	x0, x0, x1
> -	msr	spsr_el1, x0
> -
>  	mov	x0, xzr
>
> -1:	eret
> -SYM_CODE_END(mutate_to_vhe)
> +	eret
> +SYM_CODE_END(enter_vhe)
> +
> +	.popsection
>
> .macro invalid_vector label
> SYM_CODE_START_LOCAL(\label)
>
>