Re: [PATCHv2] x86/trampoline: Bypass compat mode in trampoline_start64() if not needed

From: Sean Christopherson
Date: Mon Jan 08 2024 - 11:19:06 EST


On Sun, Jan 07, 2024, Kirill A. Shutemov wrote:
> @@ -220,6 +222,33 @@ SYM_CODE_START(trampoline_start64)
> lidt tr_idt(%rip)
> lgdt tr_gdt64(%rip)
>
> + /* Check if paging mode has to be changed */
> + movq %cr4, %rax
> + xorq tr_cr4(%rip), %rax

This is buggy: tr_cr4 is only 4 bytes, so the 8-byte XOR also reads whatever
follows it. And even if tr_cr4 were 8 bytes, the reason why nothing showed up in
testing is also why only 4 bytes need to be XOR'd: the upper 32 bits of the result
are never consumed, as X86_CR4_LA57 is bit 12.

> + andq $X86_CR4_LA57, %rax

Nit, this can be TEST instead of AND. TEST doesn't modify RAX, which makes it
obvious the result is only consumed via flags, e.g. I was looking to see if RAX
was used anywhere in the flow. And in theory it's possible a CPU could support
macro-fusion for TEST+Jcc but not AND+Jcc, 'cause shaving a cycle in this code is
obviously super important ;-)

And as above, testl, not testq.
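
Combining the comments above, a completely untested sketch of the check. It
assumes tr_cr4 stays a 4-byte field and that nothing after the branch relies on
the upper 32 bits of RAX (the 32-bit XOR zeroes them):

```
	/* Check if paging mode has to be changed */
	movq	%cr4, %rax
	/* tr_cr4 is 4 bytes, so only XOR the low 32 bits */
	xorl	tr_cr4(%rip), %eax
	/* TEST only sets flags, EAX is left as-is */
	testl	$X86_CR4_LA57, %eax
	jnz	.L_switch_paging
```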

> + jnz .L_switch_paging
> +
> + /* Paging mode is correct proceed in 64-bit mode */
> +
> + LOCK_AND_LOAD_REALMODE_ESP lock_rip=1
> +
> + movw $__KERNEL_DS, %dx
> + movl %edx, %ss
> + addl $pa_real_mode_base, %esp
> + movl %edx, %ds
> + movl %edx, %es
> + movl %edx, %fs
> + movl %edx, %gs
> +
> + movl $pa_trampoline_pgd, %eax
> + movq %rax, %cr3
> +
> + jmpq *tr_start(%rip)
> +.L_switch_paging:
> + /*
> + * To switch between 4- and 5-level paging modes, it is necessary
> + * to disable paging. This must be done in the compatibility mode.
> + */
> ljmpl *tr_compat(%rip)
> SYM_CODE_END(trampoline_start64)
>
> --
> 2.41.0
>