Re: [PATCHv2] x86/trampoline: Bypass compat mode in trampoline_start64() if not needed

From: kirill.shutemov@xxxxxxxxxxxxxxx
Date: Mon Jan 08 2024 - 08:33:27 EST


On Mon, Jan 08, 2024 at 01:10:31PM +0000, Huang, Kai wrote:
> On Sun, 2024-01-07 at 15:28 +0300, Kirill A. Shutemov wrote:
> > The trampoline_start64() vector is used when a secondary CPU starts in
> > 64-bit mode. The current implementation directly enters compatibility
> > mode. It is necessary to disable paging and re-enable it in the correct
> > paging mode: either 4- or 5-level, depending on the configuration.
> >
> > The X86S[1] ISA does not support compatibility mode in ring 0, and
> > paging cannot be disabled.
> >
> > The trampoline_start64() function is reworked to only enter compatibility
> > mode if it is necessary to change the paging mode. If the CPU is already
> > in the desired paging mode, it will proceed in long mode.
> >
> > This change will allow a secondary CPU to boot on an X86S machine as
> > long as the CPU is already in the correct paging mode.
> >
> > In the future, there will be a mechanism to switch between paging modes
> > without disabling paging.
> >
> > [1] https://www.intel.com/content/www/us/en/developer/articles/technical/envisioning-future-simplified-architecture.html
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > Reviewed-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>
> > Cc: Sean Christopherson <seanjc@xxxxxxxxxx>
> >
> > ---
> > v2:
> > - Fix build with GCC;
> > ---
> > arch/x86/realmode/rm/trampoline_64.S | 31 +++++++++++++++++++++++++++-
> > 1 file changed, 30 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/realmode/rm/trampoline_64.S b/arch/x86/realmode/rm/trampoline_64.S
> > index c9f76fae902e..c07354542188 100644
> > --- a/arch/x86/realmode/rm/trampoline_64.S
> > +++ b/arch/x86/realmode/rm/trampoline_64.S
> > @@ -37,13 +37,15 @@
> > .text
> > .code16
> >
> > -.macro LOCK_AND_LOAD_REALMODE_ESP lock_pa=0
> > +.macro LOCK_AND_LOAD_REALMODE_ESP lock_pa=0 lock_rip=0
> > /*
> > * Make sure only one CPU fiddles with the realmode stack
> > */
> > .Llock_rm\@:
> > .if \lock_pa
> > lock btsl $0, pa_tr_lock
> > + .elseif \lock_rip
> > + lock btsl $0, tr_lock(%rip)
> > .else
> > lock btsl $0, tr_lock
> > .endif
> > @@ -220,6 +222,33 @@ SYM_CODE_START(trampoline_start64)
> > lidt tr_idt(%rip)
> > lgdt tr_gdt64(%rip)
> >
> > + /* Check if paging mode has to be changed */
> > + movq %cr4, %rax
> > + xorq tr_cr4(%rip), %rax
> > + andq $X86_CR4_LA57, %rax
> > + jnz .L_switch_paging
>
> This seems to depend on the BIOS always using 4-level paging. Can we make such
> an assumption?

What makes you think this?

The check is basically

	if ((tr_cr4 ^ CR4) & X86_CR4_LA57)
		goto .L_switch_paging;

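It means that if the LA57 bit differs between tr_cr4 and CR4, we need to
change the paging mode. The XOR is symmetric, so the check does not assume
which paging mode the BIOS left the CPU in.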
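
To make the symmetry concrete, here is a minimal C sketch of the same
test. It is only an illustration, not part of the patch: the helper names
needs_paging_switch() and check_all_combinations() are made up for this
example, and CR4 is passed in as a plain integer instead of being read
from the register. The only values taken from the architecture are
CR4.LA57 (bit 12) and CR4.PAE (bit 5).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CR4.LA57 is bit 12; when set, 5-level paging is in use. */
#define X86_CR4_LA57 (UINT64_C(1) << 12)

/*
 * Hypothetical helper (not in the patch): true when the LA57 bit differs
 * between the current CR4 value and the wanted one (tr_cr4).
 */
static bool needs_paging_switch(uint64_t cr4, uint64_t tr_cr4)
{
	return ((tr_cr4 ^ cr4) & X86_CR4_LA57) != 0;
}

/* Hypothetical self-check walking through all four combinations. */
static void check_all_combinations(void)
{
	/* CR4.PAE (bit 5): an unrelated bit that must not affect the result. */
	const uint64_t pae = UINT64_C(1) << 5;

	/* Same mode on both sides: stay in long mode. */
	assert(!needs_paging_switch(pae, pae));
	assert(!needs_paging_switch(pae | X86_CR4_LA57, pae | X86_CR4_LA57));

	/* Modes differ, in either direction: go through .L_switch_paging. */
	assert(needs_paging_switch(pae, pae | X86_CR4_LA57));
	assert(needs_paging_switch(pae | X86_CR4_LA57, pae));
}
```

Calling check_all_combinations() passes silently: a switch is requested
only when the two values disagree on LA57, regardless of which side has
the bit set.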

> > +
> > + /* Paging mode is correct, proceed in 64-bit mode */
> > +
> > + LOCK_AND_LOAD_REALMODE_ESP lock_rip=1
> > +
> > + movw $__KERNEL_DS, %dx
> > + movl %edx, %ss
> > + addl $pa_real_mode_base, %esp
> > + movl %edx, %ds
> > + movl %edx, %es
> > + movl %edx, %fs
> > + movl %edx, %gs
> > +
> > + movl $pa_trampoline_pgd, %eax
> > + movq %rax, %cr3
> > +
> > + jmpq *tr_start(%rip)
>
> IIUC you won't be using __KERNEL_CS in this case? Not sure whether this matters
> though, because the spec says that in 64-bit mode the hardware treats the segment
> bases of CS, DS, ES and SS as zero.
>

secondary_startup_64() will set CS to __KERNEL_CS before jumping to C
code.

> > +.L_switch_paging:
> > + /*
> > + * To switch between 4- and 5-level paging modes, it is necessary
> > + * to disable paging. This must be done in the compatibility mode.
> > + */
> > ljmpl *tr_compat(%rip)
> > SYM_CODE_END(trampoline_start64)
> >
>

--
Kiryl Shutsemau / Kirill A. Shutemov