Re: [PATCHv2] x86/trampoline: Bypass compat mode in trampoline_start64() if not needed

From: Kirill A. Shutemov
Date: Mon Jan 08 2024 - 16:58:42 EST


On Mon, Jan 08, 2024 at 08:18:55AM -0800, Sean Christopherson wrote:
> On Sun, Jan 07, 2024, Kirill A. Shutemov wrote:
> > @@ -220,6 +222,33 @@ SYM_CODE_START(trampoline_start64)
> > lidt tr_idt(%rip)
> > lgdt tr_gdt64(%rip)
> >
> > + /* Check if paging mode has to be changed */
> > + movq %cr4, %rax
> > + xorq tr_cr4(%rip), %rax
>
> This is buggy, tr_cr4 is only 4 bytes. And even if tr_cr4 were 8 bytes, the reason
> why nothing showed up in testing is also why only 4 bytes need to be XOR'd: the
> upper 32 bits of the result are never consumed.

Oh. Good catch. Will fix.

tr_cr4 will need to be widened to 8 bytes soonish: FRED uses bit 32 of
CR4, which does not fit in a 4-byte field.
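
To illustrate the width problem, here is a rough C model (not kernel code).
It assumes the FRED enable bit is CR4 bit 32, as noted above. The helper
names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption taken from the note above: FRED is enabled via CR4 bit 32. */
#define CR4_FRED_BIT	32
#define CR4_FRED	(UINT64_C(1) << CR4_FRED_BIT)

/* Hypothetical helper: models stashing CR4 in a 4-byte tr_cr4. */
static uint32_t save_cr4_4bytes(uint64_t cr4)
{
	return (uint32_t)cr4;	/* bits 63:32 are silently dropped */
}

/* Hypothetical helper: models stashing CR4 in an 8-byte tr_cr4. */
static uint64_t save_cr4_8bytes(uint64_t cr4)
{
	return cr4;
}

/* Sanity checks for the two models above. */
static void check_fred_truncation(void)
{
	uint64_t cr4 = CR4_FRED | 0x20;	/* 0x20 is an arbitrary low bit */

	/* The 4-byte copy keeps the low bits but loses bit 32. */
	assert(save_cr4_4bytes(cr4) == 0x20);
	/* The 8-byte copy keeps bit 32. */
	assert((save_cr4_8bytes(cr4) & CR4_FRED) != 0);
}
```

So the 4-byte field is fine as long as only low CR4 bits matter, but it
cannot carry the FRED bit.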

> > + andq $X86_CR4_LA57, %rax
>
> Nit, this can be TEST instead of AND, e.g. I was looking to see if RAX was used
> anywhere in the flow. And in theory it's possible a CPU could support uop fusing
> for TEST+Jcc but not AND+Jcc, cause shaving a cycle in this code is obviously
> super important ;-)
>
> And as above, testl, not testq.

Fair enough. Will use testl.
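
For the record, a rough C model (not kernel code) of why the 32-bit
XOR/TEST pair is enough here: X86_CR4_LA57 is CR4 bit 12, so it lives in
the low half of the register. Function names are made up for this example,
and the model only covers the bit arithmetic, not the 8-byte read from a
4-byte tr_cr4 that the xorq version would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* X86_CR4_LA57 is CR4 bit 12, so it sits in the low 32 bits. */
#define CR4_LA57	(UINT32_C(1) << 12)

/* Models a full-width compare: 64-bit XOR, then mask with LA57. */
static bool la57_differs_64(uint64_t cr4, uint64_t wanted)
{
	return ((cr4 ^ wanted) & CR4_LA57) != 0;
}

/* Models the suggested form: xorl on the low halves, then testl. */
static bool la57_differs_32(uint64_t cr4, uint32_t tr_cr4)
{
	uint32_t diff = (uint32_t)cr4 ^ tr_cr4;

	return (diff & CR4_LA57) != 0;
}

/* Sanity checks: both forms agree, whatever the upper 32 bits hold. */
static void check_la57_compare(void)
{
	uint64_t upper = UINT64_C(0xffffffff) << 32;

	/* LA57 set in CR4 but not in the target: paging mode must change. */
	assert(la57_differs_64(CR4_LA57 | upper, 0));
	assert(la57_differs_32(CR4_LA57 | upper, 0));

	/* LA57 matches: no change needed, upper bits are irrelevant. */
	assert(!la57_differs_64(CR4_LA57 | upper, CR4_LA57));
	assert(!la57_differs_32(CR4_LA57 | upper, CR4_LA57));
}
```

The upper 32 bits of the XOR result never reach the mask, so the 32-bit
form gives the same answer.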

--
Kiryl Shutsemau / Kirill A. Shutemov