RE: [PATCH 1/1] x86/hyperv: Disable IBT when hypercall page lacks ENDBR instruction

From: Michael Kelley (LINUX)
Date: Fri Jul 21 2023 - 10:00:49 EST


From: Peter Zijlstra <peterz@xxxxxxxxxxxxx> Sent: Friday, July 21, 2023 12:59 AM
>
> On Fri, Jul 21, 2023 at 12:41:35AM +0000, Michael Kelley (LINUX) wrote:
>
> > > Other than that, this seems fairly straight forward. One thing I
> > > wondered about; wouldn't it be possible to re-write the indirect
> > > hypercall thingies to a direct call? I mean, once we have the hypercall
> > > page mapped, the address is known right?
> >
> > Yes, the address is known. It does not change across things like
> > hibernation. But the indirect call instruction is part of an inline assembly
> > sequence, so the call instructions that need re-writing are scattered
> > throughout the code. There's also the SEV-SNP case from the
> > latest version of Tianyu Lan's patch set [1] where vmmcall may be used
> > instead, based on your recent enhancement for nested ALTERNATIVE.
> > Re-writing seems like that's more complexity than warranted for a
> > mostly interim situation until the Hyper-V patch is available and
> > users install it.
>
> Well, we have a lot of infrastructure for this already. Specifically
> this is very like the paravirt patching.
>
> Also, direct calls are both faster and have less speculation issues, so
> it might still be worth looking at.
>
> The way to do something like this would be:
>
>
> asm volatile (" ANNOTATE_RETPOLINE_SAFE \n\t"
> "1: call *hv_hypercall_page \n\t"
> ".pushsection .hv_call_sites \n\t"
> ".long 1b - . \n\t"
> ".popsection \n\t");
>
>
> And then (see alternative.c for many other examples):
>
>
> patch_hypercalls()
> {
> s32 *s;
>
> for (s = __hv_call_sites_begin; s < __hv_call_sites_end; s++) {
> void *addr = (void *)s + *s;
> struct insn insn;
>
> ret = insn_decode_kernel(&insn, addr);
> if (WARN_ON_ONCE(ret < 0))
> continue;
>
> /*
> * indirect call: ff 15 disp32
> * direct call: 2e e8 disp32
> */
> if (insn.length == 6 &&
> insn.opcode.bytes[0] == 0xFF &&
> X86_MODRM_REG(insn.modrm.bytes[0]) == 2) {
>
> /* verify it was calling hy_hypercall_page */
> if (WARN_ON_ONCE(addr + 6 + insn.displacement.value != &hv_hypercall_page))
> continue;
>
> /*
> * write a CS padded direct call -- assumes the
> * hypercall page is in the 2G immediate range
> * of the kernel text

Probably not true -- the hypercall page has a vmalloc address.

> */
> addr[0] = 0x2e; /* CS prefix */
> addr[1] = CALL_INSN_OPCODE;
> (s32 *)&Addr[2] = *hv_hypercall_page - (addr + 6);
> }
> }
> }
>
>
> See, easy :-)

OK, worth looking into. This is a corner of the Linux kernel code that
I've never looked at before. I appreciate the pointers.

Hypercall sites also exist in loadable modules, so would need to hook
into module_finalize() as well. Processing a new section type looks
straightforward.

But altogether, this feels like more change than should go as a bug
fix to be backported to stable kernels. It's something to look at for a
future kernel release.

Michael