Re: [RFC PATCH v2] x86/sev: enforce RIP-relative accesses in early SEV/SME code

From: Ard Biesheuvel
Date: Wed Jan 17 2024 - 06:55:44 EST


On Wed, 17 Jan 2024 at 12:39, Andi Kleen <ak@xxxxxxxxxxxxxxx> wrote:
>
> On Wed, Jan 17, 2024 at 11:59:14AM +0100, Ard Biesheuvel wrote:
> > On Mon, 15 Jan 2024 at 21:47, Borislav Petkov <bp@xxxxxxxxx> wrote:
> > >
> > > On Thu, Jan 11, 2024 at 10:36:50PM +0000, Kevin Loughlin wrote:
> > > > SEV/SME code can execute prior to page table fixups for kernel
> > > > relocation. However, as with global variables accessed in
> > > > __startup_64(), the compiler is not required to generate RIP-relative
> > > > accesses for SEV/SME global variables, causing certain flavors of SEV
> > > > hosts and guests built with clang to crash during boot.
> > >
> > > So, about that. If I understand my gcc toolchain folks correctly:
> > >
> > > mcmodel=kernel - everything fits into the high 31 bit of the address
> > > space
> > >
> > > -fPIE/PIC - position independent
> > >
> > > And supplied both don't make a whole lotta of sense: if you're building
> > > position-independent, then mcmodel=kernel would be overridden by the
> > > first.
> > >
> > > I have no clue why clang enabled it...
> > >
> > > So, *actually* the proper fix here should be not to add this "fixed_up"
> > > gunk everywhere but remove mcmodel=kernel from the build and simply do
> > > -fPIE/PIC.
>
> For the SEV file this might not work because it also has functions
> that get called later at runtime, and may need to reference real
> globals. I doubt the linker could resolve that.
>

I don't think that should be a problem. If the code and data are
within -/+ 2G of each other, RIP-relative references should always be
in range.

> For linking the whole kernel, I haven't seen the latest numbers, but
> traditionally -fPIE/PIC cost some performance because globals get loaded
> through the GOT instead of directly as immediates. That's why the original
> x86-64 port went with -mcmodel=kernel.
>

We can tell the compiler to avoid the GOT (using 'hidden' visibility),
and even if we don't, the amd64 psABI now defines linker relaxations
that turn GOT loads into LEA instructions (which still bloat the code
a bit but eliminate the GOT accesses in most cases).