Re: [RFC 2/5] x86: Store a per-cpu shadow copy of CR4

From: Andy Lutomirski
Date: Thu Oct 16 2014 - 11:31:12 EST


On Oct 16, 2014 4:49 AM, "Borislav Petkov" <bp@xxxxxxxxx> wrote:
>
> On Tue, Oct 14, 2014 at 03:57:36PM -0700, Andy Lutomirski wrote:
> > Context switches and TLB flushes can change individual bits of CR4.
> > CR4 reads take several cycles, so store a shadow copy of CR4 in a
> > per-cpu variable.
> >
> > To avoid wasting a cache line, I added the CR4 shadow to
> > cpu_tlbstate, which is already touched during context switches.
>
> So does this even show in any workloads as any improvement?
>

Unclear. Assuming that mov from cr4 is more expensive than a cache
miss (which may or may not be true), then kernel TLB flushes will get
cheaper. The main reason I did this is to make switching TSD and PCE
a little cheaper, which might be worthwhile.
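
For illustration only, here is a rough userspace sketch of the shadow
idea. It is not the patch itself. The simulated register, the counters
and all function names are hypothetical stand-ins, and a single global
takes the place of the per-cpu variable. Only the bit positions of TSD,
PGE and PCE are taken from the architecture.

```c
#include <stdint.h>

/* Architectural CR4 bit positions. */
#define CR4_TSD (1ULL << 2)   /* Time Stamp Disable */
#define CR4_PGE (1ULL << 7)   /* Page Global Enable */
#define CR4_PCE (1ULL << 8)   /* Performance-Monitoring Counter Enable */

/* Simulated hardware register and access counters (hypothetical). */
static uint64_t fake_hw_cr4;
static unsigned long hw_reads;
static unsigned long hw_writes;

/* Stand-in for "mov from cr4": the access the shadow tries to avoid. */
static uint64_t hw_read_cr4(void)
{
    hw_reads++;
    return fake_hw_cr4;
}

/* Stand-in for "mov to cr4". */
static void hw_write_cr4(uint64_t val)
{
    hw_writes++;
    fake_hw_cr4 = val;
}

/* Would be a per-cpu variable in a kernel; one global is assumed here. */
static uint64_t cr4_shadow;

/* Read the register once, e.g. at CPU bring-up, to seed the shadow. */
static void cr4_init_shadow(void)
{
    cr4_shadow = hw_read_cr4();
}

/* Plain memory read; never touches the register. */
static uint64_t cr4_read_shadow(void)
{
    return cr4_shadow;
}

/* Set bits, writing the register only if the value actually changes. */
static void cr4_set_bits(uint64_t mask)
{
    uint64_t cr4 = cr4_shadow;

    if ((cr4 | mask) != cr4) {
        cr4 |= mask;
        cr4_shadow = cr4;
        hw_write_cr4(cr4);
    }
}

/* Clear bits, again skipping the register write when nothing changes. */
static void cr4_clear_bits(uint64_t mask)
{
    uint64_t cr4 = cr4_shadow;

    if ((cr4 & ~mask) != cr4) {
        cr4 &= ~mask;
        cr4_shadow = cr4;
        hw_write_cr4(cr4);
    }
}
```

In this sketch, toggling TSD or PCE costs one memory read of the shadow
plus at most one register write, and no register read. The scheme only
stays correct if every CR4 update goes through the helpers, so that the
shadow and the register never diverge.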

I think this may be a huge win on many workloads when running as an
SVM guest. IIUC SVM doesn't have accelerated guest CR4 access.

> Also, what's the rule with reading the shadow CR4? kvm only? Because
> svm_set_cr4() in svm.c reads the host CR4 too.

Whoops.

>
> Should we make all code access the shadow CR4 maybe...

That was the intent. In v2, I'll probably rename read_cr4 to
__read_cr4 to make it difficult to miss things.

>
> --
> Regards/Gruss,
> Boris.
> --