Re: x86 TLB flushing: INVPCID vs. deferred CR3 write

From: Ingo Molnar
Date: Wed Dec 06 2017 - 12:33:25 EST



* Dave Hansen <dave.hansen@xxxxxxxxx> wrote:

> tl;dr: Kernels with pagetable isolation using INVPCID compile kernels
> 0.58% faster than using the deferred CR3 write. This tends to say that
> we should leave things as-is and keep using INVPCID, but it's far from
> definitive.

Agreed, thanks for the detailed testing!

> If folks have better ideas for a test methodology, or specific workloads or
> hardware where you want to see this tested, please speak up.

I had a look at the numbers and it all looks valid and good to me too - it's also
the intuitive result IMHO.

I suspect there might be synthetic cache-hot workloads where the +330 cycles cost
of INVPCID is higher than the extra TLB miss cost of a CR3 flush - but we do know
that this offset is constant, while the cost of flushing the whole TLB will keep
increasing as TLBs grow in future hardware.
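
To put the trade-off in back-of-the-envelope terms, here is a rough sketch of
the break-even point. Only the +330 cycles figure comes from this thread; the
per-miss cost and the number of entries that have to be refilled are made-up
inputs, and the linear model (refill cost = entries * miss cost) is an
assumption, not a measurement:

```python
import math

# Extra cost of INVPCID relative to the deferred CR3 write, as quoted in this thread.
INVPCID_EXTRA_CYCLES = 330


def cr3_flush_refill_cycles(entries_refilled, miss_cycles):
    """Assumed linear model: every useful TLB entry lost to the full flush
    costs one TLB miss to bring back."""
    return entries_refilled * miss_cycles


def break_even_entries(miss_cycles, invpcid_extra=INVPCID_EXTRA_CYCLES):
    """Smallest number of refilled entries at which the full flush costs
    at least as much as the constant INVPCID overhead."""
    return math.ceil(invpcid_extra / miss_cycles)


def cheaper_option(entries_refilled, miss_cycles):
    """Return which approach the model favours; ties go to the CR3 flush."""
    refill = cr3_flush_refill_cycles(entries_refilled, miss_cycles)
    return "INVPCID" if refill > INVPCID_EXTRA_CYCLES else "CR3 flush"


if __name__ == "__main__":
    # Hypothetical 30-cycle miss cost; vary the number of entries refilled.
    for entries in (4, 16, 64, 1536):
        print(entries, cheaper_option(entries, miss_cycles=30))
```

With these made-up inputs the CR3 flush only wins while very few entries need
to be refilled, which would be the synthetic cache-hot case; the constant
INVPCID offset looks better as the number of live TLB entries goes up.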

Thanks,

Ingo