Re: Redoing eXclusive Page Frame Ownership (XPFO) with isolated CPUs in mind (for KVM to isolate its guests per CPU)

From: Julian Stecklina
Date: Wed Sep 12 2018 - 11:38:02 EST


Julian Stecklina <jsteckli@xxxxxxxxx> writes:

> Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:
>
>> On Fri, Aug 31, 2018 at 12:45 AM Julian Stecklina <jsteckli@xxxxxxxxx> wrote:
>>>
>>> I've been spending some cycles on the XPFO patch set this week. For the
>>> patch set as it was posted for v4.13, the performance overhead of
>>> compiling a Linux kernel is ~40% on x86_64[1]. The overhead comes almost
>>> completely from TLB flushing. If we can live with stale TLB entries
>>> allowing temporary access (which I think is reasonable), we can remove
>>> all TLB flushing (on x86). This reduces the overhead to 2-3% for
>>> kernel compile.
>>
>> I have to say, even 2-3% for a kernel compile sounds absolutely horrendous.
>
> Well, it's at least in a range where it doesn't look hopeless.
>
>> Kernel builds are 90% user space at least for me, so a 2-3% slowdown
>> from a kernel is not some small unnoticeable thing.
>
> The overhead seems to come from the hooks that XPFO adds to
> alloc/free_pages. These hooks add a couple of atomic operations per
> allocated (4K) page for bookkeeping. Some of these atomic ops are only
> for debugging and could be removed. There is also some opportunity to
> streamline the per-page space overhead of XPFO.

I've updated my XPFO branch[1] to make some of the debugging optional
and also integrated the XPFO bookkeeping with struct page, instead of
requiring CONFIG_PAGE_EXTENSION, which removes some checks in the hot
path. These changes push the overhead down to somewhere between 1.5 and
2% for a kernel compile on my quad-core box. This is close to the
measurement noise, so I'd welcome suggestions for a better benchmark.
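
To make the shape of the hot path concrete, here is a rough userspace
sketch of per-page bookkeeping kept directly in the page descriptor,
with the debug-only atomic compiled out by default. This is not code
from the branch: struct page_sketch, XPFO_SKETCH_USER,
XPFO_SKETCH_DEBUG and both hook names are made up for illustration,
and C11 atomics stand in for the kernel's atomic ops.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct page_sketch {
	atomic_uint xpfo_flags;		/* bookkeeping lives in the descriptor itself */
#ifdef XPFO_SKETCH_DEBUG
	atomic_uint xpfo_allocs;	/* debug-only counter, off by default */
#endif
};

#define XPFO_SKETCH_USER 0x1u	/* assumed flag: page is owned by user space */

/* Allocation hook: at most one atomic op unless debugging is enabled. */
static void xpfo_sketch_alloc_hook(struct page_sketch *pg, bool for_user)
{
	if (for_user)
		atomic_fetch_or_explicit(&pg->xpfo_flags, XPFO_SKETCH_USER,
					 memory_order_relaxed);
#ifdef XPFO_SKETCH_DEBUG
	atomic_fetch_add_explicit(&pg->xpfo_allocs, 1, memory_order_relaxed);
#endif
}

/*
 * Free hook: clear the ownership flag and report whether the page was
 * user-owned, i.e. whether it would have to be mapped back into the
 * kernel's direct map.
 */
static bool xpfo_sketch_free_hook(struct page_sketch *pg)
{
	unsigned int old = atomic_fetch_and_explicit(&pg->xpfo_flags,
						     ~XPFO_SKETCH_USER,
						     memory_order_relaxed);
	return (old & XPFO_SKETCH_USER) != 0;
}
```

Building with -DXPFO_SKETCH_DEBUG adds one more atomic per allocation,
which is the kind of cost that making the debugging optional avoids.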

Of course, if you hit contention on the xpfo spinlock then performance
will suffer. I guess this is what happened on Khalid's large box.

I'll try to remove the spinlocks and add fixup code to the pagefault
handler to see whether this improves the situation on large boxes. This
might turn out to be ugly, though.

Julian

[1] http://git.infradead.org/users/jsteckli/linux-xpfo.git/shortlog/refs/heads/xpfo-master