Re: [RFC PATCH 0/3] Weight-balanced binary tree + KVM growable memoryslots using wbtree

From: Avi Kivity
Date: Thu Feb 24 2011 - 05:05:03 EST


On 02/23/2011 08:06 PM, Alex Williamson wrote:
On Wed, 2011-02-23 at 15:12 +0200, Avi Kivity wrote:
> On 02/22/2011 08:54 PM, Alex Williamson wrote:
> > This series introduces a new weight-balanced binary tree (wbtree) for
> > general use. It's largely leveraged from the rbtree, copying it's
> > rotate functions, while introducing different rebalance and erase
> > functions. This tree is particularly useful for managing memory
> > ranges, where it's desirable to have the most likely targets (the
> > largest ranges) at the top of each subtree.
> >
> > Patches 2& 3 go on to convert the KVM memory slots to a growable
> > array and make use of wbtree for efficient managment. Trying to
> > exercise the worst case for this data structure, I ran netperf
> > TCP_RR on an emulated rtl8139 NIC connected directly to the host
> > via a tap. Both qemu-kvm and the netserver on the host were
> > pinned to optimal CPUs with taskset. This series resulted in
> > a 3% improvement for this test.
> >
>
> In this case, I think most of the faults (at least after the guest was
> warmed up) missed the tree completely.

Except for the mmio faults for the NIC, which will traverse the entire
depth of that branch of the tree for every access.

That is exactly what I meant: most of the faults cause the search to fail. What we want here is to cache the success/fail result of the search so we don't do it in the first place.

> In this case a weight balanced
> tree is hardly optimal (it is optimized for hits), so I think you'll see
> a bigger gain from the mmio fault optimization. You'll probably see
> most of the gain running mmu intensive tests with ept=0.

Right, the gain expected by this test is that we're only traversing 6-7
tree nodes until we don't find a match, versus the full 32 entries of
the original memslot array. So it's effectively comparing worst case
scenarios for both data structures.

If we optimized the linear list we'd sort it by size, descending, and limit it by the number of instantiated slots, which I think would beat the tree.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/