Re: [PATCH] virtio-ring: Use threshold for switching to indirect descriptors

From: Michael S. Tsirkin
Date: Sun Dec 04 2011 - 10:13:54 EST


On Sun, Dec 04, 2011 at 01:06:37PM +0200, Michael S. Tsirkin wrote:
> On Sat, Dec 03, 2011 at 01:50:28PM +0200, Sasha Levin wrote:
> > On Fri, 2011-12-02 at 11:16 +1030, Rusty Russell wrote:
> > > On Thu, 1 Dec 2011 12:26:42 +0200, "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote:
> > > > On Thu, Dec 01, 2011 at 10:09:37AM +0200, Sasha Levin wrote:
> > > > > On Thu, 2011-12-01 at 09:58 +0200, Michael S. Tsirkin wrote:
> > > > > > We'll presumably need some logic to increment it back,
> > > > > > to account for random workload changes.
> > > > > > Something like slow start?
> > > > >
> > > > > We can increment it each time the queue is less than 10% full; that
> > > > > should act like slow start, no?
> > > >
> > > > No, we really shouldn't get an empty ring as long as things behave
> > > > well. What I meant is something like:
> > >
> > > I was thinking of the network output case, but you're right. We need to
> > > distinguish between usually full (eg. virtio-net input) and usually
> > > empty (eg. virtio-net output).
> > >
> > > The signal for "we need to pack more into the ring" is different. We could
> > > use some hacky heuristic like "out == 0" but I'd rather make it explicit
> > > when we set up the virtqueue.
> > >
> > > Our other alternative, moving the logic to the driver, is worse.
> > >
> > > As to fading the effect over time, that's harder. We have to deplete
> > > the ring quite a few times before it turns into always-indirect. We
> > > could back off every time the ring is totally idle, but that may hurt
> > > bursty traffic. Let's try simple first?
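
The increment/decrement part could be as dumb as the sketch below. Completely
untested, and indirect_thresh / bufs_since_relax are made-up fields on struct
vring_virtqueue, assuming the patch goes indirect whenever a request needs
more than indirect_thresh descriptors:

	/* Untested sketch: called from virtqueue_add_buf().  Go indirect
	 * sooner once the ring has been depleted, and drift back towards
	 * direct descriptors while it stays mostly empty. */
	static void adjust_indirect_thresh(struct vring_virtqueue *vq)
	{
		if (vq->num_free == 0) {
			/* Ring just filled up: pack harder next time. */
			if (vq->indirect_thresh > 1)
				vq->indirect_thresh--;
			vq->bufs_since_relax = 0;
		} else if (vq->num_free > vq->vring.num / 2 &&
			   ++vq->bufs_since_relax > vq->vring.num) {
			/* Half-empty for a whole ring's worth of adds:
			 * back off towards direct descriptors again. */
			if (vq->indirect_thresh < vq->vring.num)
				vq->indirect_thresh++;
			vq->bufs_since_relax = 0;
		}
	}
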
> >
> > I took a different approach and tried putting the indirect descriptors in
> > a kmem_cache as Michael suggested. The benchmarks showed that this way
> > virtio-net actually worked faster with indirect on, even in a single stream.
> >
> > Maybe we can do that instead of playing with the threshold for now.
> >
> > The question here is how much wasted space we can afford: since the indirect
> > descriptor blocks would all have to be the same size, we'd have a bunch of
> > descriptors wasted in the cache. Of course we can make that configurable,
> > but how much is OK by default?
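
FWIW the allocation side would presumably look something like the sketch
below; indirect_cache and max_indirect are names I'm making up here, and
whether the cache ends up per-vq or per-device is exactly the open question:

	/* Untested sketch: at virtqueue setup, create one cache of
	 * fixed-size indirect blocks.  max_indirect would be a per-device
	 * choice, e.g. MAX_SKB_FRAGS + 2 for virtio-net; a request using
	 * fewer descriptors wastes (max_indirect - used) * 16 bytes. */
	vq->indirect_cache = kmem_cache_create("vring-indirect",
			max_indirect * sizeof(struct vring_desc),
			0, 0, NULL);

	/* ... and in vring_add_indirect(), instead of kmalloc'ing the
	 * exact (out + in) * sizeof(struct vring_desc): */
	desc = kmem_cache_alloc(vq->indirect_cache, gfp);
	if (!desc)
		return -ENOMEM;
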
>
> I think it's a good idea to make that per-device.
> For network at least, each skb already has an overhead of
> around 1/2K, so using up to 1/2K more seems acceptable.
> But even if we went up to MAX_SKB_FRAGS+2, it would be
> only 1K per ring entry,

I got this wrong - a descriptor is 16 bytes, so MAX_SKB_FRAGS+2
descriptors would be roughly 300 bytes of overhead per packet.
That's not a lot.
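
For reference, that's the descriptor layout from include/linux/virtio_ring.h,
8 + 4 + 2 + 2 bytes:

	struct vring_desc {
		__u64 addr;	/* Address (guest-physical). */
		__u32 len;	/* Length. */
		__u16 flags;	/* NEXT / WRITE / INDIRECT flags. */
		__u16 next;	/* Index of the next descriptor in the chain. */
	};
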

> so for a ring of 256 entries, we end up with
> 256K max waste. That's not that terrible.
>
> But I'd say let's do some benchmarking to figure out
> the point where the gains are becoming very small.


> > --
> >
> > Sasha.