Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server

From: Avi Kivity
Date: Tue Sep 22 2009 - 11:57:30 EST


On 09/22/2009 06:25 PM, Ira W. Snyder wrote:

Yes. vbus is more finely layered so there is less code duplication.

The virtio layering was more or less dictated by Xen which doesn't have
shared memory (it uses grant references instead). As a matter of fact
lguest, kvm/pci, and kvm/s390 all have shared memory, as you do, so that
part is duplicated. It's probably possible to add a virtio-shmem.ko
library that people who do have shared memory can reuse.

Seems like a nice benefit of vbus.

Yes, it is. With some work virtio can gain that too (virtio-shmem.ko).

I've given it some thought, and I think that running vhost-net (or
similar) on the ppc boards, with virtio-net on the x86 crate server will
work. The virtio-ring abstraction is almost good enough to work for this
situation, but I had to re-invent it to work with my boards.

I've exposed a 16K region of memory as PCI BAR1 from my ppc board.
Remember that this is the "host" system. I used each 4K block as a
"device descriptor" which contains:

1) the type of device, config space, etc. for virtio
2) the "desc" table (virtio memory descriptors, see virtio-ring)
3) the "avail" table (available entries in the desc table)

Won't access from x86 be slow to this memory (on the other hand, if you
change it to main memory access from ppc will be slow... really depends
on how your system is tuned.

Writes across the bus are fast, reads across the bus are slow. These are
just the descriptor tables for memory buffers, not the physical memory
buffers themselves.

These only need to be written by the guest (x86), and read by the host
(ppc). The host never changes the tables, so we can cache a copy in the
guest, for a fast detach_buf() implementation (see virtio-ring, which
I'm copying the design from).

The only accesses are writes across the PCI bus. There is never a need
to do a read (except for slow-path configuration).

Okay, sounds like what you're doing it optimal then.

In the spirit of "post early and often", I'm making my code available,
that's all. I'm asking anyone interested for some review, before I have
to re-code this for about the fifth time now. I'm trying to avoid
Haskins' situation, where he's invented and debugged a lot of new code,
and then been told to do it completely differently.

Yes, the code I posted is only compile-tested, because quite a lot of
code (kernel and userspace) must be working before anything works at
all. I hate to design the whole thing, then be told that something
fundamental about it is wrong, and have to completely re-write it.

Understood. Best to get a review from Rusty then.

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/