>> This is for things like the setup of queue-pairs, and the transport of
>> door-bells, and ib-verbs.  I am not on the team doing that work, so I am
>> not an expert in this area.  What I do know is having a flexible and
>> low-latency signal-path was deemed a key requirement.
>
> That's not a full bypass, then.  AFAIK kernel bypass has userspace
> talking directly to the device.

Like I said, I am not an expert on the details here.  I only work on the
vbus plumbing.  FWIW, the work is derivative from the "Xen-IB" project:

http://www.openib.org/archives/nov2006sc/xen-ib-presentation.pdf

There were issues with getting Xen-IB to map well into the Xen model.
Vbus was specifically designed to address some of those short-comings.
> This is best done using cr8/tpr so you don't have to exit at all.  See
> also my vtpr support for Windows which does this in software, generally
> avoiding the exit even when lowering priority.

You can think of vTPR as a good model, yes.  Generally, you can't
actually use it for our purposes for several reasons, however:
1) the prio granularity is too coarse (16 levels, -rt has 100)
2) it is too scope limited (it covers only interrupts, we need to have
additional considerations, like nested guest/host scheduling algorithms
against the vcpu, and prio-remap policies)
3) I use "priority" generally...there may be other non-priority based
policies that need to add state to the table (such as EDF deadlines, etc.)
but, otherwise, the idea is the same. Besides, this was one example.
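As a sketch of point (1) only (the function name and the linear scaling are mine for illustration, not vbus code): mapping the -rt range of 100 priorities onto 16 vTPR levels necessarily collapses six or seven distinct guest priorities into each hardware level, so the guest scheduler loses most of its ordering information:

```c
#define VTPR_LEVELS 16   /* hardware TPR granularity       */
#define RT_PRIOS    100  /* -rt kernel priority range 0-99 */

/* Hypothetical remap: scale a -rt priority down to a vTPR class.
 * Note e.g. that rt_prio 50 and 55 both land on level 8. */
int prio_to_vtpr(int rt_prio)
{
        return rt_prio * VTPR_LEVELS / RT_PRIOS;
}
```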
This is where the really fast call() type mechanism is important.
It's also about having the priority flow end-to-end, and having the vcpu
interrupt state affect the task-priority, etc (e.g. pending interrupts
affect the vcpu task prio).
etc, etc.

I can go on and on (as you know ;), but will wait till this work is more
concrete and proven.

> Generally cpu state shouldn't flow through a device but rather through
> MSRs, hypercalls, and cpu registers.
Well, you can blame yourself for that one ;)
The original vbus was implemented as cpuid+hypercalls, partly for that
reason.  You kicked me out of kvm.ko, so I had to make do with plan B
via a less direct PCI-BRIDGE route.
But in reality, it doesn't matter much. You can certainly have "system"
devices sitting on vbus that fit a similar role as "MSRs", so the access
method is more of an implementation detail. The key is it needs to be
fast, and optimize out extraneous exits when possible.
> Well, do you plan to address this before submission for inclusion?

Maybe, maybe not.  It's workable for now (i.e. run as root), so its
inclusion is not predicated on the availability of the fix, per se (at
least IMHO). If I can get it working before I get to pushing the core,
great! Patches welcome.
For the time being, windows will not be RT, and windows can fall-back to
use virtio-net, etc. So I am ok with this. It will come in due time.
>> The point is: the things we build on top have costs associated with
>> them, and I aim to minimize it.  For instance, to do a "call()" kind of
>> interface, you generally need to pre-setup some per-cpu mappings so that
>> you can just do a single iowrite32() to kick the call off.  Those
>> per-cpu mappings have a cost if you want them to be high-performance, so
>> my argument is that you ideally want to limit the number of times you
>> have to do this.  My current design reduces this to "once".
>
> Do you mean minimizing the setup cost?  Seriously?

Not the time-to-complete-setup overhead.  The residual costs, like
heap/vmap usage at run-time. You generally have to set up per-cpu
mappings to gain maximum performance. You would need it per-device, I
do it per-system.  It's not a big deal in the grand scheme of things,
really. But chalk that up as an advantage to my approach over yours,
nonetheless.
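To make the trade-off concrete, here is a minimal user-space sketch (all names are hypothetical, and iowrite32() is stubbed with a plain store) of the pattern described above: pay the per-cpu mapping cost once at setup, so the hot-path call() is a single 32-bit register write:

```c
#include <stdint.h>

#define NR_CPUS 4

/* Stand-in for device doorbell registers; in a real guest these would be
 * ioremap()'d MMIO pages and iowrite32() would be the real accessor. */
uint32_t doorbell[NR_CPUS];
uint32_t *percpu_doorbell[NR_CPUS];

void iowrite32_stub(uint32_t val, uint32_t *addr)
{
        *addr = val;  /* in a real guest, this MMIO write triggers the exit */
}

/* One-time setup cost: establish the per-cpu doorbell mappings once
 * for the whole system, rather than once per device. */
void setup_mappings(void)
{
        for (int cpu = 0; cpu < NR_CPUS; cpu++)
                percpu_doorbell[cpu] = &doorbell[cpu];
}

/* Hot path: a call() is a single store, with no locking or re-mapping. */
void vbus_call(int cpu, uint32_t msg)
{
        iowrite32_stub(msg, percpu_doorbell[cpu]);
}
```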
> I guess it isn't that important then.  I note that clever prioritization
> in a guest is pointless if you can't do the same prioritization in the
> host.

I answer this below...
The point is that I am eliminating as many exits as possible, so 1us,
2us, whatever...it doesn't matter. The fastest exit is the one you
don't have to take.
> IIRC we reuse the PCI IDs for non-PCI.
You already know how I feel about this gem.
> I'm not okay with it.  If you wish people to adopt vbus over virtio
> you'll have to address all concerns, not just yours.

By building a community around the development of vbus, isn't this what I
am doing?  Working towards making it usable for all?
>>> ...and multiqueue out of your design.
>>
>> AFAICT, multiqueue should work quite nicely with vbus.  Can you
>> elaborate on where you see the problem?
>
> You said you aren't interested in it previously IIRC.

I don't think so, no.  Perhaps I misspoke or was misunderstood.  I
actually think it's a good idea and will be looking to do this.
> I agree that it isn't very clever (not that I am a real time expert) but
> I disagree about dismissing Linux support so easily.  If prioritization
> is such a win it should be a win on the host as well and we should make
> it work on the host as well.  Further I don't see how priorities on the
> guest can work if they don't on the host.

It's more about task priority in the case of real-time.  We do stuff with
802.1p as well for control messages, etc.  But for the most part, this
is an orthogonal effort. And yes, you are right, it would be nice to
have this interrupt classification capability in the host.
Generally this is mitigated by the use of irq-threads. You could argue
that if irq-threads help the host without a prioritized interrupt
controller, why can't the guest?  The answer is simply that the host can
afford sub-optimal behavior w.r.t. IDT injection here, where the guest
cannot (due to the disparity of hw-injection vs guest-injection overheads).
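A toy illustration of the irq-thread point (assuming nothing about the actual kvm or -rt code): once handlers run in schedulable threads, prioritization becomes a property of the scheduler rather than of the interrupt controller, so pending work is serviced in handler-priority order instead of injection order:

```c
/* Toy model: each pending irq carries the priority of its handler
 * thread, and the scheduler, not the interrupt controller, decides
 * that the highest-priority handler runs first. */
int pick_next_irq(const int *irqs, const int *thread_prio, int n)
{
        int best = 0;
        for (int i = 1; i < n; i++)
                if (thread_prio[i] > thread_prio[best])
                        best = i;
        return irqs[best];
}
```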
> They had to write 414 lines in drivers/s390/kvm/kvm_virtio.c and
> something similar for lguest.

Well, then I retract that statement.  I think the small amount of code
is probably because they are re-using the qemu device-models, however.
Note that I am essentially advocating the same basic idea here.
> I don't see what vbus adds to virtio-net.

Well, as you stated in your last reply, you don't want it.  So I guess
that doesn't matter much at this point. I will continue developing
vbus, and pushing things your way. You can opt to accept or reject
those things at your own discretion.