On Mon, Apr 16, 2018 at 10:22 AM, Oleksandr AndrushchenkoProbably the confusion comes from the fact that KVM and Xen
<andr2000@xxxxxxxxx> wrote:
On 04/16/2018 10:43 AM, Daniel Vetter wrote:Why do you even do that? The copying here I mean - why don't you just
On Mon, Apr 16, 2018 at 10:16:31AM +0300, Oleksandr Andrushchenko wrote:Ok, so I'll try to explain:
On 04/13/2018 06:37 PM, Daniel Vetter wrote:Why do you need to copy on every frame for xen-front? In the above
On Wed, Apr 11, 2018 at 08:59:32AM +0300, Oleksandr Andrushchenko wrote:This is the exact use-case I was referring to while saying
On 04/10/2018 08:26 PM, Dongwon Kim wrote:Erm, why do you need importer capability for this use-case?
On Tue, Apr 10, 2018 at 09:37:53AM +0300, Oleksandr AndrushchenkoYes, I saw this. But this limits the use-cases so much.
wrote:
On 04/06/2018 09:57 PM, Dongwon Kim wrote:Another thing is danvet was kind of against to the idea of importing
On Fri, Apr 06, 2018 at 03:36:03PM +0300, Oleksandr Andrushchenko
wrote:
On 04/06/2018 02:57 PM, Gerd Hoffmann wrote:xen z-copy solution is pretty similar fundamentally to hyper_dmabuf
Hi,Thank you for your input, I'm just trying to figure out
Well, udmabuf route isn't fully clear yet, but yes.I fail to see any common ground for xen-zcopy and udmabuf ...Does the above mean you can assume that xen-zcopy and udmabuf
can co-exist as two different solutions?
See also gvt (intel vgpu), where the hypervisor interface is
abstracted
away into a separate kernel modules even though most of the actual
vgpu
emulation code is common.
which of the three z-copy solutions intersect and how much
And what about hyper-dmabuf?
in terms of these core sharing feature:
1. the sharing process - import prime/dmabuf from the producer ->
extract
underlying pages and get those shared -> return references for
shared pages
existing
dmabuf/prime buffer and forward it to the other domain due to
synchronization
issues. He proposed to make hyper_dmabuf only work as an exporter so
that it
can have a full control over the buffer. I think we need to talk about
this
further as well.
For instance, running Android as a Guest (which uses ION to allocate
buffers) means that finally HW composer will import dma-buf into
the DRM driver. Then, in case of xen-front for example, it needs to be
shared with the backend (Host side). Of course, we can change
user-space
to make xen-front allocate the buffers (make it exporter), but what we
try
to avoid is to change user-space which in normal world would have
remain
unchanged otherwise.
So, I do think we have to support this use-case and just have to
understand
the complexity.
guest1 -> ION -> xen-front -> hypervisor -> guest 2 -> xen-zcopy exposes
that dma-buf -> import to the real display hw
No where in this chain do you need xen-zcopy to be able to import a
dma-buf (within linux, it needs to import a bunch of pages from the
hypervisor).
Now if your plan is to use xen-zcopy in the guest1 instead of xen-front,
then you indeed need to import.
we need to import on Guest1 side. If hyper-dmabuf is so
generic that there is no xen-front in the picture, then
it needs to import a dma-buf, so it can be exported at Guest2 side.
But that imo doesn't make sense:xen-zcopy is a helper driver which doesn't handle page flips
- xen-front gives you clearly defined flip events you can forward to the
hypervisor. xen-zcopy would need to add that again.
and is not a KMS driver as one might think of: the DRM UAPI it uses is
just to export a dma-buf as a PRIME buffer, but that's it.
Flipping etc. is done by the backend [1], not xen-zcopy.
Same forAgain, here you are talking of something which is implemented in
hyperdmabuf (and really we're not going to shuffle struct dma_fence
over
the wire in a generic fashion between hypervisor guests).
- xen-front already has the idea of pixel format for the buffer (and any
other metadata). Again, xen-zcopy and hyperdmabuf lack that, would
need
to add it shoehorned in somehow.
Xen display backend, not xen-zcopy, e.g. display backend can
implement para-virtual display w/o xen-zcopy at all, but in this case
there is a memory copying for each frame. With the help of xen-zcopy
the backend feeds xen-front's buffers directly into Guest2 DRM/KMS or
Weston or whatever as xen-zcopy exports remote buffers as PRIME buffers,
thus no buffer copying is required.
pipeline, using xen-front I see 0 architectural reasons to have a copy
anywhere.
This seems to be the core of the confusion we're having here.
1. xen-front - produces a display buffer to be shown at Guest2
by the backend, shares its grant references with the backend
2. xen-front sends page flip event to the backend specifying the
buffer in question
3. Backend takes the shared buffer (which is only a buffer mapped into
backend's memory, it is not a dma-buf/PRIME one) and makes memcpy from
it to a local dumb/surface
directly scan out from the grant references you received through the
hypervisor?
Also I'm not clear in your example which step happens where (guest 1/2Steps 1,2 - Guest2, kernel space
or hypervisor)?
xen-front has no backend code, but only has code which allows it4. Backend flips that local dumb buffer/surfaceSo this is entirely a means to implement the virtual xen device in
If I have a xen-zcopy helper driver then I can avoid doing step 3):
1) 2) remain the same as above
3) Initially for a new display buffer, backend calls xen-zcopy to create
a local PRIME buffer from the grant references provided by the xen-front
via displif protocol [1]: we now have handle_zcopy
4) Backend exports this PRIME with HANDLE_TO_FD from xen-zcopy and imports
it into Weston-KMS/DRM or real HW DRM driver with FD_TO_HANDLE: we now have
handle_local
5) On page flip event backend flips local PRIME: uses handle_local for flips
xen-front provides an array of grant references to Guest2 (backend).Under the assumption that you can make xen-front to zerocopy for theOfc you won't be able to shovel sound or media stream data over toSo, we now know that xen-zcopy was not meant to handle page flips,
another
guest like this, but that's what you have xen-v4l and xen-sound or
whatever else for. Trying to make a new uapi, which means userspace must
be changed for all the different use-case, instead of reusing standard
linux driver uapi (which just happens to send the data to another
hypervisor guest instead of real hw) imo just doesn't make much sense.
Also, at least for the gpu subsystem: Any new uapi must have full
userspace available for it, see:
https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements
Adding more uapi is definitely the most painful way to fix a use-case.
Personally I'd go as far and also change the xen-zcopy side on the
receiving guest to use some standard linux uapi. E.g. you could write an
output v4l driver to receive the frames from guest1.
but to implement new UAPI to let user-space create buffers either
from Guest2 grant references (so it can be exported to Guest1) or
other way round, e.g. create (from Guest1 grant references to export to
Guest 2). For that reason it adds 2 IOCTLs: create buffer from grefs
or produce grefs for the buffer given.
One additional IOCTL is to wait for the buffer to be released by
Guest2 user-space.
That being said, I don't quite see how v4l can be used here to implement
UAPI I need.
kernel->hypervisor path, v4l could be made to work for the
hypervisor->kernel side of the pipeline.
But it sounds like we have a confusion already on why or why not xen-front
can or cannot do zerocopy.
It's up to backend what it does with those grant references
which at Guest2 side are not PRIME or dma-buf, but just a set of pages.
This is xen-zcopy which turns these pages into a PRIME. When this is done
backend can now tell DRM drivers to use the buffer in DRM terms.
This can be done, but again, I will need some entity whichIf you need an import-export helper library, just merge it. Do not attachI was just referring to accounting stuff already implemented in the DRMHm, not sure what tracking you refer to here all ... I got lost in alldanvet, can you comment on this topic?The synchronization is done with displif protocol [1]
Yeah, understood xen-zcopy works ok with your use case. But I am just2. the page sharing mechanism - it uses Xen-grant-table.Well, this is true. And at the same time this is just a matter
And to give you a quick summary of differences as far as I
understand
between two implementations (please correct me if I am wrong,
Oleksandr.)
1. xen-zcopy is DRM specific - can import only DRM prime buffer
while hyper_dmabuf can export any dmabuf regardless of originator
of extending the API: xen-zcopy is a helper driver designed for
xen-front/back use-case, so this is why it only has DRM PRIME API
2. xen-zcopy doesn't seem to have dma-buf synchronization betweenThis is true. Again, this is because of the use-cases it covers.
two VMs
while (as danvet called it as remote dmabuf api sharing)
hyper_dmabuf sends
out synchronization message to the exporting VM for synchronization.
But having synchronization for a generic solution seems to be a good
idea.
curious
if it is ok not to have any inter-domain synchronization in this
sharing model.
The buffer being shared is technically dma-buf and originator needs toAs I am working in DRM terms the tracking is done by the DRM core
be able
to keep track of it.
for me for free. (This might be one of the reasons Daniel sees DRM
based implementation fit very good from code-reuse POV).
the
replies while catching up.
core,
so I don't have to worry about doing the same for buffers to understand
when they are released etc.
I do understand that adding new UAPI is not good for many reasons.Again, why do you want to create new uapi for this? Given the veryNot to mislead you: one grant ref is passed via displif protocol,You are right. Thanks for correction.;)3. 1-level references - when using grant-table for sharing pages,To be precise, grant ref is 4 bytes
there will
be same # of refs (each 8 byte)
ok, good to know. I will look into its implementation in more detailsas # of shared pages, which is passed toThe reason for that is that xen-zcopy is a helper driver, e.g.
the userspace to be shared with importing VM in case of xen-zcopy.
the grant references come from the display backend [1], which
implements
Xen display protocol [2]. So, effectively the backend extracts
references
from frontend's requests and passes those to xen-zcopy as an array
of refs.
ComparedIn the protocol [2] only one reference to the gref directory is
to this, hyper_dmabuf does multiple level addressing to generate
only one
reference id that represents all shared pages.
passed
between VMs
(and the gref directory is a single-linked list of shared pages
containing
all
of the grefs of the buffer).
but is
this gref directory (chained grefs) something that can be used for any
general
memory sharing use case or is it jsut for xen-display (in current code
base)?
but the page it's referencing contains the rest of the grant refs.
As to if this can be used for any memory: yes. It is the same for
sndif and displif Xen protocols, but defined twice as strictly speaking
sndif and displif are two separate protocols.
While reviewing your RFC v2 one of the comments I had [2] was that if
we
can start from defining such a generic protocol for hyper-dmabuf.
It can be a header file, which not only has the description part
(which then become a part of Documentation/...rst file), but also
defines
all the required constants for requests, responses, defines message
formats,
state diagrams etc. all at one place. Of course this protocol must not
be
Xen specific, but be OS/hypervisor agnostic.
Having that will trigger a new round of discussion, so we have it all
designed
and discussed before we start implementing.
Besides the protocol we have to design UAPI part as well and make sure
the hyper-dmabuf is not only accessible from user-space, but there will
be
number
of kernel-space users as well.
strict
requirements we have for new uapi (see above link), it's the toughest
way
to get any kind of support in.
But here I was meaning that current hyper-dmabuf design is
only user-space oriented, e.g. it provides number of IOCTLs to do all
the work. But I need a way to access the same from the kernel, so, for
example,
some other para-virtual driver can export/import dma-buf, not only
user-space.
any uapi to it, just the internal helpers.
Much, much, much easier to land.
backend may use to convert xen-front's grant references into
a PRIME buffer, hence there is UAPI for that. In other words,
I'll need a thiner xen-zcopy which will implement the same UAPI
and use that library for Xen related stuff.
The confusion may also come from the fact that the backend is
a user-space application, not a kernel module (we have 2 modes
of its operation as of now: DRM master or Weston client), so
it needs a way to talk to the kernel.
dom0 (or whichever guest implements it)?
I'm externally confused about what you mean with "backend", since
xen-front also has backend code. But that backend code lives in the
same guest os image (afaict at least), since it does direct function
calls.
Please be more specific in what you mean instead of just "backend",Hope [2] better explains this
that's really confusing.
The problem here is that the display backend then will need
But essentially we're talking about the equivalent of what qemu does
for kvm, and that's entirely not my problem. Not really a gpu
subsystem problem I think. Just talk with the xen hypervisor people
about how exactly they want to go about converting grant tables to
dma-buf, so that your virtual hw backend in userspace can make use of
it.
And then merge it somewhere in the xen directories. Since theHope the explanation above makes it all clearer.
grant tables and everything is very xen specific, I don't think
there's much point in trying to have a fake generic uapi that pretends
to work on other hypervisors, as long as they're Xen :-)
And you probably have no need for all the caching/general book-keeping
drm_prime does (it's all in userspace I guess, except for the magic
conversion from grant references to a dma_buf). So there's no point
trying to reuse code in drm_prime.c.
Also, this should make it tons easier to reuse xen-zcopy for
sound/wireless/v4l backends.
Ok, that explains. I thought your current xen-side implementation forYes, the user-space backend receives those from xen-front via [1]So these grant references, are those userspace visible things?That's why I had essentially zero big questions for xen-front (exceptOn my side there are only 2 use-cases, Guest2 only:
some
implementation improvements, and stuff to make sure xen-front actually
implements the real uapi semantics instead of its own), and why I'm
asking
much more questions on this stuff here.
Imo code reuse is overrated. Adding new uapi is what freaks me out hereWell, I would love we reuse the code that I have, but I alsoI think it is possible to use your functions for memory sharing part4. inter VM messaging (hype_dmabuf only) - hyper_dmabuf has inter-vmThis is true, xen-zcopy has no means for inter VM sync and meta-data,
msg
communication defined for dmabuf synchronization and private data
(meta
info that Matt Roper mentioned) exchange.
simply because it doesn't have any code for inter VM exchange in it,
e.g. the inter VM protocol is handled by the backend [1].
5. driver-to-driver notification (hyper_dmabuf only) - importing VMAgain, xen-zcopy is decoupled from inter VM communication
gets
notified when newdmabuf is exported from other VM - uevent can be
optionally
generated when this happens.
6. structure - hyper_dmabuf is targetting to provide a generic
solution for
inter-domain dmabuf sharing for most hypervisors, which is why it
has two
layers as mattrope mentioned, front-end that contains standard API
and backend
that is specific to hypervisor.
Thank you, I'll have a lookwe started with simple dma-buf sharing but realized there are manyNo idea, didn't look at it in detail.
Looks pretty complex from a distant view. Maybe because it tries
to
build a communication framework using dma-bufs instead of a simple
dma-buf passing mechanism.
things we need to consider in real use-case, so we added
communication
, notification and dma-buf synchronization then re-structured it to
front-end and back-end (this made things more compicated..) since
Xen
was not our only target. Also, we thought passing the reference for
the
buffer (hyper_dmabuf_id) is not secure so added uvent mechanism
later.
Yes, I am looking at it now, trying to figure out the full storyOne example is actually in github. If you want take a look at it,
and its implementation. BTW, Intel guys were about to share some
test application for hyper-dmabuf, maybe I have missed one.
It could probably better explain the use-cases and the complexity
they have in hyper-dmabuf.
please
visit:
https://github.com/downor/linux_hyper_dmabuf_test/tree/xen/simple_export
If you think of xen-zcopy as a library (which implements XenI think we can definitely collaborate. Especially, maybe we areLike xen-zcopy it seems to depend on the idea that the hypervisorSo, for xen-zcopy we were not trying to make it generic,
manages all memory it is easy for guests to share pages with the
help of
the hypervisor.
it just solves display (dumb) zero-copying use-cases for Xen.
We implemented it as a DRM helper driver because we can't see any
other use-cases as of now.
For example, we also have Xen para-virtualized sound driver, but
its buffer memory usage is not comparable to what display wants
and it works somewhat differently (e.g. there is no "frame done"
event, so one can't tell when the sound buffer can be "flipped").
At the same time, we do not use virtio-gpu, so this could probably
be one more candidate for shared dma-bufs some day.
Which simply isn't the case on kvm.Hm, I can imagine that: xen-zcopy could be a library code for
hyper-dmabuf and xen-zcopy could maybe share code, or hyper-dmabuf
build
on top of xen-zcopy.
hyper-dmabuf
in terms of implementing all that page sharing fun in multiple
directions,
e.g. Host->Guest, Guest->Host, Guest<->Guest.
But I'll let Matt and Dongwon to comment on that.
using some
outdated sharing mechanism/grant-table mechanism in our Xen backend
(thanks
for bringing that up Oleksandr). However, the question is once we
collaborate
somehow, can xen-zcopy's usecase use the standard API that
hyper_dmabuf
provides? I don't think we need different IOCTLs that do the same in
the final
solution.
grant references mangling) and DRM PRIME wrapper on top of that
library, we can probably define proper API for that library,
so both xen-zcopy and hyper-dmabuf can use it. What is more, I am
about to start upstreaming Xen para-virtualized sound device driver
soon,
which also uses similar code and gref passing mechanism [3].
(Actually, I was about to upstream drm/xen-front, drm/xen-zcopy and
snd/xen-front and then propose a Xen helper library for sharing big
buffers,
so common code of the above drivers can use the same code w/o code
duplication)
in
hyper_dmabuf's backend (this 'backend' means the layer that does page
sharing
and inter-vm communication with xen-specific way.), so why don't we
work on
"Xen helper library for sharing big buffers" first while we continue
our
discussion on the common API layer that can cover any dmabuf sharing
cases.
understand that it was limited by my use-cases. So, I do not
insist we have to ;)
If we start designing and discussing hyper-dmabuf protocol we of course
can work on this helper library in parallel.
:-)
If we end up with duplicated implementations, even in upstream, meh, not
great, but also ok. New uapi, and in a similar way, new hypervisor api
like the dma-buf forwarding that hyperdmabuf does is the kind of thing
that will lock us in for 10+ years (if we make a mistake).
Please don't start with the protocol. Instead start with the concreteSo, if you can start designing the protocol we may have a dedicatedThank you,Either way is fine with me.
Oleksandr
P.S. All, is it a good idea to move this out of udmabuf thread into a
dedicated one?
thread for that. I will try to help with the protocol as much as I can
use-cases, and then figure out why exactly you need new uapi. Once we
have
that answered, we can start thinking about fleshing out the details.
1. Create a PRIME (dma-buf) from grant references
2. Create grant references from PRIME (dma-buf)
I thoughtSo, I can map the grant references from user-space, but I won't
the grant references was just the kernel/hypervisor internal magic to make
this all work?
be able to turn those into a PRIME buffer. So, the only use of those
w/o xen-zcopy is to map grant refs and copy into real HW dumb on every page
flip.
xen-front is already making all that stuff happen. But I'm still not
sure given all the confusing talk about back-end we have in these
threads (hyperdmabuf people also talked about different backends for
different hypervisors, I guess that's a different kind of backend?).
-Daniel[1] https://wiki.xen.org/wiki/Paravirtualization_(PV)