Re: [PATCH 2/2] xen/virtio: Avoid use of the dom0 backend in dom0

From: Roger Pau Monné
Date: Fri Jul 07 2023 - 11:14:27 EST


On Fri, Jul 07, 2023 at 05:01:38PM +0200, Juergen Gross wrote:
> On 07.07.23 16:42, Roger Pau Monné wrote:
> > On Fri, Jul 07, 2023 at 04:10:14PM +0200, Juergen Gross wrote:
> > > On 07.07.23 11:50, Roger Pau Monné wrote:
> > > > On Fri, Jul 07, 2023 at 06:38:48AM +0200, Juergen Gross wrote:
> > > > > On 06.07.23 23:49, Stefano Stabellini wrote:
> > > > > > On Thu, 6 Jul 2023, Roger Pau Monné wrote:
> > > > > > > On Wed, Jul 05, 2023 at 03:41:10PM -0700, Stefano Stabellini wrote:
> > > > > > > > On Wed, 5 Jul 2023, Roger Pau Monné wrote:
> > > > > > > > > On Tue, Jul 04, 2023 at 08:14:59PM +0300, Oleksandr Tyshchenko wrote:
> > > > > > > > > > Part 2 (clarification):
> > > > > > > > > >
> > > > > > > > > > I think using a special config space register in the root complex would
> > > > > > > > > > not be terrible in terms of guest changes because it is easy to
> > > > > > > > > > introduce a new root complex driver in Linux and other OSes. The root
> > > > > > > > > > complex would still be ECAM compatible so the regular ECAM driver would
> > > > > > > > > > still work. A new driver would only be necessary if you want to be able
> > > > > > > > > > to access the special config space register.
> > > > > > > > >
> > > > > > > > > I'm slightly worry of this approach, we end up modifying a root
> > > > > > > > > complex emulation in order to avoid modifying a PCI device emulation
> > > > > > > > > on QEMU, not sure that's a good trade off.
> > > > > > > > >
> > > > > > > > > Note also that different architectures will likely have different root
> > > > > > > > > complex, and so you might need to modify several of them, plus then
> > > > > > > > > arrange the PCI layout correctly in order to have the proper hierarchy
> > > > > > > > > so that devices belonging to different driver domains are assigned to
> > > > > > > > > different bridges.
> > > > > > > >
> > > > > > > > I do think that adding something to the PCI conf register somewhere is
> > > > > > > > the best option because it is not dependent on ACPI and it is not
> > > > > > > > dependent on xenstore both of which are very undesirable.
> > > > > > > >
> > > > > > > > I am not sure where specifically is the best place. These are 3 ideas
> > > > > > > > we came up with:
> > > > > > > > 1. PCI root complex
> > > > > > > > 2. a register on the device itself
> > > > > > > > 3. a new capability of the device
> > > > > > > > 4. add one extra dummy PCI device for the sole purpose of exposing the
> > > > > > > > grants capability
> > > > > > > >
> > > > > > > >
> > > > > > > > Looking at the spec, there is a way to add a vendor-specific capability
> > > > > > > > (cap_vndr = 0x9). Could we use that? It doesn't look like it is used
> > > > > > > > today, Linux doesn't parse it.
> > > > > > >
> > > > > > > I did wonder the same from a quick look at the spec. There's however
> > > > > > > a text in the specification that says:
> > > > > > >
> > > > > > > "The driver SHOULD NOT use the Vendor data capability except for
> > > > > > > debugging and reporting purposes."
> > > > > > >
> > > > > > > So we would at least need to change that because the capability would
> > > > > > > then be used by other purposes different than debugging and reporting.
> > > > > > >
> > > > > > > Seems like a minor adjustment, so might we worth asking upstream about
> > > > > > > their opinion, and to get a conversation started.
> > > > > >
> > > > > > Wait, wouldn't this use-case fall under "reporting" ? It is exactly what
> > > > > > we are doing, right?
> > > > >
> > > > > I'd understand "reporting" as e.g. logging, transferring statistics, ...
> > > > >
> > > > > We'd like to use it for configuration purposes.
> > > >
> > > > I've also read it that way.
> > > >
> > > > > Another idea would be to enhance the virtio IOMMU device to suit our needs:
> > > > > we could add the domid as another virtio IOMMU device capability and (for now)
> > > > > use bypass mode for all "productive" devices.
> > > >
> > > > If we have to start adding capabilties, won't it be easier to just add
> > > > it to the each device instead of adding it to virtio IOMMU. Or is the
> > > > parsing of capabilities device specific, and hence we would have to
> > > > implement such parsing for each device? I would expect some
> > > > capabilities are shared between all devices, and a Xen capability could
> > > > be one of those.
> > >
> > > Have a look at [1], which is describing the common device config layout.
> > > The problem here is that we'd need to add the domid after the queue specific
> > > data, resulting in a mess if further queue fields would be added later.
> > >
> > > We could try that, of course.
> >
> > Right, we must make it part of the standard if we modify
> > virtio_pci_common_cfg, or else newly added fields would overlap the
> > Xen specific one.
> >
> > Would it be possible to signal Xen-grants support in the
> > `device_feature` field, and then expose it from a vendor capability?
> > IOW, would it be possible to add a Xen-specific hook in the parsing of
> > virtio_pci_common_cfg that would then fetch additional data from a
> > capability?
>
> TBH, I don't know. It might require some changes in the central parsing
> logic, but this shouldn't be too hard to do.
>
> > That would likely be less intrusive than adding a new Xen-specific
> > field to virtio_pci_common_cfg while still allowing us to do Xen
> > specific configuration for all VirtIO devices.
>
> In case we want to go that route, this should be in a new "platform config"
> capability, which might be just another form of a vendor capability.

I think telling people that they will need to implement grants-v3 in
order to solve this might be too much. I would rather prefer a more
concrete solution that doesn't have so many loose ends.

Anyway, it's up to the person doing the job, but starting with "you
will have to implement grants-v3" is quite likely to deter anyone from
attempting to solve this I'm afraid.

Thanks, Roger.