Re: [RFC PATCH v2 1/1] platform-msi: Add platform check for subdevice irq domain

From: Lu Baolu
Date: Tue Jan 12 2021 - 00:27:09 EST


Hi,

On 1/7/21 3:16 PM, Leon Romanovsky wrote:
On Thu, Jan 07, 2021 at 06:55:16AM +0000, Tian, Kevin wrote:
From: Leon Romanovsky <leon@xxxxxxxxxx>
Sent: Thursday, January 7, 2021 2:09 PM

On Thu, Jan 07, 2021 at 02:04:29AM +0000, Tian, Kevin wrote:
From: Leon Romanovsky <leon@xxxxxxxxxx>
Sent: Thursday, January 7, 2021 12:02 AM

On Wed, Jan 06, 2021 at 11:23:39AM -0400, Jason Gunthorpe wrote:
On Wed, Jan 06, 2021 at 12:40:17PM +0200, Leon Romanovsky wrote:

I asked what you will do when QEMU gains the needed functionality.
Will you remove QEMU from this list? If yes, how will such a "new" kernel
work on old QEMU versions?

The needed functionality is some VMM hypercall, so presumably new
kernels that support calling this hypercall will be able to discover
if the VMM hypercall exists and, if so, supersede this entire check.

Let's not speculate. Do we have a well-known path?
Will such a patch be taken to stable@/distros?


There are two functions introduced in this patch. One detects whether we are
running on bare metal or in a virtual machine. The other decides
whether the platform supports IMS. Currently the two are identical because
IMS is supported only on bare metal at this stage. In the future it will
look like below, when IMS can be enabled in a VM:

bool arch_support_pci_device_ims(struct pci_dev *pdev)
{
	return on_bare_metal() || hypercall_irq_domain_supported();
}

The VMM vendor list is for on_bare_metal(), and a vendor is supposed
never to be removed once added to the list, since the fact of running
in a VM never changes, regardless of whether this hypervisor supports
extra VMM hypercalls.

This is what I imagined: this list will be there forever, and this worries me.

I don't know if it is true or not, but I guess that at least Oracle and
Microsoft bare metal devices and VMs will have the same DMI_SYS_VENDOR.

It's true. David Woodhouse also said it's the case for Amazon EC2 instances.


It means that this on_bare_metal() function won't work reliably in many
cases. Also, being part of include/linux/msi.h, at some point
this function will be picked up by outside users for non-IMS cases.

I didn't even mention custom forks of QEMU which are prohibited to change
DMI_SYS_VENDOR and private clouds with custom solutions.

In this case the private QEMU forks are encouraged to set CPUID
(X86_FEATURE_HYPERVISOR) if they do plan to adopt a different vendor name.
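
For reference, the hypervisor-present bit is bit 31 of ECX as returned by CPUID leaf 1. A minimal userspace sketch of the test, assuming the caller already holds that ECX value; the macro and helper names are hypothetical, and in the kernel the equivalent check would be boot_cpu_has(X86_FEATURE_HYPERVISOR):

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 31 is reserved for hypervisors to announce themselves. */
#define CPUID_1_ECX_HYPERVISOR (UINT32_C(1) << 31)

/* Hypothetical helper: 'ecx' is the ECX value returned by CPUID leaf 1. */
static bool ecx_reports_hypervisor(uint32_t ecx)
{
	return (ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}
```

A clear bit proves nothing by itself, since a VMM is free not to set it, which is why further heuristics are discussed in this thread.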

Does QEMU set this bit when it runs in host-passthrough CPU model?



The current array turns the DMI_SYS_VENDOR interface into some sort of ABI.
If in the future QEMU decides to use a more hipster name, for example
"qEmU", this function won't work.
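
To make the concern concrete, here is a minimal userspace sketch of an exact-match vendor lookup. The vendor list and the helper name are hypothetical and are not taken from the patch; the point is only that any change in spelling or case silently falls through:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical vendor list for illustration only; not the list from the patch. */
static const char *const vmm_vendors[] = { "QEMU", "KVM", "Xen" };

/* Returns true only when sys_vendor matches a list entry byte for byte. */
static bool vendor_in_vmm_list(const char *sys_vendor)
{
	size_t n = sizeof(vmm_vendors) / sizeof(vmm_vendors[0]);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(sys_vendor, vmm_vendors[i]) == 0)
			return true;
	}
	return false;
}
```

With this list, vendor_in_vmm_list("QEMU") is true while vendor_in_vmm_list("qEmU") is false, so a renamed emulator would be treated as bare metal.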

I'm aware that DMI_SYS_VENDOR is used heavily in the kernel code, and the
various names for the same company are a good example of how unreliable it is.

The most hilarious example is "Dell/Dell Inc./Dell Inc/Dell Computer
Corporation/Dell Computer", but other companies are not far from them.

Luckily enough, this identification is used for hardware products that
were released to the market, and their names will be stable for that
specific model. That is not the case here, where we need to ensure future
compatibility too (old kernel on new VM emulator).

I'm not in a position to say yes or no to this patch and don't plan to do so.
Just expressing my feeling that this solution is too hacky for my taste.


I agree with your worries, and relying solely on DMI_SYS_VENDOR is
definitely too hacky. From previous discussions with Thomas, there is no
elegant way to handle this situation; it has to be a heuristic approach.
We hope the CPUID bit is set properly in most cases, so it is checked
first. Then other heuristics can be applied to the remaining cases.
DMI_SYS_VENDOR is the first hint and more can be added later. For example,
when an IOMMU is present there is a vendor-specific way to detect whether
it's real or virtual. Dave also mentioned some BIOS flag to indicate a
virtual machine. Now probably the real question here is whether people
are OK with the CPUID+DMI_SYS_VENDOR combo check for now (and grow
it later) or prefer having all heuristics identified so far in place together...

IMHO, it should be as close as possible to the end result.

Okay! This seems to be the right way to go.

SMBIOS defines a 'virtual machine' bit in the BIOS characteristics
extension byte. It could be used as one possible method.
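
As a rough illustration, and as I read the SMBIOS specification, the bit in question is bit 4 of BIOS Characteristics Extension Byte 2, at offset 0x13 of the Type 0 (BIOS Information) structure. A minimal userspace sketch, assuming the caller already holds the raw formatted area of that structure; the macro and helper names are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SMBIOS_TYPE_BIOS_INFO  0
#define BIOS_CHAR_EXT2_OFFSET  0x13       /* BIOS Characteristics Extension Byte 2 */
#define BIOS_CHAR_EXT2_VM      (1u << 4)  /* "SMBIOS table describes a virtual machine" */

/*
 * Hypothetical helper: 's' points at a raw Type 0 structure, 'len' is the
 * number of bytes available in the buffer. s[0] is the structure type and
 * s[1] is the length of the formatted area.
 */
static bool smbios_type0_reports_vm(const uint8_t *s, size_t len)
{
	if (len < 2 || s[0] != SMBIOS_TYPE_BIOS_INFO)
		return false;
	/* Older tables may be too short to contain extension byte 2. */
	if (s[1] <= BIOS_CHAR_EXT2_OFFSET || len <= BIOS_CHAR_EXT2_OFFSET)
		return false;
	return (s[BIOS_CHAR_EXT2_OFFSET] & BIOS_CHAR_EXT2_VM) != 0;
}
```

Note that the specification only gives meaning to the bit being set; a clear bit allows no inference, so this can only serve as a positive hint.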

In order to support an emulated IOMMU for fully virtualized guests, the
IOMMU vendors defined methods to distinguish between bare metal and a VMM
(caching mode in VT-d, for example).
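
For VT-d, caching mode is reported in bit 7 (CM) of the Capability Register, which the kernel exposes as cap_caching_mode(). A minimal userspace sketch of the same test, assuming the caller has already read the 64-bit register value; the names below are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* VT-d Capability Register: bit 7 is Caching Mode (CM). */
#define VTD_CAP_CM_SHIFT 7

/* Hypothetical helper: 'cap' is the raw 64-bit Capability Register value. */
static bool vtd_cap_caching_mode(uint64_t cap)
{
	return ((cap >> VTD_CAP_CM_SHIFT) & 1) != 0;
}
```

Caching mode is normally set only by emulated IOMMU implementations, so a set bit is a strong hint that we are running under a VMM.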

I will go ahead and add the above two methods, to be checked before the
block list.
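
The resulting order of checks could look roughly like the sketch below. This is only an outline under assumptions: the struct, its field names and the function name are hypothetical, and the relative order of the SMBIOS and IOMMU checks is not settled yet.

```c
#include <stdbool.h>

/* Hypothetical container for the hints gathered from the platform. */
struct platform_hints {
	bool cpuid_hypervisor;         /* X86_FEATURE_HYPERVISOR is set */
	bool smbios_vm_bit;            /* SMBIOS 'virtual machine' bit is set */
	bool iommu_caching_mode;       /* e.g. VT-d caching mode is set */
	bool dmi_vendor_in_block_list; /* DMI_SYS_VENDOR matches a known VMM */
};

/* Any positive hint means "probably a VM"; the block list is the last resort. */
static bool probably_on_bare_metal(const struct platform_hints *h)
{
	if (h->cpuid_hypervisor)
		return false;
	if (h->smbios_vm_bit)
		return false;
	if (h->iommu_caching_mode)
		return false;
	return !h->dmi_vendor_in_block_list;
}
```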

Best regards,
baolu