Re: [PATCH 2/4] PCI/VGA: Deal only with PCI VGA class devices

From: Bjorn Helgaas
Date: Tue Jul 18 2023 - 19:14:07 EST


On Fri, Jun 30, 2023 at 06:17:29PM +0800, Sui Jingfeng wrote:
> From: Sui Jingfeng <suijingfeng@xxxxxxxxxxx>
>
> VGAARB should only care about PCI VGA class devices (pdev->class == 0x0300)
> since only those devices might have VGA routed to them.

This is not actually a question of whether VGA addresses (mem
0xa0000-0xbffff and io 0x3b0-0x3bb, 0x3c0-0x3df) might be *routed* to
the device, because that routing is controlled by the bridge VGA Enable
bit, not by the device's Class Code.

I think the important question here is what devices will *respond* to
those VGA addresses. The VGA arbiter works by managing bridge VGA
Enable bits, so if we know a device doesn't respond to the VGA
addresses, there's no point in adding a vga_device for it.
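
To make the routing vs. responding distinction concrete, here is a
minimal userspace sketch, not kernel code. The address ranges are the
ones listed above; PCI_BRIDGE_CTL_VGA mirrors the Bridge Control bit
defined in the kernel's pci_regs.h. The helper names are hypothetical,
and the model deliberately ignores ISA aliasing and 16-bit VGA decode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* VGA Enable bit in the bridge's Bridge Control register. */
#define PCI_BRIDGE_CTL_VGA 0x08

/* Legacy VGA memory window: 0xa0000-0xbffff. */
bool is_legacy_vga_mem(uint32_t addr)
{
	return addr >= 0xa0000 && addr <= 0xbffff;
}

/* Legacy VGA I/O ports: 0x3b0-0x3bb and 0x3c0-0x3df. */
bool is_legacy_vga_io(uint16_t port)
{
	return (port >= 0x3b0 && port <= 0x3bb) ||
	       (port >= 0x3c0 && port <= 0x3df);
}

/* Routing is decided by the bridge, not by the device's Class Code. */
bool bridge_forwards_vga(uint16_t bridge_control)
{
	return (bridge_control & PCI_BRIDGE_CTL_VGA) != 0;
}

/* Hypothetical self-check for the helpers above. */
void vga_routing_examples(void)
{
	assert(is_legacy_vga_mem(0xa0000) && !is_legacy_vga_mem(0xc0000));
	assert(is_legacy_vga_io(0x3c0) && !is_legacy_vga_io(0x3bc));
	assert(bridge_forwards_vga(0x0008) && !bridge_forwards_vga(0x0000));
}
```

Note that nothing in this sketch looks at a Class Code; whether the
device behind the bridge actually *responds* is a separate question.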

> PCI_CLASS_DISPLAY_3D and PCI_CLASS_DISPLAY_OTHER are used to annotate the
> render-only GPU. Render-only GPUs shouldn't decode the fixed VGA address.
> For example, nvidia render-only GPU typically has 0x0380 as its PCI class.
>
> A render-only GPU cannot be used to display something on the screen.
> Hence, it should not be the default boot device in normal cases.

Can you make the commit log say specifically what changes with this
patch? Is the idea that we previously added GPUs with Class Codes
like 0x0380, and after this patch we will only add GPUs that exactly
match 0x0300?

It doesn't *look* like that's the case because
vga_arbiter_add_pci_device() previously had:

    if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
            return false;

This ignores the programming interface (the low byte) but still
matches only base class 0x03 and subclass 0x00, so it shouldn't add a
0x0380 GPU.

This patch matches the entire 24-bit pdev->class (base class, subclass,
and programming interface) against PCI_CLASS_DISPLAY_VGA << 8, so I
*think* this only accepts programming interface 0x00.
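
A quick userspace sketch of the difference between the two checks,
assuming PCI_CLASS_DISPLAY_VGA is 0x0300 as in the kernel's pci_ids.h.
The function names are hypothetical; only the two comparisons are
taken from the old and new code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Base class 0x03, subclass 0x00. */
#define PCI_CLASS_DISPLAY_VGA 0x0300

/* Old check: shift out the programming interface, compare the rest. */
bool old_check_matches(uint32_t class_code)
{
	return (class_code >> 8) == PCI_CLASS_DISPLAY_VGA;
}

/* New check: compare all 24 bits, so the low byte must be 0x00. */
bool new_check_matches(uint32_t class_code)
{
	return class_code == (PCI_CLASS_DISPLAY_VGA << 8);
}

/* Hypothetical self-check covering the three interesting cases. */
void class_check_examples(void)
{
	/* 0x0300, prog-if 0x00: accepted before and after. */
	assert(old_check_matches(0x030000) && new_check_matches(0x030000));
	/* 0x0300, prog-if 0x01: accepted before, rejected after. */
	assert(old_check_matches(0x030001) && !new_check_matches(0x030001));
	/* 0x0380: rejected before and after. */
	assert(!old_check_matches(0x038000) && !new_check_matches(0x038000));
}
```

So the only devices whose treatment changes are those with subclass
0x00 and a nonzero programming interface; a 0x0380 device was never
added in the first place.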

That might be OK, since the "PCI Code and ID Assignment" spec, r1.15,
sec 1.4, only mentions 0x0300 programming interface 0x00 as decoding
the legacy VGA addresses. But it is something the commit log should
be clear about.

> Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Reviewed-by: Mario Limonciello <mario.limonciello@xxxxxxx>
> Signed-off-by: Sui Jingfeng <suijingfeng@xxxxxxxxxxx>
> ---
> drivers/pci/vgaarb.c | 22 ++++++++++++----------
> 1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> index c1bc6c983932..22a505e877dc 100644
> --- a/drivers/pci/vgaarb.c
> +++ b/drivers/pci/vgaarb.c
> @@ -754,10 +754,6 @@ static bool vga_arbiter_add_pci_device(struct pci_dev *pdev)
> struct pci_dev *bridge;
> u16 cmd;
>
> - /* Only deal with VGA class devices */
> - if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
> - return false;
> -
> /* Allocate structure */
> vgadev = kzalloc(sizeof(struct vga_device), GFP_KERNEL);
> if (vgadev == NULL) {
> @@ -1500,7 +1496,9 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
> struct pci_dev *pdev = to_pci_dev(dev);
> bool notify = false;
>
> - vgaarb_dbg(dev, "%s\n", __func__);
> + /* Only deal with VGA class devices */
> + if (pdev->class != PCI_CLASS_DISPLAY_VGA << 8)
> + return 0;
>
> /* For now we're only intereted in devices added and removed. I didn't
> * test this thing here, so someone needs to double check for the
> @@ -1510,6 +1508,8 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
> else if (action == BUS_NOTIFY_DEL_DEVICE)
> notify = vga_arbiter_del_pci_device(pdev);
>
> + vgaarb_dbg(dev, "%s: action = %lu\n", __func__, action);
> +
> if (notify)
> vga_arbiter_notify_clients();
> return 0;
> @@ -1534,8 +1534,8 @@ static struct miscdevice vga_arb_device = {
>
> static int __init vga_arb_device_init(void)
> {
> + struct pci_dev *pdev = NULL;
> int rc;
> - struct pci_dev *pdev;
>
> rc = misc_register(&vga_arb_device);
> if (rc < 0)
> @@ -1545,11 +1545,13 @@ static int __init vga_arb_device_init(void)
>
> /* We add all PCI devices satisfying VGA class in the arbiter by
> * default */
> - pdev = NULL;
> - while ((pdev =
> - pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
> - PCI_ANY_ID, pdev)) != NULL)
> + while (1) {
> + pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev);
> + if (!pdev)
> + break;
> +
> vga_arbiter_add_pci_device(pdev);
> + }
>
> pr_info("loaded\n");
> return rc;
> --
> 2.25.1
>