Re: [PATCH v3 4/9] PCI/VGA: Improve the default VGA device selection

From: Bjorn Helgaas
Date: Tue Jul 25 2023 - 17:31:03 EST


On Mon, Jul 24, 2023 at 08:16:18PM +0800, suijingfeng wrote:
> On 2023/7/20 03:32, Bjorn Helgaas wrote:
> > > 2) It does not take into consideration that PCI BARs may get relocated.
> > > 3) It is not effective for PCI devices without a dedicated VRAM BAR.
> > > 4) It is device-agnostic, thus it has to waste effort iterating over all
> > > of the PCI BARs to find the VRAM aperture.
> > > 5) It has invented lots of methods to determine which one is the default
> > > boot device, but this is still a policy because it doesn't give the
> > > user a choice to override.
> > I don't think we need a list of *potential* problems. We need an
> > example of the specific problem this will solve, i.e., what currently
> > does not work?
>
>
> This version does allow the arbitration service to work on non-x86 arches,
> which also allows me to remove an arch-specific workaround.
> I will give more detail in the next version.

Yes. This part I think we want.

> But I want to provide one more drawback of vgaarb here:
>
> (6) It does not work for non-VGA-compatible PCI(e) display controllers.
>
> Because, currently, vgaarb deals with PCI VGA-compatible devices only.
>
> See another patch set of mine [1] for a more elaborate discussion.
>
> It also ignores PCI_CLASS_NOT_DEFINED_VGA, as Maciej puts it [2].
>
> My approach, by contrast, does not require the display controller to be
> VGA-compatible to enjoy the arbitration service.

I think vgaarb is really only for dealing with the problem of the
legacy VGA address space routing. For example, there may be VGA
devices that require the [pci 0xa0000-0xbffff] range but they don't
describe that via a BAR. There may also be VGA option ROMs that
depend on that range so they can initialize the device.

The [pci 0xa0000-0xbffff] range can only be routed to one device at a
time, and vgaarb is what takes care of that by manipulating the VGA
Enable bits in bridges.
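
To illustrate the routing, here is a toy user-space model, not the
vgaarb code itself. The names toy_bridge and toy_route_vga are made up
for this sketch; BRIDGE_CTL_VGA is assumed to mirror the value of
PCI_BRIDGE_CTL_VGA (0x08, the VGA Enable bit in the Bridge Control
register). It assumes a simple tree where each bridge records the index
of its upstream bridge:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to match PCI_BRIDGE_CTL_VGA: VGA Enable in Bridge Control. */
#define BRIDGE_CTL_VGA 0x08u

/* Hypothetical model of a bridge: only what the sketch needs. */
struct toy_bridge {
	int parent;          /* index of the upstream bridge, -1 at the root */
	uint16_t bridge_ctl; /* stand-in for the Bridge Control register */
};

/*
 * Route the legacy range to the device directly below bridge 'leaf':
 * clear VGA Enable on every bridge, then set it on each bridge along
 * the path from 'leaf' up to the root. At most one path is enabled.
 */
static void toy_route_vga(struct toy_bridge *b, size_t n, int leaf)
{
	for (size_t i = 0; i < n; i++)
		b[i].bridge_ctl &= (uint16_t)~BRIDGE_CTL_VGA;

	for (int i = leaf; i >= 0; i = b[i].parent)
		b[i].bridge_ctl |= BRIDGE_CTL_VGA;
}
```

This leaves out locking and the decode control on the device itself,
so treat it only as a picture of why the range has a single owner.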

I don't think we should extend vgaarb to deal with non-VGA GPUs in
general, i.e., I don't think it should be concerned with devices and
option ROMs that do not require the [pci 0xa0000-0xbffff] range.

I think a strict reading of the PCI Class Code spec would be that only
devices with Programming Interface 0000 0000b can depend on that
legacy range.
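
As a sketch of what that strict reading would look like as a check
(is_strict_vga is a hypothetical helper, not an existing kernel
function), assuming the usual 24-bit class code layout with the base
class in bits 23:16, the sub-class in bits 15:8, and the Programming
Interface in bits 7:0:

```c
#include <stdbool.h>
#include <stdint.h>

/* Base class 0x03 (display), sub-class 0x00 (VGA-compatible). */
#define CLASS_DISPLAY_VGA 0x0300u

/*
 * Hypothetical helper: true only for a VGA-compatible display
 * controller whose Programming Interface is 0000 0000b.
 */
static bool is_strict_vga(uint32_t class_code)
{
	uint32_t base_sub = (class_code >> 8) & 0xffffu;
	uint32_t prog_if = class_code & 0xffu;

	return base_sub == CLASS_DISPLAY_VGA && prog_if == 0x00u;
}
```

With this check, 0x030000 qualifies, while 0x030001 (Programming
Interface 0000 0001b), 0x038000 (other display controller) and
0x000100 (the PCI_CLASS_NOT_DEFINED_VGA case) do not.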

If that's what vgaarb currently enforces, great. If it currently
deals with more than just 0000 0000b devices, and there's some value
in restricting it to only 0000 0000b, we could try that, but I would
suggest doing that in a tiny patch all by itself. Then if we trip
over a problem, it's easy to bisect and revert it.

> [1] https://patchwork.freedesktop.org/patch/546690/?series=120548&rev=1
>
> [2] https://lkml.org/lkml/2023/6/18/315
>