Re: [PATCH 2/2] vfio/pci: Remove console drivers

From: Thomas Zimmermann
Date: Mon Dec 05 2022 - 05:12:11 EST


Hi

On 05.12.22 at 10:32, mb@xxxxxxx wrote:
I have an RTX 3070 and a 3090, and I am absolutely sure I am binding vfio-pci to the 3090 and not the 3070.

I have bound the driver in two different ways: first by passing the IDs to the module, and alternatively by manipulating the system interface and using the override (this is what I originally had to do when I used two 1080s, so I know it works).
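
The override route described above can be sketched roughly as follows. This is a minimal illustration, assuming the standard PCI sysfs files (driver_override, driver/unbind, drivers_probe); the PCI address 0000:0b:00.0 and the helper names are hypothetical and are not taken from the reporter's setup. Running the script only prints the planned writes; apply_steps() would have to be called as root to actually rebind anything.

```python
import os


def override_steps(bdf, driver="vfio-pci", sysfs_root="/sys"):
    """Return the (path, value) writes that rebind one PCI device.

    bdf is the PCI address, e.g. "0000:0b:00.0" (hypothetical example).
    """
    dev = os.path.join(sysfs_root, "bus", "pci", "devices", bdf)
    # 1. Tell the PCI core that only `driver` may bind to this device.
    steps = [(os.path.join(dev, "driver_override"), driver)]
    # 2. Unbind only if some driver currently owns the device.
    if os.path.exists(os.path.join(dev, "driver")):
        steps.append((os.path.join(dev, "driver", "unbind"), bdf))
    # 3. Ask the PCI core to probe the device again.
    steps.append((os.path.join(sysfs_root, "bus", "pci", "drivers_probe"), bdf))
    return steps


def apply_steps(steps):
    """Perform the writes; needs root when pointed at the real /sys."""
    for path, value in steps:
        with open(path, "w") as f:
            f.write(value)


if __name__ == "__main__":
    # Dry run: show what would be written, without touching sysfs.
    for path, value in override_steps("0000:0b:00.0"):
        print(f"{value} > {path}")
```

The plan and the writes are kept separate so the sequence can be reviewed before it touches a live system.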

While the 3090 doesn't show a console, there's a remnant from rEFInd (and GRUB previously) there.

The assessment Alex made previously, where aperture_remove_conflicting_pci_devices() is removing the driver (EFIFB) instead of the device, seems correct, but it could also be a quirk of how EFIFB is implemented. I recall reading a long time ago that EFIFB is a special device and that once it detects changes it simply gives up. There was also no way to attach a device to it again, as it depends on being preloaded outside the kernel; once something takes over the buffer, reinitializing is "impossible". I never went deeper to try and understand it.

We recently reworked fbdev's interaction with the aperture helpers. [1] A device should now be removed iff the driver has been bound to it (which should be the case here). The patches went into a v6.1-rc.

Could you try the most recent v6.1-rc and report if this fixes the problem?

Best regards
Thomas

[1] https://patchwork.freedesktop.org/series/106040/



On Mon, Dec 5, 2022, 2:00 AM Thomas Zimmermann <tzimmermann@xxxxxxx> wrote:

Hi

On 05.12.22 at 01:51, Alex Williamson wrote:
> On Sat, 3 Dec 2022 17:12:38 -0700
> "mb@xxxxxxx" <mb@xxxxxxx> wrote:
>
>> Hi,
>>
>> I hope it is ok to reply to this old thread.
>
> It is, but the only relic of the thread is the subject.  For reference,
> the latest version of this posted is here:
>
> https://lore.kernel.org/all/20220622140134.12763-4-tzimmermann@xxxxxxx/
>
> Which is committed as:
>
> d17378062079 ("vfio/pci: Remove console drivers")
>
>> Unfortunately, I found a
>> problem only now after upgrading to 6.0.
>>
>> My setup has multiple GPUs (2), and I depend on EFIFB to have a
>> working console.

Which GPUs do you have?

>> pre-patch behavior, when I bind the vfio-pci to my secondary GPU both
>> the passthrough and the EFIFB keep working fine.
>> post-patch behavior, when I bind the vfio-pci to the secondary GPU,
>> the EFIFB disappears from the system, binding the console to the
>> "dummy console".

The efifb would likely use the first GPU. And vfio-pci should only
remove the generic driver from the second device. Are you sure that
you're not somehow using the first GPU with vfio-pci?
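
To illustrate the expectation above, here is a toy model of aperture conflict removal. It is only a sketch of the idea that a firmware framebuffer is dropped when its memory range overlaps the BARs of the device being claimed; it is not the kernel's implementation, and the class name, function name, and address ranges are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aperture:
    owner: str  # e.g. "efifb"
    base: int   # start of the framebuffer range
    size: int   # length of the range in bytes

    def overlaps(self, base, size):
        # Two half-open ranges overlap if each starts before the other ends.
        return self.base < base + size and base < self.base + self.size


def remove_conflicting(registered, bars):
    """Split registered apertures into (kept, removed) for a device's BARs."""
    kept, removed = [], []
    for ap in registered:
        hit = any(ap.overlaps(base, size) for base, size in bars)
        (removed if hit else kept).append(ap)
    return kept, removed


# Hypothetical layout: efifb scans out of the first GPU's memory.
efifb = Aperture("efifb", base=0xE0000000, size=0x01000000)
gpu0_bars = [(0xE0000000, 0x10000000)]  # first GPU, hosts the console
gpu1_bars = [(0xC0000000, 0x10000000)]  # second GPU, meant for vfio-pci

if __name__ == "__main__":
    print(remove_conflicting([efifb], gpu1_bars))
    print(remove_conflicting([efifb], gpu0_bars))
```

In this model, claiming the second GPU leaves efifb alone, while claiming the first GPU removes it. Losing the console after binding only the second GPU therefore means that either the ranges really overlap or the removal is wider than intended.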

>> Whenever you try to access the terminal, you have the screen stuck in
>> whatever was the last buffer content, which gives the impression of
>> "freezing," but I can still type.
>> Everything else works, including the passthrough.
>
> This sounds like the call to aperture_remove_conflicting_pci_devices()
> is removing the conflicting driver itself rather than removing the
> device from the driver.  Is it not possible to unbind the GPU from
> efifb before binding the GPU to vfio-pci to effectively nullify the
> added call?
>
>> I can only think about a few options:
>>
>> - Is there a way to have EFIFB show up again? After all it looks like
>> the kernel has just abandoned it, but the buffer is still there. I
>> can't find a single message about the secondary card and EFIFB in
>> dmesg, but there's a message for the primary card and EFIFB.
>> - Can we have a boolean controlling the behavior of vfio-pci
>> altogether or at least controlling the behavior of vfio-pci for that
>> specific ID? I know there's already some option for vfio-pci and VGA
>> cards, would it be appropriate to attach this behavior to that option?
>
> I suppose we could have an opt-out module option on vfio-pci to skip
> the above call, but clearly it would be better if things worked by
> default.  We cannot make full use of GPUs with vfio-pci if they're
> still in use by host console drivers.  The intention was certainly to
> unbind the device from any low level drivers rather than disable use of
> a console driver entirely.  DRM/GPU folks, is that possibly an
> interface we could implement?  Thanks,

When vfio-pci gives the GPU device to the guest, which driver is
bound to it?

Best regards
Thomas

>
> Alex
>

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


