Re: Regression caused by commit 7bb05b85bc2d ("r8169: don't use MSI-X on RTL8106e")

From: Jian-Hong Pan
Date: Thu Sep 13 2018 - 01:51:45 EST


2018-09-12 16:19 GMT+08:00 Kai-Heng Feng <kai.heng.feng@xxxxxxxxxxxxx>:
> at 14:32, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
>> On Wed, 12 Sep 2018, Kai-Heng Feng wrote:
>>
>>> There's a Dell machine whose RTL8106e stops working after S3 since that
>>> commit was introduced. So I am wondering if it's possible to revert the
>>> commit and use a DMI/subsystem ID based quirk table?
>>
>>
>> Probably.
>
>
> Hopefully Jian-Hong can cook up a quirk table for the issue.

The r8169 module gets nothing from the PCI BAR after the system resumes,
which makes MSI-X fail on some ASUS laptops equipped with the RTL8106e chip.
https://www.spinics.net/lists/linux-pci/msg75598.html

Actually, I am waiting for the patch "PCI: Reprogram bridge prefetch
registers on resume" to be merged.
https://marc.info/?l=linux-pm&m=153680987814299&w=2

It fixes the problem of drivers getting nothing from the PCI BAR after the
system resumes.

Once it is merged, I can remove the fallback code for the RTL8106e.
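
For illustration only, here is a minimal userspace sketch contrasting the two
approaches discussed in this thread: avoiding MSI-X on every RTL8106e versus
avoiding it only on machines listed in a quirk table. It is not the r8169
source. All names (irq_flags_by_chip, irq_flags_by_quirk, struct machine) and
the table entry are hypothetical placeholders, and a real driver would match
machines through the kernel's DMI or PCI subsystem ID facilities rather than
plain string compares.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative model only; none of these names come from the r8169 driver. */
enum irq_type {
	IRQ_LEGACY = 1 << 0,
	IRQ_MSI    = 1 << 1,
	IRQ_MSIX   = 1 << 2,
	IRQ_ALL    = IRQ_LEGACY | IRQ_MSI | IRQ_MSIX,
};

enum chip { CHIP_OTHER, CHIP_RTL8106E };

struct machine {
	const char *sys_vendor;
	const char *product_name;
};

/* Placeholder entry: the affected models are not named in this thread. */
static const struct machine msix_broken_after_resume[] = {
	{ "ExampleVendor", "ExampleModel" },
};

/* Approach A: never offer MSI-X on any RTL8106e. */
static unsigned int irq_flags_by_chip(enum chip chip)
{
	if (chip == CHIP_RTL8106E)
		return IRQ_LEGACY | IRQ_MSI;
	return IRQ_ALL;
}

/* Approach B: withhold MSI-X only on machines found in the quirk table. */
static unsigned int irq_flags_by_quirk(enum chip chip, const struct machine *m)
{
	size_t n = sizeof(msix_broken_after_resume) /
		   sizeof(msix_broken_after_resume[0]);
	size_t i;

	if (chip != CHIP_RTL8106E)
		return IRQ_ALL;

	for (i = 0; i < n; i++) {
		const struct machine *q = &msix_broken_after_resume[i];

		if (!strcmp(m->sys_vendor, q->sys_vendor) &&
		    !strcmp(m->product_name, q->product_name))
			return IRQ_LEGACY | IRQ_MSI;
	}
	return IRQ_ALL;
}
```

With approach B, an RTL8106e machine that is not in the table keeps MSI-X
available, which is the behaviour the Dell report asks for.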

Heiner, any comment?

Regards,
Jian-Hong Pan

>>
>>> It's because commit bc976233a872 ("genirq/msi, x86/vector: Prevent
>>> reservation mode for non maskable MSI") cleared the reservation mode, and
>>> I can see this after S3:
>>>
>>> [ 94.872838] do_IRQ: 3.33 No irq handler for vector
>>
>>
>> It's not because of that commit, really. There is an interrupt sent after
>> resume to the wrong vector for whatever reason. It seems the MSI vector
>> cannot be masked in the device, but the driver should quiesce the device
>> to a point where it does not send interrupts.
>
>
> Understood.
>
>>
>>> If the device uses MSI-X instead of MSI, the issue doesn't happen because
>>> of reservation mode.
>>
>>
>> Reservation mode has absolutely nothing to do with that. What prevents the
>> issue is the fact that MSI-X can be masked by the IRQ core.
>
>
> So in this case I think keeping the device on MSI-X is the better route;
> it's MSI-X capable anyway.
>
>>
>>> Is it something that should be handled by the x86 BIOS? I ask because I
>>> don't see this issue when I use Suspend-to-Idle, which doesn't use the
>>> BIOS to suspend.
>>
>>
>> Suspend to idle works completely differently and I don't see the BIOS at
>> fault here. It's more an issue of MSI not being maskable on that device,
>> which can't be fixed in the BIOS, or it's some half-quiesced state which
>> is used when suspending, and that's a pure driver issue.
>
>
> Understood.
> Thanks for all the info!
>
> Kai-Heng
>
>>
>> Thanks,
>>
>> tglx
>
>
>