Re: [PATCH v2] igc: Ignore AER reset when device is suspended

From: Vinicius Costa Gomes
Date: Fri Jul 14 2023 - 16:36:02 EST


Bjorn Helgaas <helgaas@xxxxxxxxxx> writes:

> On Fri, Jul 14, 2023 at 01:05:41PM +0800, Kai-Heng Feng wrote:
>> When a system that connects to a Thunderbolt dock equipped with I225,
>> like HP Thunderbolt Dock G4, I225 stops working after S3 resume:
>> ...
>
>> The issue is that the PTM requests are sending before driver resumes the
>> device. Since the issue can also be observed on Windows, it's quite
>> likely a firmware/hardware limitation.
>
> Does this mean we didn't disable PTM correctly on suspend? Or is the
> device defective and sending PTM requests even though PTM is disabled?
>

The way I understand the hardware bug, the device is defective, as you
said, the device sends PTM messages when "busmastering" is disabled.

> If the latter, I vote for a quirk that just disables PTM completely
> for this device.
>

My suggestion is that adding this quirk would be a last resort kind of
thing. There are users/customers that depend on the increased time
synchronization accuracy that PTM provides.

> This check in .error_detected() looks out of place to me because
> there's no connection between AER and PTM, there's no connection
> between PTM and the device being enabled, and the connection between
> the device being enabled and being fully resumed is a little tenuous.
>
> If we must do it this way, maybe add a comment about *why* we're
> checking pci_is_enabled(). Otherwise this will be copied to other
> drivers that don't need it.

Makes total sense, from my side.

>
>> So avoid resetting the device if it's not resumed. Once the device is
>> fully resumed, the device can work normally.
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216850
>> Reviewed-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxx>
>> Acked-by: Vinicius Costa Gomes <vinicius.gomes@xxxxxxxxx>
>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@xxxxxxxxxxxxx>
>>
>> ---
>> v2:
>> - Fix typo.
>> - Mention the product name.
>>
>> drivers/net/ethernet/intel/igc/igc_main.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
>> index 9f93f0f4f752..8c36bbe5e428 100644
>> --- a/drivers/net/ethernet/intel/igc/igc_main.c
>> +++ b/drivers/net/ethernet/intel/igc/igc_main.c
>> @@ -7115,6 +7115,9 @@ static pci_ers_result_t igc_io_error_detected(struct pci_dev *pdev,
>> struct net_device *netdev = pci_get_drvdata(pdev);
>> struct igc_adapter *adapter = netdev_priv(netdev);
>>
>> + if (!pci_is_enabled(pdev))
>> + return 0;
>> +
>> netif_device_detach(netdev);
>>
>> if (state == pci_channel_io_perm_failure)
>> --
>> 2.34.1
>>
>

--
Vinicius