Re: Question on MSI support in PCI and PCI-E devices

From: Roger Heflin
Date: Wed Mar 04 2015 - 11:31:24 EST


I know from some data I have seen that between Intel Sandy Bridge
and Intel Ivy Bridge, the same motherboards stopped delivering INTx
reliably (an interrupt was lost under load roughly once every 30 days,
and the driver and firmware had no method to recover from the failure).
We had to transition to using MSI on some PCI cards that had this
issue. Our issue was duplicated on a large number of different
physical machines, so if it was a hardware error, it was a lot of
different physical machines that had the defect.
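
For cards that cannot do MSI, the timer-based polling Luke describes
below is one way to recover from a lost INTx. What follows is only a
minimal userspace sketch of that idea, not code from any real driver:
struct fake_dev, STATUS_WORK_PENDING and every function name are
hypothetical stand-ins, and the "status register" is a plain variable
rather than a mapped device register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bit: set by the simulated device when work is pending. */
#define STATUS_WORK_PENDING 0x1u

/* Stand-in for a PCI device; a real driver would read a mapped register. */
struct fake_dev {
    uint32_t status;     /* simulated status register */
    unsigned serviced;   /* work items completed so far */
    unsigned recovered;  /* items found by the poll, i.e. missed interrupts */
};

/*
 * Shared service routine: check the status register, do the work and
 * acknowledge it. Returns true if there was work to do.
 */
static bool service_device(struct fake_dev *dev)
{
    if (!(dev->status & STATUS_WORK_PENDING))
        return false;
    dev->status &= ~STATUS_WORK_PENDING;
    dev->serviced++;
    return true;
}

/* Normal path: runs when the interrupt is actually delivered. */
static void irq_handler(struct fake_dev *dev)
{
    service_device(dev);
}

/*
 * Fallback path: meant to run from a periodic timer. Finding pending
 * work here means the interrupt for it never arrived.
 */
static void poll_timer_callback(struct fake_dev *dev)
{
    if (service_device(dev))
        dev->recovered++;
}

/* Simulated hardware event; irq_delivered == false models a lost INTx. */
static void device_raise_event(struct fake_dev *dev, bool irq_delivered)
{
    dev->status |= STATUS_WORK_PENDING;
    if (irq_delivered)
        irq_handler(dev);
}
```

Calling device_raise_event(&dev, false) followed by
poll_timer_callback(&dev) services the pending work and bumps
dev.recovered, which is the recovery our driver lacked. In a real
driver the callback would run from a kernel timer that re-arms itself
(for example with mod_timer()), and the interrupt handler and the timer
would need locking against each other, which this sketch leaves out.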

On Wed, Mar 4, 2015 at 10:03 AM, McKay, Luke <Luke.McKay@xxxxxxxxxxxx> wrote:
> I don't personally know of any PCI drivers that use polling instead of interrupts, since that would really mean the hardware is broken.
>
> Basically all you need to do is create a timer, and have its callback set to your driver routine that can check the device status registers to determine if there is work to be done. The status register(s) would be the same indicators that should have generated an interrupt.
>
> Regards,
> Luke
>
>
> --
> Luke McKay
> Senior Engineer
> Cobham AvComm
> T : +1 (316) 529 5585
>
>
> -----Original Message-----
> From: Andrey Utkin [mailto:andrey.utkin@xxxxxxxxxxxxxxxxxxx]
> Sent: Tuesday, March 03, 2015 8:29 AM
> To: McKay, Luke
> Cc: Andrey Utkin; Stephen Hemminger; kernel-mentors@xxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; kernelnewbies
> Subject: Re: Question on MSI support in PCI and PCI-E devices
>
> On Mon, Mar 2, 2015 at 4:02 PM, McKay, Luke <Luke.McKay@xxxxxxxxxxxx> wrote:
>> It doesn't appear that your device supports MSI. If it did, lspci -v should list the MSI capability and whether or not it is enabled.
>>
>> For example, something like...
>> Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>
>> Without a listing that shows the capability is present, there is nothing to enable.
>>
>> Have you tried polling instead of using interrupts? Definitely not ideal, but it might help you to determine whether hardware is dropping/missing an interrupt or whether the hardware is being completely hung up.
>>
>> Do you know if this missing interrupt is occurring in other systems as well? How about whether it happens with different boards in the same system? Answers to these questions would help to determine whether you might have a defective board, or some sort of incompatibility with the system.
>
> We have just three setups reproducing this. We have no boards for replacement experiments, unfortunately.
> Polling instead of using interrupts sounds interesting. Is there an example of such usage in any other PCI device driver?
>
> --
> Bluecherry developer.
>
>