Re: [PATCH 1/2] ahci: Add Green Sardine vendor ID as board_ahci_mobile

From: Hans de Goede
Date: Tue Feb 15 2022 - 07:52:41 EST


Hi Paul,

On 2/15/22 13:27, Paul Menzel wrote:
> Dear Hans,
>
>
> Am 15.02.22 um 12:43 schrieb Hans de Goede:
>
>> On 2/15/22 08:05, Paul Menzel wrote:
>
>>> Am 14.02.22 um 17:07 schrieb Limonciello, Mario:
>>>
>>>> +Hans
>>>>
>>>>> (For the records, is part of Linux since 5.16-rc2 (commit 1527f69204fe).)
>>>>>
>>>>> Am 12.11.21 um 21:15 schrieb Mario Limonciello:
>>>>>> AMD requires that the SATA controller be configured for devsleep in order
>>>>>> for S0i3 entry to work properly.
>>>>>>
>>>>>> commit b1a9585cc396 ("ata: ahci: Enable DEVSLP by default on x86 with
>>>>>> SLP_S0") sets up a kernel policy to enable devsleep on Intel mobile
>>>>>> platforms that are using s0ix.  Add the PCI ID for the SATA controller in
>>>>>> Green Sardine platforms to extend this policy by default for AMD based
>>>>>> systems using s0i3 as well.
>>>>>>
>>>>>> Cc: Nehal-bakulchandra Shah <Nehal-bakulchandra.Shah@xxxxxxx>
>>>>>> BugLink:
>>>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugz
>>>>> illa.kernel.org%2Fshow_bug.cgi%3Fid%3D214091&amp;data=04%7C01%7Cm
>>>>> ario.limonciello%40amd.com%7Ca32a202d437544cd2cbb08d9ef9112c0%7C3d
>>>>> d8961fe4884e608e11a82d994e183d%7C0%7C0%7C637804228648002522%7CU
>>>>> nknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI
>>>>> 6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=CbfImBnwc8uV1L5QRBuV
>>>>> PLkP72wpS9yif1UbUwykhNI%3D&amp;reserved=0
>>>
>>> You have to love Microsoft Outlook.
>>>
>>>>>> Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx>
>>>>>> ---
>>>>>>     drivers/ata/ahci.c | 1 +
>>>>>>     1 file changed, 1 insertion(+)
>>>>>>
>>>>>> diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
>>>>>> index d60f34718b5d..1e1167e725a4 100644
>>>>>> --- a/drivers/ata/ahci.c
>>>>>> +++ b/drivers/ata/ahci.c
>>>>>> @@ -438,6 +438,7 @@ static const struct pci_device_id ahci_pci_tbl[] = {
>>>>>>         /* AMD */
>>>>>>         { PCI_VDEVICE(AMD, 0x7800), board_ahci }, /* AMD Hudson-2 */
>>>>>>         { PCI_VDEVICE(AMD, 0x7900), board_ahci }, /* AMD CZ */
>>>>>> +    { PCI_VDEVICE(AMD, 0x7901), board_ahci_mobile }, /* AMD Green Sardine */
>>>>>
>>>>> Aren't 0x7900 and 0x7901 the same device only in different modes? I
>>>>> wonder, why different boards and different comments are used.
>>>>
>>>> No they aren't the same device in different modes.
>>>>
>>>> Reference:
>>>> https://www.amd.com/en/support/tech-docs/processor-programming-reference-ppr-for-amd-family-19h-model-51h-revision-a1
>>>> Page 33 has a table.
>>>
>>> That table misses 0x7900h. (Where can I find it?) coreboot has 0x7900 defined for Cezanne:
>>>
>>>      src/include/device/pci_ids.h:#define PCI_DEVICE_ID_AMD_CZ_SATA  0x7900
>>>      src/soc/amd/stoneyridge/include/soc/pci_devs.h: * SATA_IDE_IDEVID              0x7900
>>>
>>> The PCI database too [3]:
>>>
>>>> FCH SATA Controller [IDE mode]
>>>
>>>
>>>>> Additionally, the device 0x7901 is also present in desktop systems like
>>>>> Dell OptiPlex 5055 and MSI B350 MORTAR. Is `board_ahci_mobile` the right
>>>>> board then? Or should the flag `AHCI_HFLAG_IS_MOBILE` be renamed to
>>>>> avoid confusion?
>>>>
>>>> Are you having a problem or just want clarity in the enum definition?
>>>
>>> It’s more about clarity, and understanding the two different entries.
>>>
>>>> This was introduced by Hans in commit ebb82e3c79d2a to set LPM policy properly
>>>> for a number of mobile devices.
>>>>
>>>> My opinion here is that the policy being for "mobile" devices only is short sighted as power
>>>> consumption policy on desktops is also relevant as OEMs ship desktops that need to meet
>>>> various power regulations for those too.
>>>>
>>>> So I would be in the camp for renaming the flags.
>>>
>>> Why can’t the LPM policy, if available(?), not be set for `board_ahci` by default, so `board_ahci_mobile` could be removed?
>>
>> When I added this, which was around the Haswell / Broadwell times, enabling
>> LPM on mobile devices was not so much important for the direct power-saving,
>> but to allow the entire package (integrated southbridge-ish) to go to deeper
>> sleep (aka PC) states.
>>
>> Without this laptops would drain their batteries much faster, but at the same
>> time there were reports of LPM causing crashes and data corruption on some
>> systems, also see the list of ATA ids with a ATA_HORKAGE_NOLPM flag in
>> drivers/ata/libata-core.c . Which was added and grown over time to allow
>> enabling LPM by default without causing regressions.
>
> This seems to be related to the hard driver though, and not the chipset?

It is related to both, as mentioned sometimes BIOS updates (presumably
involving SATA chipset settings tweaks) resolved LPM issues.

A drive having the ATA_HORKAGE_NOLPM flag basically means that it is know
to cause (serious) issues with at least one of the mobile marked chipsets.

For another example of sometimes issues being chipset specific see the
recently added ATA_HORKAGE_NO_NCQ_ON_ATI quirk, which fixes a bad
interaction between early AMD/ATI sata chihpsets and Samsung SSD 860
and Samsung SSD 870 series drives.

>> So when adding support for enabling LPM modes by default, to get the
>> desired power-savings by default I tried to do this on a minimal set of
>> devices, to avoid causing regressions.
>>
>> Another factor here is that in some cases LPM related issues went away
>> after either BIOS or disk/ssd firmware updates. So the motherboard firmware
>> is also a factor here and enabling LPM by default on non laptop(ish)
>> motherboards has never been tested.
>>
>> Also enabling some of the deeper LPM modes comes at a latency cost which
>> may be undesirable at servers. Note this just sets the initial default
>> LPM mode, this can always be changed by userspace later.
>
> Thank you for the elaborate explanation. What do you suggest in regards the 1022:7901, present in laptops and desktop systems?

I would suggest to go with board_ahci_mobile as is done in Mario's
original patch. We really want this for laptops and for desktops
using the same chipset I don't expect this to cause any issues.

Regards,

Hans