Re: [PATCH] scsi: libsas: Check and update the link rate during discovery

From: John Garry
Date: Wed Nov 02 2022 - 07:56:15 EST


On 02/11/2022 10:05, Yihang Li wrote:

Note: the erroneous condition does not occur during discovery. Discovery is the phase in which the device is initially found.

+----------+              +----------+
|          |              |          |
|          |--- 12.0 G ---|          |--- 12.0 G --- SAS disk
|initiator |              | Expander |
| device   |--- 12.0 G ---|          |--- 12.0 G --- SAS disk
|          |              |          |
|          |--- 12.0 G ---|          |--- 6.0 G --- SATA disk
|          |              |          |
|      phy0|--- 12.0 G ---|          |--- 6.0 G --- SATA disk
|          |              |          |
+----------+              +----------+

In the scenario where the expander device is connected to a wide port,
the preceding figure shows the link topology after initialization.
All physical PHYs negotiate connections at the rate of 12 Gbit, and
the expander SATA PHY negotiates connections at the rate of 6 Gbit.

We found that, while FIO was running, phy0 went link down due to link
instability and the link was re-established after a period of time.
While the physical link is down, the physical PHY gradually lowers the
link rate and attempts to renegotiate and re-establish the connection.
This is HW behavior. If the physical PHY tries to re-establish the link
at a link rate of 3 Gbit and succeeds, the negotiated link rate is
3 Gbit, like this:

+----------+              +----------+
|          |              |          |
|          |--- 12.0 G ---|          |--- 12.0 G --- SAS disk
|initiator |              | Expander |
| device   |--- 12.0 G ---|          |--- 12.0 G --- SAS disk
|          |              |          |
|          |--- 12.0 G ---|          |--- 6.0 G --- SATA disk
|          |              |          |
|      phy0|--- 3.0 G ----|          |--- 6.0 G --- SATA disk
|          |              |          |
+----------+              +----------+

A SATA disk connected to an expander PHY may then reject IO requests due
to a connection setup error (OPEN_REJECT-CONNECTION RATE NOT SUPPORTED).
The log is as follows:

[175712.419423] hisi_sas_v3_hw 0000:74:02.0: erroneous completion iptt=2985 task=00000000268357f1 dev id=10 exp 0x500e004aaaaaaa1f phy9 addr=500e004aaaaaaa09 CQ hdr: 0x102b 0xa0ba9 0x1000 0x20000 Error info: 0x200 0x0 0x0 0x0

After analysis, we concluded that when one of the physical links
connected on the wide port is re-established, the link rates of the
port, the expander device and the expander SATA PHY are not updated. As
a result, the expander PHY attached to a SATA PHY uses a link rate
(6.0 Gbit) greater than the physical PHY link rate (3.0 Gbit).
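
The mismatch described above can be illustrated with a small standalone
sketch. This is not libsas code: the enum values and both function names
are hypothetical, and the only assumption is that a connection cannot be
carried at a rate higher than the slowest physical link on its pathway.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rate codes for illustration only (not the kernel's
 * enum sas_linkrate); ordered so that a faster rate compares greater. */
enum link_rate {
	RATE_1_5_G = 1,
	RATE_3_0_G,
	RATE_6_0_G,
	RATE_12_0_G,
};

/* Return the slowest negotiated rate among the links of one pathway. */
static enum link_rate min_pathway_rate(const enum link_rate *links, size_t n)
{
	enum link_rate min;
	size_t i;

	assert(n > 0);
	min = links[0];
	for (i = 1; i < n; i++)
		if (links[i] < min)
			min = links[i];
	return min;
}

/* A connection at conn_rate is only possible if no link is slower. */
static bool connection_rate_ok(enum link_rate conn_rate,
			       const enum link_rate *links, size_t n)
{
	return conn_rate <= min_pathway_rate(links, n);
}
```

With the pathway { phy0 link, expander-to-SATA link }, a 6.0 G connection
passes this check while phy0 is at 12.0 G and fails once phy0 has
renegotiated to 3.0 G, which is the situation reported in the log.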

Please mention the SAS spec section in which min pathway is described.


Therefore, add function sas_check_port_linkrate() to check whether the
link rate of a physical PHY connected to the wide port has changed after
a phy up event. If the link rate of the newly established physical PHY
is lower than the link rate of the port, the lower link rate is stored
in port->linkrate.
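
A minimal sketch of the check described above, for illustration only. It
is not the patch's implementation: struct demo_phy, struct demo_port and
demo_check_port_linkrate() are hypothetical stand-ins, and the rate
codes are made up rather than taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rate codes; a faster rate compares greater. */
enum link_rate {
	RATE_1_5_G = 1,
	RATE_3_0_G,
	RATE_6_0_G,
	RATE_12_0_G,
};

/* Hypothetical, stripped-down stand-ins for a phy and a wide port. */
struct demo_phy {
	enum link_rate linkrate;
};

struct demo_port {
	enum link_rate linkrate;
};

/*
 * Called after a phy up event on a phy that belongs to the port.
 * Lowers the port rate if the phy came up slower than the port.
 * Returns true when the port rate changed, so that the caller knows
 * the attached expanders need to be updated as well.
 */
static bool demo_check_port_linkrate(struct demo_port *port,
				     const struct demo_phy *phy)
{
	assert(port != NULL && phy != NULL);

	if (phy->linkrate >= port->linkrate)
		return false;

	port->linkrate = phy->linkrate;
	return true;
}
```

In the example topology, phy0 coming back up at 3.0 G would lower a
12.0 G port to 3.0 G, while a phy coming up at 12.0 G afterwards would
leave the port rate untouched.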

Use the sas_update_linkrate_root_expander() function to update the root
expander link rate. Traverse all expanders connected to the port, check
and update the expander PHYs that need updating, and trigger revalidation.

So are you saying that you want to lower the linkrate on all pathways to the SATA disk? In your example, that would be 3Gbps. If so, won't that affect the end-to-end linkrate of all other attached devices (and lower throughput drastically)?