Re: [PATCH v7] PCI/ASPM: Update LTR threshold based upon reported max latencies

From: Matthias Kaehlcke
Date: Fri Nov 18 2022 - 15:27:40 EST


On Fri, Sep 16, 2022 at 01:38:37PM +0530, Krishna chaitanya chundru wrote:
> In ASPM driver, LTR threshold scale and value are updated based on
> tcommon_mode and t_poweron values. In Kioxia NVMe L1.2 is failing due to
> LTR threshold scale and value are greater values than max snoop/non-snoop
> value.
>
> Based on PCIe r4.1, sec 5.5.1, L1.2 substate must be entered when
> reported snoop/no-snoop values is greater than or equal to
> LTR_L1.2_THRESHOLD value.
>
> Signed-off-by: Prasad Malisetty <quic_pmaliset@xxxxxxxxxxx>
> Signed-off-by: Krishna chaitanya chundru <quic_krichai@xxxxxxxxxxx>
> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> ---
>
> I am taking this patch forward as prasad is no more working with our org.
> changes since v6:
> - Rebasing with pci/next.
> changes since v5:
> - no changes, just reposting as standalone patch instead of reply to
> previous patch.
> Changes since v4:
> - Replaced conditional statements with min and max.
> changes since v3:
> - Changed the logic to include this condition "snoop/nosnoop
> latencies are not equal to zero and lower than LTR_L1.2_THRESHOLD"
> Changes since v2:
> - Replaced LTRME logic with max snoop/no-snoop latencies check.
> Changes since v1:
> - Added missing variable declaration in v1 patch
> ---
> drivers/pci/pcie/aspm.c | 30 ++++++++++++++++++++++++++++++
> 1 file changed, 30 insertions(+)
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 928bf64..2bb8470 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -486,13 +486,35 @@ static void aspm_calc_l1ss_info(struct pcie_link_state *link,
> {
> struct pci_dev *child = link->downstream, *parent = link->pdev;
> u32 val1, val2, scale1, scale2;
> + u32 max_val, max_scale, max_snp_scale, max_snp_val, max_nsnp_scale, max_nsnp_val;
> u32 t_common_mode, t_power_on, l1_2_threshold, scale, value;
> u32 ctl1 = 0, ctl2 = 0;
> u32 pctl1, pctl2, cctl1, cctl2;
> + u16 ltr;
> + u16 max_snoop_lat, max_nosnoop_lat;
>
> if (!(link->aspm_support & ASPM_STATE_L1_2_MASK))
> return;
>
> + ltr = pci_find_ext_capability(child, PCI_EXT_CAP_ID_LTR);
> + if (!ltr)
> + return;
> +
> + pci_read_config_word(child, ltr + PCI_LTR_MAX_SNOOP_LAT, &max_snoop_lat);
> + pci_read_config_word(child, ltr + PCI_LTR_MAX_NOSNOOP_LAT, &max_nosnoop_lat);
> +
> + max_snp_scale = (max_snoop_lat & PCI_LTR_SCALE_MASK) >> PCI_LTR_SCALE_SHIFT;
> + max_snp_val = max_snoop_lat & PCI_LTR_VALUE_MASK;
> +
> + max_nsnp_scale = (max_nosnoop_lat & PCI_LTR_SCALE_MASK) >> PCI_LTR_SCALE_SHIFT;
> + max_nsnp_val = max_nosnoop_lat & PCI_LTR_VALUE_MASK;
> +
> + /* choose the greater max scale value between snoop and no snoop value*/
> + max_scale = max(max_snp_scale, max_nsnp_scale);
> +
> + /* choose the greater max value between snoop and no snoop scales */
> + max_val = max(max_snp_val, max_nsnp_val);

I suppose the goal is to dermine the max of snoop/no-snoop latency. Taking the
max of scale and value separately won't yield the correct result when the scale
is different for snoop vs no-snoop. Instead you need to convert scale + value
to ns and determine the max of that (as done for 't_power_on').

> +
> /* Choose the greater of the two Port Common_Mode_Restore_Times */
> val1 = (parent_l1ss_cap & PCI_L1SS_CAP_CM_RESTORE_TIME) >> 8;
> val2 = (child_l1ss_cap & PCI_L1SS_CAP_CM_RESTORE_TIME) >> 8;
> @@ -525,6 +547,14 @@ static void aspm_calc_l1ss_info(struct pcie_link_state *link,
> */
> l1_2_threshold = 2 + 4 + t_common_mode + t_power_on;
> encode_l12_threshold(l1_2_threshold, &scale, &value);
> +
> + /*
> + * Based on PCIe r4.1, sec 5.5.1, L1.2 substate must be entered when reported
> + * snoop/no-snoop values are greater than or equal to LTR_L1.2_THRESHOLD value.
> + */
> + scale = min(scale, max_scale);
> + value = min(value, max_val);

This is wrong for the same reason as above, as Bjorn also pointed out in an earlier
comment.

Even with the values calculated correctly I'm not sure it this is the right
thing to do, but I know little about LTR. In any case this patch reduces
power consumption of the Kioxia NVMe significantly, essentially by
programming an LTR_L1.2_THRESHOLD of 0ns instead of 86ns.

I just came across Bjorn's recent 'PCI/ASPM: Fix L1SS issues' series [1] that
corrects the LTR_L1.2_THRESHOLD calculation and landed upstream, unfortunately
it doesn't magically fix the issues with the Kioxia NVMe :(

[1] https://lore.kernel.org/lkml/ca836507-50ca-13bc-ef88-7f69b1333c99@xxxxxxxxxxxxxxx/T/#m3f223b7e582052159de7538bcaf2bea6d2071472