Re: [PATCH v3 3/3] hwmon: (peci/dimmtemp) Add Sapphire Rapids support

From: Naresh Solanki
Date: Thu Jul 20 2023 - 03:50:22 EST


Hi Iwona,

On Thu, 20 Jul 2023 at 01:35, Winiarska, Iwona
<iwona.winiarska@xxxxxxxxx> wrote:
>
> On Wed, 2023-07-19 at 20:41 +0200, Naresh Solanki wrote:
> > From: Patrick Rudolph <patrick.rudolph@xxxxxxxxxxxxx>
> >
> > This patch extends the hwmon peci/dimmtemp driver to support the
> > Sapphire Rapids platform.
> >
> > Sapphire Rapids can accommodate up to 8 CPUs, each with 16 DIMMs. To
> > accommodate this configuration, the maximum supported DIMM count is
> > increased, and the corresponding Sapphire Rapids ID and threshold code
> > are added.
> >
> > The patch has been tested on a 4S system with 64 DIMMs installed.
> > Default thresholds are utilized for Sapphire Rapids, as accessing the
> > threshold requires accessing the UBOX device on Uncore bus 0, which can
> > only be achieved using MSR access. The non-PCI-compliant MMIO BARs are
> > not available for this purpose.
> >
> > Signed-off-by: Patrick Rudolph <patrick.rudolph@xxxxxxxxxxxxx>
> > Signed-off-by: Naresh Solanki <Naresh.Solanki@xxxxxxxxxxxxx>
> > Acked-by: Guenter Roeck <linux@xxxxxxxxxxxx>
> > ---
> > Changes in V3:
> > - Update Acked-by in commit message.
> > Changes in V2:
> > - Update subject.
> > ---
> > drivers/hwmon/peci/dimmtemp.c | 24 +++++++++++++++++++++++-
> > 1 file changed, 23 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/hwmon/peci/dimmtemp.c b/drivers/hwmon/peci/dimmtemp.c
> > index ed968401f93c..edafbfd66fef 100644
> > --- a/drivers/hwmon/peci/dimmtemp.c
> > +++ b/drivers/hwmon/peci/dimmtemp.c
> > @@ -30,8 +30,10 @@
> > #define DIMM_IDX_MAX_ON_ICX 2
> > #define CHAN_RANK_MAX_ON_ICXD 4
> > #define DIMM_IDX_MAX_ON_ICXD 2
> > +#define CHAN_RANK_MAX_ON_SPR 128
>
> Where was this number taken from?
> Single CPU has 8 channels (not 128), and dimmtemp hwmon binds to a single CPU.
>
> > +#define DIMM_IDX_MAX_ON_SPR 2
> >
> > -#define CHAN_RANK_MAX CHAN_RANK_MAX_ON_HSX
> > +#define CHAN_RANK_MAX CHAN_RANK_MAX_ON_SPR
>
> Then - there's no need for changing the MAX value.
>
> > #define DIMM_IDX_MAX DIMM_IDX_MAX_ON_HSX
> > #define DIMM_NUMS_MAX (CHAN_RANK_MAX * DIMM_IDX_MAX)
> >
> > @@ -530,6 +532,15 @@ read_thresholds_icx(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u
> > return 0;
> > }
> >
> > +static int
> > +read_thresholds_spr(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
> > +{
> > + /* Use defaults */
> > + *data = (95 << 16) | (90 << 8);
> > +
> > + return 0;
> > +}
> > +
>
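For reference, the hardcoded default above packs two temperatures into one 32-bit word. The sketch below assumes, from the shifts alone, that bits 23:16 carry the critical threshold and bits 15:8 the max threshold; the helper names are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/*
 * Assumed layout, inferred from "(95 << 16) | (90 << 8)":
 *   bits 23:16 -> critical threshold, degrees C
 *   bits 15:8  -> max threshold, degrees C
 * All names here are illustrative only.
 */
static uint32_t pack_thresholds(uint8_t crit, uint8_t max)
{
	return ((uint32_t)crit << 16) | ((uint32_t)max << 8);
}

/* Extract the field assumed to hold the critical threshold. */
static uint8_t thresh_crit(uint32_t data)
{
	return (data >> 16) & 0xff;
}

/* Extract the field assumed to hold the max threshold. */
static uint8_t thresh_max(uint32_t data)
{
	return (data >> 8) & 0xff;
}
```

Under that assumption, pack_thresholds(95, 90) yields the same word as the patch's default.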
> Rather than hardcoding the defaults, it should be possible to compute it in a
> similar way to ICX (and with that - commit message should be updated).
> We're starting from 1e:00.2 instead of 13:00.2, and offsets within IMC start
> from 0x219a8 with 0x8000 shift.
> It would look like this (note - not tested on actual SPR):
Thanks for the input. Will test & keep you posted.

Regards,
Naresh
>
> static int
> read_thresholds_spr(struct peci_dimmtemp *priv, int dimm_order, int chan_rank, u32 *data)
> {
> 	u32 reg_val;
> 	u64 offset;
> 	int ret;
> 	u8 dev;
>
> 	ret = peci_ep_pci_local_read(priv->peci_dev, 0, 30, 0, 2, 0xd4, &reg_val);
> 	if (ret || !(reg_val & BIT(31)))
> 		return -ENODATA; /* Use default or previous value */
>
> 	ret = peci_ep_pci_local_read(priv->peci_dev, 0, 30, 0, 2, 0xd0, &reg_val);
> 	if (ret)
> 		return -ENODATA; /* Use default or previous value */
>
> 	/*
> 	 * Device 26, Offset 219a8: IMC 0 channel 0 -> rank 0
> 	 * Device 26, Offset 299a8: IMC 0 channel 1 -> rank 1
> 	 * Device 27, Offset 219a8: IMC 1 channel 0 -> rank 2
> 	 * Device 27, Offset 299a8: IMC 1 channel 1 -> rank 3
> 	 * Device 28, Offset 219a8: IMC 2 channel 0 -> rank 4
> 	 * Device 28, Offset 299a8: IMC 2 channel 1 -> rank 5
> 	 * Device 29, Offset 219a8: IMC 3 channel 0 -> rank 6
> 	 * Device 29, Offset 299a8: IMC 3 channel 1 -> rank 7
> 	 */
> 	dev = 26 + chan_rank / 2;
> 	offset = 0x219a8 + dimm_order * 4 + (chan_rank % 2) * 0x8000;
>
> 	ret = peci_mmio_read(priv->peci_dev, 0, GET_CPU_SEG(reg_val), GET_CPU_BUS(reg_val),
> 			     dev, 0, offset, data);
> 	if (ret)
> 		return ret;
>
> 	return 0;
> }
>
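The device and offset arithmetic in the suggested function can be checked off-target against the table in its comment. This is only a sketch: the helper names and the 8-rank bound are assumptions for illustration (the bound follows from the "8 channels per CPU" remark), and nothing here has been run against real SPR hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bound: 8 channel ranks per CPU, per the review comment. */
#define SPR_CHAN_RANKS 8

/* PCI device number of the IMC serving a channel rank (two ranks per IMC). */
static unsigned int spr_imc_dev(int chan_rank)
{
	assert(chan_rank >= 0 && chan_rank < SPR_CHAN_RANKS);
	return 26 + chan_rank / 2;
}

/*
 * MMIO offset of the threshold register: base 0x219a8, 4 bytes per DIMM
 * slot, and a 0x8000 shift for the second channel of each IMC.
 */
static uint64_t spr_thresh_offset(int chan_rank, int dimm_order)
{
	assert(chan_rank >= 0 && chan_rank < SPR_CHAN_RANKS);
	assert(dimm_order >= 0);
	return 0x219a8 + (uint64_t)dimm_order * 4 +
	       (uint64_t)(chan_rank % 2) * 0x8000;
}
```

For example, rank 3 with DIMM 0 maps to device 27 at offset 0x299a8, which matches the fourth row of the comment.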
> Thanks
> -Iwona
>
> > static const struct dimm_info dimm_hsx = {
> > .chan_rank_max = CHAN_RANK_MAX_ON_HSX,
> > .dimm_idx_max = DIMM_IDX_MAX_ON_HSX,
> > @@ -572,6 +583,13 @@ static const struct dimm_info dimm_icxd = {
> > .read_thresholds = &read_thresholds_icx,
> > };
> >
> > +static const struct dimm_info dimm_spr = {
> > + .chan_rank_max = CHAN_RANK_MAX_ON_SPR,
> > + .dimm_idx_max = DIMM_IDX_MAX_ON_SPR,
> > + .min_peci_revision = 0x40,
> > + .read_thresholds = &read_thresholds_spr,
> > +};
> > +
> > static const struct auxiliary_device_id peci_dimmtemp_ids[] = {
> > {
> > .name = "peci_cpu.dimmtemp.hsx",
> > @@ -597,6 +615,10 @@ static const struct auxiliary_device_id peci_dimmtemp_ids[] = {
> > .name = "peci_cpu.dimmtemp.icxd",
> > .driver_data = (kernel_ulong_t)&dimm_icxd,
> > },
> > + {
> > + .name = "peci_cpu.dimmtemp.spr",
> > + .driver_data = (kernel_ulong_t)&dimm_spr,
> > + },
> > { }
> > };
> > MODULE_DEVICE_TABLE(auxiliary, peci_dimmtemp_ids);
>