Re: [PATCH v4 11/13] hwmon: peci: Add dimmtemp driver

From: Guenter Roeck
Date: Tue Nov 23 2021 - 10:56:13 EST


On Tue, Nov 23, 2021 at 03:07:04PM +0100, Iwona Winiarska wrote:
> Add peci-dimmtemp driver for Temperature Sensor on DIMM readings that
> are accessible via the processor PECI interface.
>
> The main use case for the driver (and PECI interface) is out-of-band
> management, where we're able to obtain thermal readings from an external
> entity connected with PECI, e.g. BMC on server platforms.
>
> Co-developed-by: Jae Hyun Yoo <jae.hyun.yoo@xxxxxxxxxxxxxxx>
> Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@xxxxxxxxxxxxxxx>
> Signed-off-by: Iwona Winiarska <iwona.winiarska@xxxxxxxxx>
> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@xxxxxxxxxxxxxxx>
> ---

[ ... ]

> +static int check_populated_dimms(struct peci_dimmtemp *priv)
> +{
> +	int chan_rank_max = priv->gen_info->chan_rank_max;
> +	int dimm_idx_max = priv->gen_info->dimm_idx_max;
> +	u32 chan_rank_empty = 0;
> +	u64 dimm_mask = 0;
> +	int chan_rank, dimm_idx, ret;
> +	u32 pcs;
> +
> +	BUILD_BUG_ON(BITS_PER_TYPE(chan_rank_empty) < CHAN_RANK_MAX);
> +	BUILD_BUG_ON(BITS_PER_TYPE(dimm_mask) < DIMM_NUMS_MAX);
> +	if (chan_rank_max * dimm_idx_max > DIMM_NUMS_MAX) {
> +		WARN_ONCE(1, "Unsupported number of DIMMs - chan_rank_max: %d, dimm_idx_max: %d",
> +			  chan_rank_max, dimm_idx_max);
> +		return -EINVAL;
> +	}
> +
> +	for (chan_rank = 0; chan_rank < chan_rank_max; chan_rank++) {
> +		ret = peci_pcs_read(priv->peci_dev, PECI_PCS_DDR_DIMM_TEMP, chan_rank, &pcs);
> +		if (ret) {
> +			/*
> +			 * Overall, we expect either success or -EINVAL in
> +			 * order to determine whether DIMM is populated or not.
> +			 * For anything else we fall back to deferring the
> +			 * detection to be performed at a later point in time.
> +			 */
> +			if (ret == -EINVAL) {
> +				chan_rank_empty |= BIT(chan_rank);
> +				continue;
> +			}
> +
> +			return -EAGAIN;
> +		}
> +
> +		for (dimm_idx = 0; dimm_idx < dimm_idx_max; dimm_idx++)
> +			if (__dimm_temp(pcs, dimm_idx))
> +				dimm_mask |= BIT(chan_rank * dimm_idx_max + dimm_idx);
> +	}
> +
> +	/*
> +	 * If we got all -EINVALs, it means that the CPU doesn't have any
> +	 * DIMMs. Unfortunately, it may also happen at the very start of
> +	 * host platform boot. Retrying a couple of times lets us make sure
> +	 * that the state is persistent.
> +	 */
> +	if (chan_rank_empty == GENMASK(chan_rank_max - 1, 0)) {
> +		if (priv->no_dimm_retry_count < NO_DIMM_RETRY_COUNT_MAX) {
> +			priv->no_dimm_retry_count++;
> +
> +			return -EAGAIN;
> +		} else {
> +			return -ENODEV;
> +		}

Static analyzers will complain "else after return is unnecessary".
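
For illustration, a minimal standalone sketch of the same retry branch with
the else dropped. The struct retry_state, the helper name no_dimm_result()
and the value of NO_DIMM_RETRY_COUNT_MAX are placeholders made up for this
example, not taken from the patch; only the control flow is the point.

```c
#include <errno.h>

/* Placeholder limit for this sketch; the patch defines its own value. */
#define NO_DIMM_RETRY_COUNT_MAX 120

/* Hypothetical stand-in for the relevant part of struct peci_dimmtemp. */
struct retry_state {
	int no_dimm_retry_count;
};

/*
 * Same decision as the quoted branch: retry with -EAGAIN until the
 * limit is reached, then give up with -ENODEV. Since the first branch
 * always returns, the second return needs no else.
 */
static int no_dimm_result(struct retry_state *priv)
{
	if (priv->no_dimm_retry_count < NO_DIMM_RETRY_COUNT_MAX) {
		priv->no_dimm_retry_count++;

		return -EAGAIN;
	}

	return -ENODEV;
}
```

Behavior is unchanged, and the final return loses one level of nesting.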

Guenter