Re: [PATCH v4] i2c: designware: Fix corrupted memory seen in the ISR

From: Serge Semin
Date: Mon Nov 13 2023 - 04:48:31 EST


On Thu, Nov 09, 2023 at 03:19:27AM +0000, Jan Bottorff wrote:
> When running on a many core ARM64 server, errors were
> happening in the ISR that looked like corrupted memory. These
> corruptions would fix themselves if small delays were inserted
> in the ISR. Errors reported by the driver included "i2c_designware
> APMC0D0F:00: i2c_dw_xfer_msg: invalid target address" and
> "i2c_designware APMC0D0F:00:controller timed out" during
> in-band IPMI SSIF stress tests.
>
> The problem was determined to be memory writes in the driver were not
> becoming visible to all cores when execution rapidly shifted between
> cores, like when a register write immediately triggers an ISR.
> Processors with weak memory ordering, like ARM64, make no
> guarantees about the order normal memory writes become globally
> visible, unless barrier instructions are used to control ordering.
>
> To solve this, regmap accessor functions configured by this driver
> were changed to use non-relaxed forms of the low-level register
> access functions, which include a barrier on platforms that require
> it. This assures memory writes before a controller register access are
> visible to all cores. The community concluded defaulting to correct
> operation outweighed defaulting to the small performance gains from
> using relaxed access functions. Being a low speed device added weight to
> this choice of default register access behavior.
>
> Signed-off-by: Jan Bottorff <janb@xxxxxxxxxxxxxxxxxxxxxx>

The patch has already been merged in, but in anyway:
Tested-by: Serge Semin <fancer.lancer@xxxxxxxxx>
Reviewed-by: Serge Semin <fancer.lancer@xxxxxxxxx>

For the record. Next time don't forget to add all the reviewers to the
Cc-list on the patch(set) respin, otherwise the new version might get
to be missed amongst the other messages in their inbox. That in its
turn won't let them test it out and finish the review on time.

-Serge(y)

> ---
> ChangeLog
> v3->v4: add missing changelog
> v2->v3: regmap accessors use non-relaxed form instead of wmb barrier
> v1->v2: Commit message improvements
> v1: insert wmb barrier before enabling interrupts
>
> drivers/i2c/busses/i2c-designware-common.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-designware-common.c b/drivers/i2c/busses/i2c-designware-common.c
> index affcfb243f0f..35f762872b8a 100644
> --- a/drivers/i2c/busses/i2c-designware-common.c
> +++ b/drivers/i2c/busses/i2c-designware-common.c
> @@ -63,7 +63,7 @@ static int dw_reg_read(void *context, unsigned int reg, unsigned int *val)
> {
> struct dw_i2c_dev *dev = context;
>
> - *val = readl_relaxed(dev->base + reg);
> + *val = readl(dev->base + reg);
>
> return 0;
> }
> @@ -72,7 +72,7 @@ static int dw_reg_write(void *context, unsigned int reg, unsigned int val)
> {
> struct dw_i2c_dev *dev = context;
>
> - writel_relaxed(val, dev->base + reg);
> + writel(val, dev->base + reg);
>
> return 0;
> }
> @@ -81,7 +81,7 @@ static int dw_reg_read_swab(void *context, unsigned int reg, unsigned int *val)
> {
> struct dw_i2c_dev *dev = context;
>
> - *val = swab32(readl_relaxed(dev->base + reg));
> + *val = swab32(readl(dev->base + reg));
>
> return 0;
> }
> @@ -90,7 +90,7 @@ static int dw_reg_write_swab(void *context, unsigned int reg, unsigned int val)
> {
> struct dw_i2c_dev *dev = context;
>
> - writel_relaxed(swab32(val), dev->base + reg);
> + writel(swab32(val), dev->base + reg);
>
> return 0;
> }
> @@ -99,8 +99,8 @@ static int dw_reg_read_word(void *context, unsigned int reg, unsigned int *val)
> {
> struct dw_i2c_dev *dev = context;
>
> - *val = readw_relaxed(dev->base + reg) |
> - (readw_relaxed(dev->base + reg + 2) << 16);
> + *val = readw(dev->base + reg) |
> + (readw(dev->base + reg + 2) << 16);
>
> return 0;
> }
> @@ -109,8 +109,8 @@ static int dw_reg_write_word(void *context, unsigned int reg, unsigned int val)
> {
> struct dw_i2c_dev *dev = context;
>
> - writew_relaxed(val, dev->base + reg);
> - writew_relaxed(val >> 16, dev->base + reg + 2);
> + writew(val, dev->base + reg);
> + writew(val >> 16, dev->base + reg + 2);
>
> return 0;
> }
> --
> 2.34.1
>