Re: [BUG] mtd: cfi_cmdset_0002: write regression since v4.17-rc1

From: Tokunori Ikegami
Date: Sun Feb 20 2022 - 07:22:46 EST


Hi Ahmad-san,

Could you please try the version 2 patch attached for the error case?
This version is to check the DQ true data 0xFF by chip_good().
But I am not sure if this works or not since the error is possible to be caused by Hi-Z 0xff on floating bus or etc.

On 2022/02/15 3:46, Tokunori Ikegami wrote:
Hi Ahmad-san,

On 2022/02/15 1:22, Ahmad Fatoum wrote:
Hello Tokunori-san,

On 13.02.22 17:47, Tokunori Ikegami wrote:
Hi Ahmad-san,

Thanks for your confirmations. Sorry for late to reply.
No worries. I appreciate you taking the time.

Could you please try the patch attached to disable the chip_good() change as before?
I think this should work for S29GL964N since the chip_ready() is used and works as mentioned.
yes, this resolves my issue:
Tested-by: Ahmad Fatoum <a.fatoum@xxxxxxxxxxxxxx>
Thanks for your testing. I have just sent the patch to review.

Doesn't seem to be a buffered write issue here though as the writes
did work fine before dfeae1073583. Any other ideas?
At first I thought the issue is possible to be resolved by using the word write instead of the buffered writes.
Now I am thinking to disable the changes dfeae1073583 partially with any condition if possible.
What seems to work for me is checking if chip_good or chip_ready
and map_word is equal to 0xFF. I can't justify why this is ok though.
(Worst case bus is floating at this point of time and Hi-Z is read
as 0xff on CPU data lines...)
Sorry I am not sure about this.
I thought the chip_ready() itself is correct as implemented as the data sheet in the past.
But it did not work correctly so changed to use chip_good() instead as it is also correct.
What exactly in the datasheet makes you believe chip_good is not appropriate?
I just mentioned about the actual issue behaviors as not worked chip_good() on S29GL964N and not worked chip_ready() on MX29GL512FHT2I-11G before etc.
Anyway let me recheck the data sheet details as just checked it again quickly but needed more investigation to understand.

As far as I checked still both chip_good() and chip_ready() seem correct but still the root cause is unknown.
If as you mentioned the issue was cased by the DQ true data 0xFF I am not sure why the read work without any error after the write operation.
Also if the error was caused by the Hi-Z 0xff on floating bus as mentioned I am not sure why the read work without any error after the write operation with chip_ready().
Sorry anyway the root cause is also unknown when the write operation was changed to use chip_good() instead of chip_ready().

Regards,
Ikegami


Regards,
Ikegami


Cheers,
Ahmad

From 2bef02bee8fa74273cfc764e288b6f92b8646bb7 Mon Sep 17 00:00:00 2001
From: Tokunori Ikegami <ikegami.t@xxxxxxxxx>
Date: Sat, 19 Feb 2022 19:39:32 +0900
Subject: [PATCH v2] mtd: cfi_cmdset_0002: Change chip_good() to check DQ true
data 0xFF

The regression issue has been caused on S29GL064N and reported it.
The change mentioned for regression is to use chip_good() for buffered write.
Also it seems that the 0xFF value is read on the error case.
It is possible to be caused by DQ true data described by S29GL064N datasheet.
So change chip_good() to check DQ true data 0xFF additionally for the error.

Fixes: dfeae1073583("mtd: cfi_cmdset_0002: Change write buffer to check correct value")
Signed-off-by: Tokunori Ikegami <ikegami.t@xxxxxxxxx>
Cc: linux-mtd@xxxxxxxxxxxxxxxxxxx
Cc: stable@xxxxxxxxxxxxxxx
Link: https://lore.kernel.org/linux-mtd/cedb1604-e024-2738-5b33-15703a653803@xxxxxxxxx/
---
drivers/mtd/chips/cfi_cmdset_0002.c | 25 +++++++++++++++----------
1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
index a761134fd3be..079f69e5400d 100644
--- a/drivers/mtd/chips/cfi_cmdset_0002.c
+++ b/drivers/mtd/chips/cfi_cmdset_0002.c
@@ -853,7 +853,7 @@ static int __xipram chip_ready(struct map_info *map, struct flchip *chip,
*
*/
static int __xipram chip_good(struct map_info *map, struct flchip *chip,
- unsigned long addr, map_word expected)
+ unsigned long addr, map_word *expected)
{
struct cfi_private *cfi = map->fldrv_priv;
map_word oldd, curd;
@@ -875,8 +875,13 @@ static int __xipram chip_good(struct map_info *map, struct flchip *chip,
oldd = map_read(map, addr);
curd = map_read(map, addr);

- return map_word_equal(map, oldd, curd) &&
- map_word_equal(map, curd, expected);
+ if (!map_word_equal(map, oldd, curd))
+ return 0;
+
+ if (expected && map_word_equal(map, curd, *expected))
+ return 1;
+
+ return map_word_equal(map, oldd, map_word_ff(map));
}

static int get_chip(struct map_info *map, struct flchip *chip, unsigned long adr, int mode)
@@ -1699,7 +1704,7 @@ static int __xipram do_write_oneword_once(struct map_info *map,
* "chip_good" to avoid the failure due to scheduling.
*/
if (time_after(jiffies, timeo) &&
- !chip_good(map, chip, adr, datum)) {
+ !chip_good(map, chip, adr, &datum)) {
xip_enable(map, chip, adr);
printk(KERN_WARNING "MTD %s(): software timeout\n", __func__);
xip_disable(map, chip, adr);
@@ -1707,7 +1712,7 @@ static int __xipram do_write_oneword_once(struct map_info *map,
break;
}

- if (chip_good(map, chip, adr, datum)) {
+ if (chip_good(map, chip, adr, &datum)) {
if (cfi_check_err_status(map, chip, adr))
ret = -EIO;
break;
@@ -1979,14 +1984,14 @@ static int __xipram do_write_buffer_wait(struct map_info *map,
* "chip_good" to avoid the failure due to scheduling.
*/
if (time_after(jiffies, timeo) &&
- !chip_good(map, chip, adr, datum)) {
+ !chip_good(map, chip, adr, &datum)) {
pr_err("MTD %s(): software timeout, address:0x%.8lx.\n",
__func__, adr);
ret = -EIO;
break;
}

- if (chip_good(map, chip, adr, datum)) {
+ if (chip_good(map, chip, adr, &datum)) {
if (cfi_check_err_status(map, chip, adr))
ret = -EIO;
break;
@@ -2282,7 +2287,7 @@ static int do_panic_write_oneword(struct map_info *map, struct flchip *chip,
udelay(1);
}

- if (!chip_good(map, chip, adr, datum) ||
+ if (!chip_good(map, chip, adr, &datum) ||
cfi_check_err_status(map, chip, adr)) {
/* reset on all failures. */
map_write(map, CMD(0xF0), chip->start);
@@ -2478,7 +2483,7 @@ static int __xipram do_erase_chip(struct map_info *map, struct flchip *chip)
chip->erase_suspended = 0;
}

- if (chip_good(map, chip, adr, map_word_ff(map))) {
+ if (chip_good(map, chip, adr, NULL)) {
if (cfi_check_err_status(map, chip, adr))
ret = -EIO;
break;
@@ -2577,7 +2582,7 @@ static int __xipram do_erase_oneblock(struct map_info *map, struct flchip *chip,
chip->erase_suspended = 0;
}

- if (chip_good(map, chip, adr, map_word_ff(map))) {
+ if (chip_good(map, chip, adr, NULL)) {
if (cfi_check_err_status(map, chip, adr))
ret = -EIO;
break;
--
2.32.0