RE: [EXT] Re: [PATCH v2] mtd: rawnand: Ensure the nand chip supports cached reads

From: Domenico Punzo
Date: Tue Oct 03 2023 - 07:29:43 EST


Micron Confidential

Hello Miquel,

Here is a short list of devices that support cache read with ECC enabled.

MT29F2G08ABAGAH4, MT29F2G08ABBGAH4, MT29F2G16ABBGAH4
MT29F1G08ABAFAH4, MT29F1G08ABBFAH4, MT29F1G16ABBFAH4

Thanks.
Regards,
Domenico P.




-----Original Message-----
From: Martin Hundebøll <martin@geanix.com>
Sent: Thursday, September 28, 2023 9:20 AM
To: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Rouven Czerwinski <r.czerwinski@pengutronix.de>; Måns Rullgård <mans@mansr.com>; Alexander Shiyan <eagle.alexander923@gmail.com>; Richard Weinberger <richard@nod.at>; Vignesh Raghavendra <vigneshr@ti.com>; JaimeLiao <jaimeliao.tw@gmail.com>; kernel@pengutronix.de; stable@vger.kernel.org; linux-mtd@lists.infradead.org; linux-kernel@vger.kernel.org; Sean Nyekjær <sean@geanix.com>; Domenico Punzo <dpunzo@micron.com>; Bean Huo <beanhuo@micron.com>
Subject: [EXT] Re: [PATCH v2] mtd: rawnand: Ensure the nand chip supports cached reads


Hi Miquel,

On Wed, 2023-09-27 at 17:05 +0200, Miquel Raynal wrote:
> Hi Martin,
>
> miquel.raynal@bootlin.com wrote on Tue, 26 Sep 2023 13:27:25 +0200:
>
> > Hi Martin,
> >
> > + Bean and Domenico, there is a question for you below.
> >
> > martin@geanix.com wrote on Mon, 25 Sep 2023 13:01:06 +0200:
> >
> > > Hi Rouven,
> > >
> > > On Fri, 2023-09-22 at 16:17 +0200, Rouven Czerwinski wrote:
> > > > Both the JEDEC and ONFI specifications say that read cache
> > > > sequential support is an optional command. This means that we
> > > > not only need to check whether the individual controller
> > > > supports the command, we also need to check the parameter pages
> > > > for both ONFI and JEDEC NAND flashes before enabling sequential
> > > > cache reads.
> > > >
> > > > This fixes support for NAND flashes which don't support enabling
> > > > cache reads, e.g. Samsung K9F4G08U0F or Toshiba TC58NVG0S3HTA00.
> > > >
> > > > Sequential cache reads are now only available for ONFI and JEDEC
> > > > devices; if individual vendors implement this, it needs to be
> > > > enabled per vendor.
> > > >
> > > > Tested on i.MX6Q with a Samsung NAND flash chip that doesn't
> > > > support sequential reads.
> > > >
> > > > Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads")
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Rouven Czerwinski <r.czerwinski@pengutronix.de>
> > >
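[Editor's sketch] The two-sided check described in the commit message above could look roughly like the following. This is only an illustration, not the patch itself: the struct, the function, and the macro are hypothetical stand-ins, and the bit position (bit 1 of the ONFI optional-commands field) is an assumption that should be verified against the ONFI specification. JEDEC handling is left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in, not the kernel's definition. Assumes that bit 1
 * of the ONFI "optional commands supported" field advertises the read
 * cache commands; check the ONFI spec before relying on this.
 */
#define SKETCH_ONFI_OPT_CMD_READ_CACHE (1u << 1)

/* Hypothetical, simplified view of what the core knows about a chip. */
struct sketch_chip {
	bool is_onfi;          /* chip exposes an ONFI parameter page */
	uint16_t onfi_opt_cmd; /* optional-commands field from that page */
	bool controller_ok;    /* controller can issue sequential cache reads */
};

/* Sequential cache reads are allowed only if both sides support them. */
static bool sketch_cont_read_allowed(const struct sketch_chip *chip)
{
	if (!chip->controller_ok)
		return false;

	/* Chips without a parameter page stay disabled unless a vendor opts in. */
	if (!chip->is_onfi)
		return false;

	return (chip->onfi_opt_cmd & SKETCH_ONFI_OPT_CMD_READ_CACHE) != 0;
}
```

With this shape, a chip that does not expose a parameter page is never given sequential cache reads, whatever the controller supports.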
> > > Thanks for this. It works as expected for my Toshiba chip,
> > > obviously because it doesn't use ONFI or JEDEC.
> > >
> > > Unfortunately, my Micron chip does use ONFI, and it sets the
> > > cached-read-supported bit. It then fails when reading afterwards:
>
> I might have overreacted regarding my findings in Micron's datasheet.
> I need to know if you use the on-die ECC engine or if you use the one
> on the controller. In the former case the failure is expected. In the
> latter case, it's not.

I use the default, which seems to be the controller engine?

// Martin

> Thanks,
> Miquèl
>
> > > kernel: ONFI_OPT_CMD_READ_CACHE # debug added by me
> > > kernel: nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xdc
> > > kernel: nand: Micron MT29F4G08ABAFAWP
> > > kernel: nand: 512 MiB, SLC, erase size: 256 KiB, page size: 4096, OOB size: 256
> > > kernel: nand: continued read supported # debug added by me
> > > kernel: Bad block table found at page 131008, version 0x01
> > > kernel: Bad block table found at page 130944, version 0x01
> > > kernel: 2 fixed-partitions partitions found on MTD device gpmi-nand
> > > kernel: Creating 2 MTD partitions on "gpmi-nand":
> > > kernel: 0x000000000000-0x000000800000 : "boot"
> > > kernel: 0x000000800000-0x000020000000 : "ubi"
> > > kernel: gpmi-nand 1806000.nand-controller: driver registered.
> > >
> > > ...
> > >
> > > kernel: ubi0: default fastmap pool size: 100
> > > kernel: ubi0: default fastmap WL pool size: 50
> > > kernel: ubi0: attaching mtd1
> > > kernel: ubi0: scanning is finished
> > > kernel: ubi0: attached mtd1 (name "ubi", size 504 MiB)
> > > kernel: ubi0: PEB size: 262144 bytes (256 KiB), LEB size: 253952 bytes
> > > kernel: ubi0: min./max. I/O unit sizes: 4096/4096, sub-page size 4096
> > > kernel: ubi0: VID header offset: 4096 (aligned 4096), data offset: 8192
> > > kernel: ubi0: good PEBs: 2012, bad PEBs: 4, corrupted PEBs: 0
> > > kernel: ubi0: user volume: 9, internal volumes: 1, max. volumes count: 128
> > > kernel: ubi0: max/mean erase counter: 4/2, WL threshold: 4096, image sequence number: 1431497221
> > > kernel: ubi0: available PEBs: 12, total reserved PEBs: 2000, PEBs reserved for bad PEB handling: 36
> > > kernel: block ubiblock0_4: created from ubi0:4(rootfs.a)
> > > kernel: ubi0: background thread "ubi_bgt0d" started, PID 36
> > > kernel: block ubiblock0_6: created from ubi0:6(appfs.a)
> > > kernel: block ubiblock0_7: created from ubi0:7(appfs.b)
> > >
> > > ...
> > >
> > > kernel: SQUASHFS error: Unable to read directory block [4b6d15c:ed1]
> > > kernel: SQUASHFS error: Unable to read directory block [4b6f15e:125]
> > > kernel: SQUASHFS error: Unable to read directory block [4b6d15c:1dae]
> > > kernel: SQUASHFS error: Unable to read directory block [4b6d15c:ed1]
> > > (d-sysctl)[55]: systemd-sysctl.service: Failed to set up credentials: Protocol error
> > > kernel: SQUASHFS error: Unable to read directory block [4b73162:14f0]
> > > kernel: SQUASHFS error: Unable to read directory block [4b6f15e:838]
> > > systemd[1]: Starting Create Static Device Nodes in /dev...
> > > kernel: SQUASHFS error: Unable to read directory block [4b6d15c:ed1]
> > > kernel: SQUASHFS error: Unable to read directory block [4b6d15c:ed1]
> > > kernel: SQUASHFS error: Unable to read directory block [4b6f15e:838]
> > > kernel: SQUASHFS error: Unable to read directory block [4b6d15c:1dae]
> > > kernel: SQUASHFS error: Unable to read directory block [4b6f15e:125]
> > >
> > > I've briefly tried adding some error info to the squashfs error
> > > messages, but it looks like it's getting bad data. I.e. one
> > > failure is a sanity check of `dir_count`:
> > >
> > >     if (dir_count > SQUASHFS_DIR_COUNT)
> > >             goto data_error;
> > >
> > > It fails with `dir_count` being 1952803684 ...
> > >
> > > So is this a case of wrong/bad timings?
> > >
> > > Miquel:
> > > I can tell from the code that the READCACHESEQ operations are
> > > followed by NAND_OP_WAIT_RDY(tR_max, tRR_min). From the Micron
> > > datasheet[0], it should be NAND_OP_WAIT_RDY(tRCBSY_max, tRR_min),
> > > where tRCBSY is defined to be between 3 and 25 µs.
> >
> > I found a place in the ONFI spec that states that tRCBSY_max should be
> > between 3 and tR_max, so indeed we should be fine in that regard.
> >
> > However, I asked myself whether we could have issues when crossing
> > boundaries. Block boundaries should be fine, however your device
> > does not support crossing plane boundaries, as bit 4 ("read cache
> > supported") of byte 114 ("Multi-plane operation attributes") in the
> > memory organization block of the parameter page is not set (the
> > value of the byte should be 0x0E if I get it right).
> >
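[Editor's sketch] The bit test described in the paragraph above can be written out as follows. The macro and function names are hypothetical, and the bit position is taken from the message itself (bit 4 of byte 114), not from an independent reading of the parameter page.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical name. Bit 4 of the "Multi-plane operation attributes"
 * byte (byte 114), labelled "read cache supported" in the message above.
 */
#define SKETCH_MP_ATTR_READ_CACHE (1u << 4)

/* Returns true if the multi-plane attributes byte has bit 4 set. */
static bool sketch_mp_read_cache_supported(uint8_t mp_attr)
{
	return (mp_attr & SKETCH_MP_ATTR_READ_CACHE) != 0;
}
```

For the quoted value, 0x0E is 0000 1110 in binary: bits 1 to 3 are set and bit 4 is clear, so sketch_mp_read_cache_supported(0x0E) returns false, matching the conclusion that cached reads across plane boundaries are not advertised.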
> > Anyway, our main issue here does not seem related to the boundaries.
> > It does not seem to be explicitly marked anywhere else but on the
> > front page:
> >     Advanced command set
> >     – Program page cache mode (4)
> >     – Read page cache mode (4)
> >     – Two-plane commands (4)
> >
> >     (4) These commands supported only with ECC disabled.
> >
> > Read page cache mode without ECC makes the feature pretty useless
> > IMHO.
> >
> > Bean, Domenico, how do we know which devices allow ECC correction
> > during sequential page reads and which don't? Is there a (vendor?)
> > bit somewhere in the parameter page for that? Do we have any way to
> > know besides a list of devices allowing that? If so, can you provide
> > one with a few IDs?
> >
> > Thanks,
> > Miquèl