Re: [PATCH v1 4/5] mtd: rawnand: meson: clear OOB buffer before read

From: Miquel Raynal
Date: Tue May 02 2023 - 08:17:17 EST


Hi Arseniy,

Richard, your input is welcome below :-)

> >>>>>>> I just checked JFFS2 mount/umount again, here is what I see:
> >>>>>>> 0) First attempt to mount JFFS2.
> >>>>>>> 1) It writes OOB to page N (I'm using raw write). It is the cleanmarker value 0x85 0x19 0x03 0x20. Mount is done.
> >>>>>>> 2) Umount JFFS2. Done.
> >>>>>>> 3) Second attempt to mount JFFS2.
> >>>>>>> 4) It reads OOB from page N (I'm using raw read). Value is 0x85 0x19 0x03 0x20. Done.
> >>>>>>> 5) It reads page N in ECC mode, and I get:
> >>>>>>>      jffs2: mtd->read(0x100 bytes from N) returned ECC error
> >>>>>>> 6) Mount failed.
> >>>>>>>
> >>>>>>> We already had a problem which looks like this on another device. The solution was to use an OOB area
> >>>>>>> which is not covered by ECC for the JFFS2 cleanmarkers.
> >>>>>
> >>>>> OK, so there are no ECC parity bytes and mtd->read() returns an ECC error.
> >>>>> Does it have to use raw write/read in steps 1) and 4)?
> >>>>>
> >>>>
> >>>> If I'm using non-raw access to OOB, for example writing OOB (user bytes) in ECC mode, then
> >>>> steps 1), 4) and 5) pass OK, but a later write to this page will be impossible (for example JFFS2
> >>>> writes to such pages later) - we can't update the ECC codes properly without erasing the whole page.
> >>>> The write operation will complete without a problem, but reads will trigger ECC errors due to broken
> >>>> ECC codes.
> >>>>
> >>>> In general, the problem we are discussing is that in the current implementation data and OOB conflict
> >>>> with each other by sharing the same ECC codes. These ECC codes can be written only once (without
> >>>> erasing), while data and OOB have different access callbacks and are thus supposed to work
> >>>> separately.
> >>>
> >>> The fact that there might be helpers just for writing OOB areas or just
> >>> in-band areas are optimizations. NAND pages are meant to be written a
> >>> single time, no matter what portion you write. In some cases, it is
> >>> possible to perform subpage writes if the chip supports it. Pages may
> >>> be split into several areas which cover a partial in-band area *and* a
> >>> partial OOB area. If you write into the in-band *or* out-of-band areas
> >>> of a given subpage, you *cannot* write the other part later without
> >>
> >> Thanks for the details! So in the case of JFFS2 it looks strange that it tries
> >> to write a page after having written clean markers to it? In the old vendor's
> >> driver the OOB write callback is suppressed by always returning 0, and JFFS2 works
> >> correctly.
> >
> > Can you point to the code you're mentioning? (both the JFFS2 behaviour which
> > looks strange to you and the old vendor hack)
>
> Here is version of the old vendor's driver:
>
> https://github.com/kszaq/linux-amlogic/blob/master_new_amports/drivers/amlogic/nand/nand/aml_nand.c#L3260
>
> In my version there is no BUG() there, but it is the same driver for the same chip.
>
> About JFFS2 - I didn't check its source code, but what I can see using printk() is that it first
> tries to write the cleanmarker using the OOB write callback. Then later it tries to write to this page, so
> maybe this is unexpected behaviour of JFFS2?
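
To illustrate the write-once constraint discussed above, here is a toy
C model. It is not taken from the meson driver or from JFFS2: every
name in it (toy_page, toy_program, toy_write_oob_ecc, ...) is made up
for the example, and the single XOR byte is only a stand-in for the
real BCH parity. The one property it relies on is that programming
NAND cells can only clear bits until the next erase.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define TOY_DATA_LEN 8	/* toy in-band size, far smaller than a real page */
#define TOY_OOB_LEN  4	/* user OOB bytes, enough for the cleanmarker */

struct toy_page {
	uint8_t data[TOY_DATA_LEN];
	uint8_t oob[TOY_OOB_LEN];
	uint8_t ecc;	/* one XOR byte standing in for real parity bytes */
};

/* Erasing sets every bit to 1. */
static void toy_erase(struct toy_page *p)
{
	memset(p, 0xff, sizeof(*p));
}

/* Programming can only clear bits, so each cell becomes old & new. */
static void toy_program(uint8_t *cell, const uint8_t *src, size_t len)
{
	for (size_t i = 0; i < len; i++)
		cell[i] &= src[i];
}

/* Toy "ECC" covering the in-band data and the user OOB bytes together. */
static uint8_t toy_ecc(const struct toy_page *p)
{
	uint8_t x = 0;

	for (size_t i = 0; i < TOY_DATA_LEN; i++)
		x ^= p->data[i];
	for (size_t i = 0; i < TOY_OOB_LEN; i++)
		x ^= p->oob[i];
	return x;
}

/* ECC-mode OOB write: program the user bytes, then the parity. */
static void toy_write_oob_ecc(struct toy_page *p, const uint8_t *oob)
{
	uint8_t parity;

	toy_program(p->oob, oob, TOY_OOB_LEN);
	parity = toy_ecc(p);
	toy_program(&p->ecc, &parity, 1);
}

/* ECC-mode data write: program the data, then try to update the parity. */
static void toy_write_data_ecc(struct toy_page *p, const uint8_t *data)
{
	uint8_t parity;

	toy_program(p->data, data, TOY_DATA_LEN);
	parity = toy_ecc(p);
	toy_program(&p->ecc, &parity, 1);	/* cannot set bits back to 1 */
}

/* ECC-mode read check: 0 if the stored parity matches, -1 otherwise. */
static int toy_read_check(const struct toy_page *p)
{
	return toy_ecc(p) == p->ecc ? 0 : -1;
}
```

In this model, calling toy_erase() and then toy_write_oob_ecc() with
the cleanmarker 0x85 0x19 0x03 0x20 leaves a page for which
toy_read_check() succeeds. A later toy_write_data_ecc() with data
whose parity needs a bit flipped from 0 back to 1 (for instance a
first byte of 0x40 followed by zeroes) makes toy_read_check() fail,
which is the same shape as the ECC error reported once data and OOB
share parity bytes.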

TBH I am not knowledgeable about JFFS2, maybe Richard can help here.

Are you sure your flash is recognized by JFFS2 as being a NAND device?
Did you enable CONFIG_JFFS2_FS_WRITEBUFFER correctly? I ask because
cleanmarkers seem to be discarded when using a NAND device, and
recognizing the device as a NAND device apparently requires the above
option to be set.
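
The dependency I have in mind can be modelled roughly as below. This
is only a sketch of my reading, not JFFS2's actual code: the names
toy_mtd_type, TOY_JFFS2_FS_WRITEBUFFER and toy_handled_as_nand are
hypothetical, and the macro merely stands in for the real Kconfig
option.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the MTD device types. */
enum toy_mtd_type { TOY_MTD_NOR, TOY_MTD_NAND };

/* Stand-in for CONFIG_JFFS2_FS_WRITEBUFFER; define as 0 to model "off". */
#ifndef TOY_JFFS2_FS_WRITEBUFFER
#define TOY_JFFS2_FS_WRITEBUFFER 1
#endif

/* True only if the option is enabled *and* the device reports NAND. */
static bool toy_handled_as_nand(enum toy_mtd_type type)
{
#if TOY_JFFS2_FS_WRITEBUFFER
	return type == TOY_MTD_NAND;
#else
	(void)type;	/* NAND-specific paths are compiled out */
	return false;
#endif
}
```

With the option modelled as off, even a device reporting itself as
NAND would go through the generic path, so it is worth checking the
.config before digging further into the driver.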

Thanks,
Miquèl