RE: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan NAND Flash Controller

From: Naga Sureshkumar Relli
Date: Mon Jan 28 2019 - 01:05:08 EST


Hi Boris & Miquel,

Could you please share your thoughts on how this driver should support HW-ECC?
As I said previously, the controller provides no way to detect errors beyond N bits.
I am OK with updating the driver based on your inputs.

Thanks,
Naga Sureshkumar Relli

> -----Original Message-----
> From: linux-mtd [mailto:linux-mtd-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Naga
> Sureshkumar Relli
> Sent: Friday, December 21, 2018 1:06 PM
> To: Miquel Raynal <miquel.raynal@xxxxxxxxxxx>
> Cc: robh@xxxxxxxxxx; marek.vasut@xxxxxxxxx; richard@xxxxxx; martin.lund@keep-it-
> simple.com; linux-kernel@xxxxxxxxxxxxxxx; Boris Brezillon <boris.brezillon@xxxxxxxxxxx>;
> linux-mtd@xxxxxxxxxxxxxxxxxxx; nagasuresh12@xxxxxxxxx; Michal Simek
> <michals@xxxxxxxxxx>; computersforpeace@xxxxxxxxx; dwmw2@xxxxxxxxxxxxx
> Subject: RE: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support for Arasan
> NAND Flash Controller
>
> Hi Miquel,
>
> > -----Original Message-----
> > From: Miquel Raynal [mailto:miquel.raynal@xxxxxxxxxxx]
> > Sent: Wednesday, December 19, 2018 7:57 PM
> > To: Naga Sureshkumar Relli <nagasure@xxxxxxxxxx>
> > Cc: Boris Brezillon <boris.brezillon@xxxxxxxxxxx>; robh@xxxxxxxxxx;
> > richard@xxxxxx; linux-kernel@xxxxxxxxxxxxxxx; marek.vasut@xxxxxxxxx;
> > linux-mtd@xxxxxxxxxxxxxxxxxxx; nagasuresh12@xxxxxxxxx; Michal Simek
> > <michals@xxxxxxxxxx>; computersforpeace@xxxxxxxxx;
> > dwmw2@xxxxxxxxxxxxx; martin.lund@xxxxxxxxxxxxxxxxxx
> > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add support
> > for Arasan NAND Flash Controller
> >
> > Hi Naga,
> >
> > + Martin
> >
> > Naga Sureshkumar Relli <nagasure@xxxxxxxxxx> wrote on Tue, 18 Dec 2018
> > 05:33:53 +0000:
> >
> > > Hi Miquel,
> > >
> > > > -----Original Message-----
> > > > From: Miquel Raynal [mailto:miquel.raynal@xxxxxxxxxxx]
> > > > Sent: Monday, December 17, 2018 10:11 PM
> > > > To: Naga Sureshkumar Relli <nagasure@xxxxxxxxxx>
> > > > Cc: Boris Brezillon <boris.brezillon@xxxxxxxxxxx>;
> > > > > robh@xxxxxxxxxx; richard@xxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > > marek.vasut@xxxxxxxxx; linux-mtd@xxxxxxxxxxxxxxxxxxx;
> > > > nagasuresh12@xxxxxxxxx; Michal Simek <michals@xxxxxxxxxx>;
> > > > computersforpeace@xxxxxxxxx; dwmw2@xxxxxxxxxxxxx
> > > > Subject: Re: [LINUX PATCH v12 3/3] mtd: rawnand: arasan: Add
> > > > support for Arasan NAND Flash Controller
> > > >
> > > > Hi Naga,
> > > >
> > > > [...]
> > > >
> > > > > > Inserted biterror @ 48/7
> > > > > > Successfully corrected 25 bit errors per subpage
> > > > > > Inserted biterror @ 50/7
> > > > > > ECC failure, invalid data despite read success
> > > > > root@xilinx-zc1751-dc2-2018_1:~#
> > > > >
> > > > > > But even in this case, the driver reports an ECC failure although the read succeeds.
> > > > > > That means the controller can detect errors on a read page only up to 24 bits.
> > > > > > Beyond that, there is no way to tell the upper layers that the
> > > > > > page is bad, because of the limitation in the controller.
> > > >
> > > > This is more than a "limitation", the design is broken. I am not
> > > > sure how to support such controller, and I am not sure if we even want to.
> > >
> > > > The number of correctable errors is limited by a parameter
> > > > 't' (the total number of errors). If the number of errors is
> > > > greater than 't', the controller won't be able to detect that.
> > > > I guess this concept is the same for other controllers as well.
> > > > In Arasan it is limited to 24 bits.
> > >
> > > > Even in the case of Hamming, there is 1-bit error correction and 2-bit error detection.
> > > > What will happen if there are multiple errors (more than 2 bits)?
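
To make the question above concrete, here is a minimal sketch of an extended Hamming(8,4) code (single error correction, double error detection). It is a toy model for illustration only, not the ECC used by any NAND controller; the function names `encode`, `decode` and `flip` are made up for this example. It shows that one bitflip is corrected, two are detected, and three can be silently "corrected" into wrong data.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword.

    Index 0 holds the overall parity bit; indices 1..7 follow the
    classic Hamming(7,4) layout (parity at 1, 2, 4; data at 3, 5, 6, 7).
    """
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2
    return code


def flip(code, *positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(code)
    for pos in positions:
        out[pos] ^= 1
    return out


def decode(code):
    """Return (status, nibble); nibble is None when the word is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_ok = sum(code) % 2 == 0

    if syndrome == 0 and parity_ok:
        status = "clean"
    elif not parity_ok:
        # Odd number of flips: the decoder assumes exactly one and fixes it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: an even number of flips.
        return "uncorrectable", None

    nibble = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return status, nibble


word = encode(0b1011)
for flipped in [(), (6,), (2, 6), (3, 5, 6)]:
    status, data = decode(flip(word, *flipped))
    print(len(flipped), "bitflips ->", status, data)
```

With three flips the decoder still reports "corrected", but the data it returns belongs to a different codeword. That is the "3+ bf" case: outside what the code guarantees.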
> >
> > Ok let's use the Hamming comparison in your ECC engine case.
> >
> > -> hamming:
> > * 0 bf: everything is fine
> > * 1 bf: will be detected, corrected, signaled
> > * 2 bf: will be detected, not corrected, signaled
> > * 3+ bf: don't care
> >
> > -> BCH:
> > * 0 bf: everything is fine
> > * 1-24 bf: will be detected, corrected, signaled
> > * 25 bf: everything is fine
> > * 26+ bf: don't care
> >
> > Do you see the problem?
> No.
> >
> > In the 25 bf case, the controller is reporting that everything went
> > fine while it should report that it detected an uncorrectable situation.
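
The two lists above can be written down as a small model to make the mismatch explicit. This is only a sketch of the behaviour described in this thread, not driver code; `Report`, `expected_report` and `observed_report` are hypothetical names, and the "observed" function simply encodes the "25 bf: everything is fine" row.

```python
from enum import Enum


class Report(Enum):
    CLEAN = "everything is fine"
    CORRECTED = "detected, corrected, signaled"
    UNCORRECTABLE = "detected, not corrected, signaled"
    UNSPECIFIED = "don't care"


def expected_report(bitflips, strength):
    """What an ECC engine of the given strength is expected to signal."""
    if bitflips == 0:
        return Report.CLEAN
    if bitflips <= strength:
        return Report.CORRECTED
    if bitflips == strength + 1:
        return Report.UNCORRECTABLE
    return Report.UNSPECIFIED


def observed_report(bitflips, strength=24):
    """Behaviour described in this thread for the BCH engine (assumption:
    only the strength + 1 case differs from the expected behaviour)."""
    if bitflips == strength + 1:
        return Report.CLEAN
    return expected_report(bitflips, strength)


# Hamming (strength 1) behaves as expected at 2 bitflips.
print(2, expected_report(2, 1).value)

# List every bitflip count where the observed BCH behaviour deviates.
for n in range(0, 30):
    want, got = expected_report(n, 24), observed_report(n, 24)
    if want is not got:
        print(n, "expected:", want.value, "| observed:", got.value)
```

The only deviation is at strength + 1 bitflips, which is the case the upper layers rely on to learn that a page is bad.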
> >
> > Here are two leads to solve this issue, please investigate them both:
> > 1/ Talk to your colleagues that developed the RTL, ask if there is a
> > hidden/reserved bit for that purpose that is not documented.
> I spoke to the RTL team; there is no hidden/reserved bit for this purpose.
> I tried reading the reserved bits of the status registers, but they are RAZ (read as zero).
>
> > 2/ Search for a status in the registers that might indicate that an
> > error occurred, for instance "0 bf corrected" and "bf have been
> > detected".
> I tried reading status registers and other registers as well, but no luck.
> >
> > NB: I know that, with a BCH ECC engine, error detection at (strength +
> > 1) is not 100% sure but statistically it will almost always be
> > detected and in this case we need the controller to warn the user!
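
A rough counting argument shows why detection is "almost always" possible. A bounded-distance decoder for a t-error-correcting binary BCH code only accepts syndromes that match an error pattern of weight at most t; every other syndrome is reported as uncorrectable. The sketch below computes the fraction of accepted syndromes. The assumptions are loud: the code has full length n = 2^m - 1 with m*t parity bits, m = 14 is only a guess at what a 24-bit engine over 1 KiB chunks might use, and treating the syndrome of an uncorrectable error as uniformly random is a heuristic, not a proof.

```python
from math import comb


def correctable_syndrome_fraction(m, t):
    """Fraction of the 2**(m*t) syndromes that map to an error pattern of
    weight <= t, for a full-length binary code with n = 2**m - 1 bits."""
    n = 2**m - 1
    patterns = sum(comb(n, i) for i in range(t + 1))
    return patterns / 2 ** (m * t)


# Plain Hamming(7,4): every syndrome maps to a correction, so the code
# cannot detect anything on its own (hence the extra parity bit in SECDED).
print(correctable_syndrome_fraction(3, 1))

# Assumed BCH parameters: m = 14, t = 24.
print(correctable_syndrome_fraction(14, 24))
```

Under these assumptions the fraction for t = 24 is close to 1/24!, so an engine that exposes its decoder failure would flag nearly every uncorrectable pattern.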
> Ok, I understood now.
>
> Thanks,
> Naga Sureshkumar Relli
> >
> >
> > Thanks,
> > Miquèl
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/