Re: fsl_ifc_nand: are blank pages protected by ECC?

From: Pavel Machek
Date: Wed Apr 19 2017 - 18:15:15 EST


Hi!

> > We have some problems with fsl_ifc_nand ... in the old kernels, but
> > this one does not seem to be fixed in v4.11, either.
> >
> > UBIFS complains:
> >
> > UBIFS error (pid 931): ubifs_scan: corrupt empty space at LEB 282:252630
> > UBIFS error (pid 931): ubifs_scanned_corruption: corruption at LEB 282:252630
> > UBIFS error (pid 931): ubifs_scanned_corruption: first 1322 bytes from LEB 282:252630
> > UBIFS error (pid 931): ubifs_scan: LEB 282 scanning failed
> >
> > Possible explanation is here:
> >
> > https://e2e.ti.com/support/dsp/davinci_digital_media_processors/f/716/t/289605
> >
> > # I see on the forum that this issue has been raised before - my
> > # understanding is that the omap2 nand driver does not perform ECC
> > # detection/correction on empty pages so when UBIFS checks the empty
> > # space data and doesn't read all 0xFF then it fails and mounts
> > # read-only. I didn't find any good solution - only a workaround to
> > # remove the UBIFS check..
> >
> > So I checked fsl_ifc_nand.c in v4.11-rc, and yes, it seems to have the
> > same problem:
> >
> > 	if (errors == 15) {
> > 		/*
> > 		 * Uncorrectable error.
> > 		 * OK only if the whole page is blank.
> > 		 *
> > 		 * We disable ECCER reporting due to...
> > 		 * erratum IFC-A002770 -- so report it now if we
> > 		 * see an uncorrectable error in ECCSTAT.
> > 		 */
> > 		if (!is_blank(mtd, bufnum))
> > 			ctrl->nand_stat |=
> > 				IFC_NAND_EVTER_STAT_ECCER;
> > 		break;
> > 	}
> >
> > is_blank() checks for all 0xff's, so a single bitflip (one byte reading
> > 0xfe, say) will result in is_blank() == 0 and an uncorrectable error
> > being signaled.
> >
> > Should the driver be modified somehow?
>
> Yep, nand_check_erased_ecc_chunk() [1] is here to help you check this
> case, unfortunately, it's not directly applicable here, because this
> function takes regular pointers and not __iomem ones. You'll either
> have to copy the data in an intermediate buffer before calling
> nand_check_erased_ecc_chunk(), or cast the SRAM region to a void
> pointer (which is usually not a good idea). The last option would be to
> open code nand_check_erased_ecc_chunk(), but I'd really like to avoid
> that (for maintainability concerns).

Ok, thanks a lot for the pointer, that should be doable.
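
To make sure I understand the "intermediate buffer" option, here is a
minimal userspace sketch of the idea. It is not driver code: the names
copy_from_sram(), erased_bitflips() and check_blank_page() are made up
for illustration, the volatile pointer merely stands in for an __iomem
region, and a real driver would presumably use memcpy_fromio() followed
by nand_check_erased_ecc_chunk() instead of these helpers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an I/O-safe copy out of controller SRAM. */
static void copy_from_sram(uint8_t *dst, const volatile uint8_t *src,
			   size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i];
}

/*
 * Count bits that read as 0 in a buffer that should be all 0xff.
 * Returns the count, or -1 once it exceeds the threshold.
 */
static int erased_bitflips(const uint8_t *buf, size_t len, int threshold)
{
	int flips = 0;

	for (size_t i = 0; i < len; i++) {
		unsigned int inv = (uint8_t)~buf[i];

		/* Clear the lowest set bit until none are left. */
		while (inv) {
			inv &= inv - 1;
			flips++;
		}
		if (flips > threshold)
			return -1;
	}
	return flips;
}

/*
 * Copy the page into a regular buffer, then decide whether it is
 * "blank enough". On success the buffer is forced to all 0xff so the
 * caller sees a clean erased page.
 */
static int check_blank_page(const volatile uint8_t *sram, uint8_t *out,
			    size_t len, int threshold)
{
	int flips;

	copy_from_sram(out, sram, len);
	flips = erased_bitflips(out, len, threshold);
	if (flips > 0)
		memset(out, 0xff, len);
	return flips;
}
```

With a threshold equal to the ECC strength, a page with a stray 0xfe
would be reported as blank with one corrected bitflip rather than as an
uncorrectable error.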

Core of the code is:

	for (; len >= sizeof(long);
	     len -= sizeof(long), bitmap += sizeof(long)) {
		weight = hweight_long(*((unsigned long *)bitmap));
		bitflips += BITS_PER_LONG - weight;
		if (unlikely(bitflips > bitflips_threshold))
			return -EBADMSG;
	}

Someone clearly optimized this code (took care to do long accesses
etc), but afaict hweight is quite a heavy operation:

_GLOBAL(__arch_hweight32)
BEGIN_FTR_SECTION
	b	__sw_hweight32
	nop
	nop
	nop
	nop
	nop
	nop
FTR_SECTION_ELSE
	BEGIN_FTR_SECTION_NESTED(51)
		PPC_POPCNTB(R3,R3)
		srdi	r4,r3,16
		add	r3,r4,r3
		srdi	r4,r3,8
		add	r3,r4,r3
		clrldi	r3,r3,64-8
		blr
	FTR_SECTION_ELSE_NESTED(51)
		PPC_POPCNTW(R3,R3)
		clrldi	r3,r3,64-8
		blr
	ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 51)
ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
EXPORT_SYMBOL(__arch_hweight32)
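
For a feel of what the branch to a software fallback costs on CPUs
without a popcount instruction, here is the classic bit-twiddling
population count in C. This is only the textbook form; I have not
checked that it matches what __sw_hweight32 actually compiles to on
this platform, and the name sw_popcount32() is made up for the example.

```c
#include <stdint.h>

/* Textbook SWAR population count: a dozen or so ALU ops per word. */
static unsigned int sw_popcount32(uint32_t w)
{
	/* 2-bit fields each hold the count of their two bits. */
	w -= (w >> 1) & 0x55555555u;
	/* 4-bit fields each hold the count of their four bits. */
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u);
	/* 8-bit fields each hold the count of their eight bits. */
	w = (w + (w >> 4)) & 0x0f0f0f0fu;
	/* Sum the four byte counts into the top byte. */
	return (w * 0x01010101u) >> 24;
}
```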

Would it make sense to only do hweight if *bitmap != ~0UL, since a
blank page is almost entirely all-ones words? Would it make sense to
only check for bitflips > bitflips_threshold every 128 bytes or
something like that?
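
Roughly what I have in mind, as a userspace sketch rather than a patch.
The function name count_flips_fast() is made up, __builtin_popcountl()
and __builtin_popcount() are GCC/Clang builtins standing in for
hweight_long() and hweight8(), and the -1 return stands in for -EBADMSG.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define WORD_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Count 0 bits in a buffer expected to be all 0xff, skipping the
 * popcount entirely for words that are already all ones.
 * Returns the bitflip count, or -1 if it exceeds the threshold.
 */
static int count_flips_fast(const unsigned char *buf, size_t len,
			    int threshold)
{
	int flips = 0;
	size_t i = 0;

	for (; i + sizeof(unsigned long) <= len; i += sizeof(unsigned long)) {
		unsigned long w;

		/* memcpy avoids alignment assumptions about buf. */
		memcpy(&w, buf + i, sizeof(w));
		if (w == ~0UL)
			continue;	/* common case: nothing to count */

		flips += WORD_BITS - __builtin_popcountl(w);
		if (flips > threshold)
			return -1;
	}

	/* Trailing bytes that do not fill a whole word. */
	for (; i < len; i++) {
		flips += CHAR_BIT - __builtin_popcount(buf[i]);
		if (flips > threshold)
			return -1;
	}

	return flips;
}
```

Written this way the threshold test only runs for words that actually
contain a flipped bit, so a separate "check every 128 bytes" scheme may
not be needed at all. Whether any of this is measurable next to the
cost of reading the page is something I have not benchmarked.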

Thanks and best regards,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
