Re: [PATCH v2 11/53] mtd: nand: denali: fix bitflips calculation in handle_ecc()

From: Boris Brezillon
Date: Thu Mar 23 2017 - 04:13:06 EST


On Thu, 23 Mar 2017 16:02:02 +0900
Masahiro Yamada <yamada.masahiro@xxxxxxxxxxxxx> wrote:

> Hi Boris,
>
> 2017-03-23 5:57 GMT+09:00 Boris Brezillon <boris.brezillon@xxxxxxxxxxxxxxxxxx>:
> > On Wed, 22 Mar 2017 23:07:18 +0900
> > Masahiro Yamada <yamada.masahiro@xxxxxxxxxxxxx> wrote:
> >
> >> + do {
> >> + err_addr = ioread32(denali->flash_reg + ECC_ERROR_ADDRESS);
> >> + err_sector = ECC_SECTOR(err_addr);
> >> + err_byte = ECC_BYTE(err_addr);
> >> +
> >> + err_cor_info = ioread32(denali->flash_reg + ERR_CORRECTION_INFO);
> >> + err_cor_value = ECC_CORRECTION_VALUE(err_cor_info);
> >> + err_device = ECC_ERR_DEVICE(err_cor_info);
> >> +
> >> + /* reset the bitflip counter when crossing ECC sector */
> >> + if (err_sector != prev_sector)
> >> + bitflips = 0;
> >> +
> >> + if (ECC_ERROR_UNCORRECTABLE(err_cor_info)) {
> >> + /*
> >> + * if the error is not correctable, need to look at the
> >> + * page to see if it is an erased page. if so, then
> >> + * it's not a real ECC error
> >> + */
> >> + ret = -EBADMSG;
> >
> > You should never return -EBADMSG directly. Just increment
> > ecc_stats.failed and let the core return -EBADMSG to the upper layer.
> >
>
> Here, -EBADMSG is used like that returned from ->ecc.correct()
>
>
> Please notice denali_read_page() never returns -EBADMSG.
>
> -EBADMSG is used as a mark "we need erased page check".
>
>
> I think nand_read_page_syndrome() does similar;
> -EBADMSG is used internally.

That's not exactly what happens. nand_read_page_syndrome() calls
ecc->correct() for each chunk, and if this method returns -EBADMSG (and
nand_check_erased_ecc_chunk() returns -EBADMSG too) it increments the
ecc_stats.failed counter.

Here you check all chunks in the same function and only increment
ecc_stats.failed once in denali_read_page() even if several chunks are
uncorrectable.
Your handle_ecc() should act like nand_read_page_syndrome() WRT ECC
checking: check each block one by one, call
nand_check_erased_ecc_chunk() if needed, increment ecc_stats.failed
when an uncorrectable error is detected, and return max_bitflips at the
end.
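
To illustrate the flow described above, here is a rough, untested sketch
in plain userspace C. It is not driver code: struct chunk_result,
check_chunks() and the two-field struct ecc_stats are hypothetical
stand-ins, and the erased_stat field only simulates what
nand_check_erased_ecc_chunk() would return for that chunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the two counters of interest in mtd->ecc_stats. */
struct ecc_stats {
	unsigned int corrected;
	unsigned int failed;
};

/* Hypothetical per-chunk result, simulating what the hardware reported. */
struct chunk_result {
	int stat;        /* corrected bitflips, or -EBADMSG if uncorrectable */
	int erased_stat; /* simulated result of the erased-chunk check:
	                  * bitflips in an erased chunk, or -EBADMSG */
};

/*
 * Check every chunk one by one. An uncorrectable chunk gets a second
 * chance through the (simulated) erased-chunk check. Each chunk that is
 * still uncorrectable bumps stats->failed; -EBADMSG is never returned.
 * The return value is the maximum number of bitflips seen in any chunk.
 */
static int check_chunks(const struct chunk_result *chunks, int nchunks,
			struct ecc_stats *stats)
{
	int max_bitflips = 0;
	int i;

	assert(chunks != NULL && stats != NULL);

	for (i = 0; i < nchunks; i++) {
		int stat = chunks[i].stat;

		/* Assumption: stands in for nand_check_erased_ecc_chunk(). */
		if (stat == -EBADMSG)
			stat = chunks[i].erased_stat;

		if (stat < 0) {
			/* One increment per failing chunk, not per page. */
			stats->failed++;
			continue;
		}

		stats->corrected += stat;
		if (stat > max_bitflips)
			max_bitflips = stat;
	}

	return max_bitflips;
}
```

With this shape, a page containing two uncorrectable chunks increments
the failed counter twice, and the core decides whether to report
-EBADMSG to the upper layer.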