Re: [PATCH -next] mtd: nand: Add support for Toshiba BENAND (Built-in ECC NAND)

From: KOBAYASHI Yoshitake
Date: Mon Jun 05 2017 - 01:37:48 EST


Hi Boris,

Thank you very much for your detailed review.
I just comming back from busy week for conference and started to look it.

I have a question regarding to the following comment.

>> static int toshiba_nand_init(struct nand_chip *chip)
>> {
>> + struct mtd_info *mtd = nand_to_mtd(chip);
>> +
>> if (nand_is_slc(chip))
>> chip->bbt_options |= NAND_BBT_SCAN2NDPAGE;
>>
>> + if (chip->ecc.mode == NAND_ECC_BENAND) {
>> + chip->ecc.options = NAND_ECC_CUSTOM_PAGE_ACCESS;
>> + chip->ecc.bytes = 0;
>
> It should be set to 16 according to the datasheet. But I guess this is
> not exactly true. I'm pretty sure we can use some of these bytes to
> store real data. Assuming you're using BCH, only 13bytes are needed for
> 8bits/512bytes strength, and I guess the BBM region is also left
> untouched (first 2 bytes of the OOB region).

On BENAND, all OOB reginon can be used by user (driver). The calculated
ECC data stored into other isolated area which is ubable to access from user.
This is why chip->ecc.bytes = 0.
To make sure this specification, I also checked the BENAND datasheet
but unfortunatelly it was undocumented. Sorry for this confusion.

I think I can add comment here or add definition somewhere to describe this
specification.
If you have any preference or suggestion to describe this, please let me know.

Best regards,
Yoshi


On 2017/05/27 1:22, Boris Brezillon wrote:
> Hi KOBAYASHI,
>
> Le Fri, 26 May 2017 10:15:35 +0900,
> KOBAYASHI Yoshitake <yoshitake.kobayashi@xxxxxxxxxxxxx> a Ãcrit :
>
>> This patch enables support for Toshiba BENAND. It was originally posted on
>> https://lkml.org/lkml/2015/7/24/571
>>
>> This patch is rewrote to adapt the recent mtd "separate vendor specific code
>> from core" cleanup process.
>>
>> Signed-off-by: KOBAYASHI Yoshitake <yoshitake.kobayashi@xxxxxxxxxxxxx>
>> ---
>> drivers/mtd/nand/Kconfig | 17 ++++++
>> drivers/mtd/nand/nand_base.c | 15 ++++++
>> drivers/mtd/nand/nand_toshiba.c | 112 +++++++++++++++++++++++++++++++++++++++-
>> include/linux/mtd/nand.h | 1 +
>> 4 files changed, 143 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig
>> index 0bd2319..6634d4b 100644
>> --- a/drivers/mtd/nand/Kconfig
>> +++ b/drivers/mtd/nand/Kconfig
>> @@ -36,6 +36,23 @@ config MTD_NAND_ECC_BCH
>> ECC codes. They are used with NAND devices requiring more than 1 bit
>> of error correction.
>>
>> +config MTD_NAND_BENAND
>> + bool "Support for Toshiba BENAND (Built-in ECC NAND)"
>> + default n
>> + help
>> + This enables support for Toshiba BENAND.
>> + Toshiba BENAND is a SLC NAND solution that automatically
>> + generates ECC inside NAND chip.
>> +
>> +config MTD_NAND_BENAND_ECC_STATUS
>> + bool "Enable ECC Status Read Command(0x7A)"
>> + depends on MTD_NAND_BENAND
>> + default n
>> + help
>> + This enables support for ECC Status Read Command(0x7A) of BENAND.
>> + When this enables, report the real number of bitflips.
>> + In other cases, report the assumud number.
>> +
>
> Please drop the Kconfig options. The amount of code added here is quite
> small, and I don't think we need to compile it out. If the vendor code
> continues to grow we'll see how we want to deal with that, but we're
> not there yet.

OK. I will delete it.

>> config MTD_SM_COMMON
>> tristate
>> default n
>> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
>> index 43722a8..ab8652e 100644
>> --- a/drivers/mtd/nand/nand_base.c
>> +++ b/drivers/mtd/nand/nand_base.c
>> @@ -4186,6 +4186,7 @@ static const char * const nand_ecc_modes[] = {
>> [NAND_ECC_HW_SYNDROME] = "hw_syndrome",
>> [NAND_ECC_HW_OOB_FIRST] = "hw_oob_first",
>> [NAND_ECC_ON_DIE] = "on-die",
>> + [NAND_ECC_BENAND] = "benand",
>
> No. BENAND and on-die ECC are the same thing (not the same
> implementation, but the same feature). Please re-use the existing
> ECC_ON_DIE definition.

I will try to implement by using the existing definition. Thanks.

>> };
>>
>> static int of_get_nand_ecc_mode(struct device_node *np)
>> @@ -4751,6 +4752,19 @@ int nand_scan_tail(struct mtd_info *mtd)
>> ecc->write_oob = nand_write_oob_std;
>> break;
>>
>> + case NAND_ECC_BENAND:
>> + if (!ecc->read_page || !ecc->read_subpage) {
>> + WARN(1, "No ECC functions supplied; benand ECC not possible\n");
>> + ret = -EINVAL;
>> + goto err_free;
>> + }
>> + ecc->write_page = nand_write_page_raw;
>> + ecc->read_page_raw = nand_read_page_raw;
>> + ecc->write_page_raw = nand_write_page_raw;
>> + ecc->read_oob = nand_read_oob_std;
>> + ecc->write_oob = nand_write_oob_std;
>> + break;
>
> Ditto.
>
>> +
>> case NAND_ECC_NONE:
>> pr_warn("NAND_ECC_NONE selected by board driver. This is not recommended!\n");
>> ecc->read_page = nand_read_page_raw;
>> @@ -4831,6 +4845,7 @@ int nand_scan_tail(struct mtd_info *mtd)
>> /* Large page NAND with SOFT_ECC should support subpage reads */
>> switch (ecc->mode) {
>> case NAND_ECC_SOFT:
>> + case NAND_ECC_BENAND:
>
> And here as well.
>
>> if (chip->page_shift > 9)
>> chip->options |= NAND_SUBPAGE_READ;
>> break;
>> diff --git a/drivers/mtd/nand/nand_toshiba.c b/drivers/mtd/nand/nand_toshiba.c
>> index fa787ba..bb3c852 100644
>> --- a/drivers/mtd/nand/nand_toshiba.c
>> +++ b/drivers/mtd/nand/nand_toshiba.c
>> @@ -17,6 +17,97 @@
>>
>> #include <linux/mtd/nand.h>
>>
>> +/* ECC Status Read Command for BENAND */
>> +#define NAND_CMD_ECC_STATUS 0x7A
>
> Can you prefix the name with TOSHIBA_:

Yes. It makes much clear.

> #define TOSHIBA_NAND_CMD_ECC_STATUS 0x7a
>
>> +
>> +/* Recommended to rewrite for BENAND */
>> +#define NAND_STATUS_RECOM_REWRT 0x08
>
> Ditto, add a TOSHIBA somewhere:
>
> #define TOSHIBA_NAND_STATUS_REWRITE_RECOMMENDED BIT(3)
>
> But anyway, I'm not sure we want to use this metric since we have the
> number of corrected bitflips thanks to the TOSHIBA_NAND_CMD_ECC_STATUS
> command.
>
>> +
>> +
>
> Drop the extra empty line.
>
>> +static int toshiba_nand_benand_status_chk(struct mtd_info *mtd,
>> + struct nand_chip *chip)
>
> Try to align parameters on the open parenthesis when you have multiple
> lines.
>
>> +{
>> + unsigned int max_bitflips = 0;
>> + u8 status;
>> +
>> + /* Check Read Status */
>> + chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
>> + status = chip->read_byte(mtd);
>> +
>> + /* timeout */
>> + if (!(status & NAND_STATUS_READY)) {
>> + pr_debug("BENAND : Time Out!\n");
>> + return -EIO;
>> + }
>
> Hm, I think this one has already been checked by the core.
>
>> +
>> + /* uncorrectable */
>> + else if (status & NAND_STATUS_FAIL)
>> + mtd->ecc_stats.failed++;
>
> You have all the information to add the exact number of failures (it's
> a per-sector information, not per-page).
>
>> +
>> + /* correctable */
>> + else if (status & NAND_STATUS_RECOM_REWRT) {
>> + if (chip->cmd_ctrl &&
>> + IS_ENABLED(CONFIG_MTD_NAND_BENAND_ECC_STATUS)) {
>> +
>> + int i;
>> + u8 ecc_status;
>> + unsigned int bitflips;
>> +
>> + /* Check Read ECC Status */
>> + chip->cmd_ctrl(mtd, NAND_CMD_ECC_STATUS,
>> + NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
>> + /* Get bitflips info per 512Byte */
>> + for (i = 0; i < mtd->writesize >> 9; i++) {
>> + ecc_status = chip->read_byte(mtd);
>> + bitflips = ecc_status & 0x0f;
>> + max_bitflips = max_t(unsigned int,
>> + max_bitflips, bitflips);
>> + }
>> + mtd->ecc_stats.corrected += max_bitflips;
>> + } else {
>> + /*
>> + * If can't use chip->cmd_ctrl,
>> + * we can't get real number of bitflips.
>> + * So, we set max_bitflips mtd->bitflip_threshold.
>> + */
>
> And that's exactly the kind of situation I want to avoid. I hate the
> fact that, depending on the NAND controller, we have this feature
> working or not. Well, if it's a real limitation of the
> controller, that's acceptable, but most of the time it's just a driver
> limitation.
>
> I'd like to have the ->exec_op() [1] infrastructure ready before we
> start adding vendor specific commands. It probably needs to be extended
> with an ->supports_op() hook to ask the controller whether it supports a
> specific operation or not and let the core/vendor driver decide whether
> this part can be attached to the controller based on the result.
>
>> + max_bitflips = mtd->bitflip_threshold;
>> + mtd->ecc_stats.corrected += max_bitflips;
>> + }
>> + }
>
> For the if () ... else if () ... blocks please try to do:
>
> if (...) {
> /* comment here */
> ...
> } else if (...) {
> /* comment here */
> ...
> } else {
> /* comment here */
> ...
> }
>
>> +
>> + return max_bitflips;
>> +}
>> +
>> +static int toshiba_nand_read_page_benand(struct mtd_info *mtd,
>> + struct nand_chip *chip, uint8_t *buf,
>> + int oob_required, int page)
>> +{
>> + unsigned int max_bitflips = 0;
>> +
>> + chip->ecc.read_page_raw(mtd, chip, buf, oob_required, page);
>> + max_bitflips = toshiba_nand_benand_status_chk(mtd, chip);
>> +
>> + return max_bitflips;
>> +}
>> +
>> +static int toshiba_nand_read_subpage_benand(struct mtd_info *mtd,
>> + struct nand_chip *chip, uint32_t data_offs,
>> + uint32_t readlen, uint8_t *bufpoi, int page)
>> +{
>> + uint8_t *p;
>> + unsigned int max_bitflips = 0;
>> +
>> + if (data_offs != 0)
>> + chip->cmdfunc(mtd, NAND_CMD_RNDOUT, data_offs, -1);
>> +
>> + p = bufpoi + data_offs;
>> + chip->read_buf(mtd, p, readlen);
>> +
>> + max_bitflips = toshiba_nand_benand_status_chk(mtd, chip);
>> +
>> + return max_bitflips;
>> +}
>> +
>> static void toshiba_nand_decode_id(struct nand_chip *chip)
>> {
>> struct mtd_info *mtd = nand_to_mtd(chip);
>> @@ -33,15 +124,32 @@ static void toshiba_nand_decode_id(struct nand_chip *chip)
>> */
>> if (chip->id.len >= 6 && nand_is_slc(chip) &&
>> (chip->id.data[5] & 0x7) == 0x6 /* 24nm */ &&
>> - !(chip->id.data[4] & 0x80) /* !BENAND */)
>> - mtd->oobsize = 32 * mtd->writesize >> 9;
>> + (chip->id.data[4] & 0x80) /* BENAND */) {
>> + if (IS_ENABLED(CONFIG_MTD_NAND_BENAND))
>> + chip->ecc.mode = NAND_ECC_BENAND;
>
> No, you should not set that explicitly. This choice should be left to
> the user. Now, since the internal ECC engine cannot be disabled here,
> you should fail in toshiba_nand_init() if you are dealing with a BENAND
> and chip->ecc.mode != NAND_ECC_ON_DIE.
>
>
>> + } else {
>> + mtd->oobsize = 32 * mtd->writesize >> 9; /* !BENAND */
>> + }
>
> You're changing the ID decoding logic here.
>
> It should be:
>
> if (chip->id.len >= 6 && nand_is_slc(chip) &&
> (chip->id.data[5] & 0x7) == 0x6 /* 24nm */) {
> if (chip->id.data[4] & 0x80)
> chip->ecc.mode = NAND_ECC_BENAND;
> else
> mtd->oobsize = 32 * mtd->writesize >> 9;
> }
>> }
>>
>> static int toshiba_nand_init(struct nand_chip *chip)
>> {
>> + struct mtd_info *mtd = nand_to_mtd(chip);
>> +
>> if (nand_is_slc(chip))
>> chip->bbt_options |= NAND_BBT_SCAN2NDPAGE;
>>
>> + if (chip->ecc.mode == NAND_ECC_BENAND) {
>> + chip->ecc.options = NAND_ECC_CUSTOM_PAGE_ACCESS;
>> + chip->ecc.bytes = 0;
>
> It should be set to 16 according to the datasheet. But I guess this is
> not exactly true. I'm pretty sure we can use some of these bytes to
> store real data. Assuming you're using BCH, only 13bytes are needed for
> 8bits/512bytes strength, and I guess the BBM region is also left
> untouched (first 2 bytes of the OOB region).

On BENAND, user can control spare area.
BENAND stores ECC parity date into un accessible chip->ecc.bytes


ææèããããããããBENANDãäæãéããèèãããããããããäèååãããããèåãéããããã
ããäãããããããããèæãããã
äæçãããããããéçãéããããã

BENANDããåéé(Spare Area)ãåãããããèçãæãããããããããã"chip->ecc.bytes = 0"ããããããã
ããããããããæèãããããããããã
BENANDãECCããããããããããããããããããããéåãäæãããããã
(åééããããããããååãããããããããéåããã)
ãããããBENANDããåéé(Spare Area)ããECCããããããããäåããããã

>
>> + chip->ecc.strength = 8;
>> + chip->ecc.total = 0;
>
> No need to explicitly set that one, but you should set chip->ecc.size
> to 512.
>
>> + chip->ecc.read_page = toshiba_nand_read_page_benand;
>> + chip->ecc.read_subpage = toshiba_nand_read_subpage_benand;
>> +
>> + mtd_set_ooblayout(mtd, &nand_ooblayout_lp_ops);
>> + }
>
> Should be:
>
>
> if (chip->id.len >= 6 && nand_is_slc(chip) &&
> (chip->id.data[5] & 0x7) == 0x6 &&
> (chip->id.data[4] & 0x80)) {
> /* BENAND */
>
> /*
> * We can't disable the internal ECC engine, the user
> * has to use on-die ECC, there is no alternative.
> */
> if (chip->ecc.mode != NAND_ECC_ON_DIE)
> return -EINVAL;
>
> ....
> }
>
>> +
>> return 0;
>> }
>>
>> diff --git a/include/linux/mtd/nand.h b/include/linux/mtd/nand.h
>> index 3a6a77f..6821334 100644
>> --- a/include/linux/mtd/nand.h
>> +++ b/include/linux/mtd/nand.h
>> @@ -117,6 +117,7 @@ typedef enum {
>> NAND_ECC_HW_SYNDROME,
>> NAND_ECC_HW_OOB_FIRST,
>> NAND_ECC_ON_DIE,
>> + NAND_ECC_BENAND,
>
> As explained above: this is unneeded.
>
>> } nand_ecc_modes_t;
>>
>> enum nand_ecc_algo {
>
> [1]https://github.com/bbrezillon/linux/commits/nand/exec_op1
>
>