Re: [PATCH v2] mtd: rawnand: brcmnand: Initial exec_op implementation

From: Miquel Raynal
Date: Fri Oct 06 2023 - 03:50:24 EST


Hi William,

william.zhang@xxxxxxxxxxxx wrote on Thu, 5 Oct 2023 17:42:21 -0700:

> Hi Miquel,
>
> On 10/03/2023 09:47 PM, William Zhang wrote:
> >
> >
> > On 10/03/2023 03:55 PM, Miquel Raynal wrote:
> >> Hi William,
> >>
> >> william.zhang@xxxxxxxxxxxx wrote on Tue, 3 Oct 2023 11:46:25 -0700:
> >>
> >>> Hi Miquel,
> >>>
> >>> On 10/03/2023 02:28 AM, Miquel Raynal wrote:
> >>>> Hi William,
> >>>>
> >>>> william.zhang@xxxxxxxxxxxx wrote on Mon, 2 Oct 2023 12:57:01 -0700:
> >>>>> Hi Miquel,
> >>>>>
> >>>>> On 10/02/2023 05:35 AM, Miquel Raynal wrote:
> >>>>>> Hi David,
> >>>>>>
> >>>>>> dregan@xxxxxxxx wrote on Sat, 30 Sep 2023 03:57:35 +0200:
> >>>>>>> Initial exec_op implementation for Broadcom STB, Broadband and iProc SoC
> >>>>>>> This adds exec_op and removes the legacy interface.
> >>>>>>>
> >>>>>>> Signed-off-by: David Regan <dregan@xxxxxxxx>
> >>>>>>> Reviewed-by: William Zhang <william.zhang@xxxxxxxxxxxx>
> >>>>>>>
> >>>>>>> ---
> >>>>>>>
> >>>>>> ...
> >>>>>>> +static int brcmnand_parser_exec_matched_op(struct nand_chip *chip,
> >>>>>>> +                     const struct nand_subop *subop)
> >>>>>>> +{
> >>>>>>> +    struct brcmnand_host *host = nand_get_controller_data(chip);
> >>>>>>> +    struct brcmnand_controller *ctrl = host->ctrl;
> >>>>>>> +    struct mtd_info *mtd = nand_to_mtd(chip);
> >>>>>>> +    const struct nand_op_instr *instr = &subop->instrs[0];
> >>>>>>> +    unsigned int i;
> >>>>>>> +    int ret = 0;
> >>>>>>> +
> >>>>>>> +    for (i = 0; i < subop->ninstrs; i++) {
> >>>>>>> +        instr = &subop->instrs[i];
> >>>>>>> +
> >>>>>>> +        if ((instr->type == NAND_OP_CMD_INSTR) &&
> >>>>>>> +            (instr->ctx.cmd.opcode == NAND_CMD_STATUS))
> >>>>>>> +            ctrl->status_cmd = 1;
> >>>>>>> +        else if (ctrl->status_cmd && (instr->type == NAND_OP_DATA_IN_INSTR)) {
> >>>>>>> +            /*
> >>>>>>> +             * need to fake the nand device write protect because nand_base does a
> >>>>>>> +             * nand_check_wp which calls nand_status_op NAND_CMD_STATUS which checks
> >>>>>>> +             * that the nand is not write protected before an operation starts.
> >>>>>>> +             * The problem with this is it's done outside exec_op so the nand is
> >>>>>>> +             * write protected and this check will fail until the write or erase
> >>>>>>> +             * or write back operation actually happens where we turn off wp.
> >>>>>>> +             */
> >>>>>>> +            u8 *in;
> >>>>>>> +
> >>>>>>> +            ctrl->status_cmd = 0;
> >>>>>>> +
> >>>>>>> +            instr = &subop->instrs[i];
> >>>>>>> +            in = instr->ctx.data.buf.in;
> >>>>>>> +            in[0] = brcmnand_status(host) | NAND_STATUS_WP; /* hide WP status */
> >>>>>>
> >>>>>> I don't understand why you are faking the WP bit. If it's set,
> >>>>>> brcmnand_status() should return it and you should not care about it. If
> >>>>>> it's not however, can you please give me the path used when we have
> >>>>>> this issue? Either we need to modify the core or we need to provide
> >>>>>> additional helpers in this driver to circumvent the faulty path.
> >>>>>
> >>>>> The reason we have to hide wp status for status command is because
> >>>>> nand_base calls nand_check_wp at the very beginning of write and erase
> >>>>> function. This applies to both exec_op path and legacy path. With
> >>>>> Broadcom nand controller and most of our board design using the WP pin
> >>>>> and have it asserted by default, the nand_check_wp function will fail
> >>>>> and write/erase aborts.  This workaround has been there before this
> >>>>> exec_op patch.
> >>>>>
> >>>>> I agree it is ugly and better to be addressed in the nand base code. And
> >>>>> I understand Broadcom's WP approach may sound a bit over cautious but we
> >>>>> want to make sure no spurious erase/write can happen under any
> >>>>> circumstance except software explicitly want to write and erase. WP is
> >>>>> standard nand chip pin and I think most the nand controller has that
> >>>>> pin in the design too but it is possible it is not used and
> >>>>> bootloader can de-assert the pin and have a always-writable nand flash
> >>>>> for linux. So maybe we can add nand controller dts option "nand-use-wp".
> >>>>> If this property exist and set to 1,  wp control is in use and nand
> >>>>> driver need to control the pin on/off as needed when doing write and
> >>>>> erase function. Also nand base code should not call nand_check_wp when
> >>>>> wp is in use. Then we can remove the faking WP status workaround.
> >>>>>>> +        } else if (instr->type == NAND_OP_WAITRDY_INSTR) {
> >>>>>>> +            ret = bcmnand_ctrl_poll_status(host, NAND_CTRL_RDY, NAND_CTRL_RDY, 0);
> >>>>>>> +            if (ctrl->wp_cmd) {
> >>>>>>> +                ctrl->wp_cmd = 0;
> >>>>>>> +                brcmnand_wp(mtd, 1);
> >>>>>>
> >>>>>> This ideally should disappear.
> >>>>>>
> >>>>> Maybe we can have the destructive operation patch from Boris.
> >>>>> Controller driver still need to assert/deassert the pin if it uses nand
> >>>>> wp feature but at least it does not need to guess the op code.
> >>>>
> >>>> Ah, yeah, I get it.
> >>>>
> >>>> Please be my guest, you can revive this patch series (might need light
> >>>> tweaking, nothing big) and also take inspiration from it if necessary:
> >>>> https://github.com/bbrezillon/linux/commit/e612e1f2c69a33ac5f2c91d13669f0f172d58717
> >>>> https://github.com/bbrezillon/linux/commit/4ec6f8d8d83f5aaca5d1877f02d48da96d41fcba
> >>>> https://github.com/bbrezillon/linux/commit/11b4acffd761c4928652d7028d19fcd6f45e4696
> >>> Sure we will incorporate the destructive operation patch and provide a
> >>> new revision.
> >>>
> >>> The WP status workaround will stay at least for this change. If you
> >>> think my suggestion using a dts setting above is okay, we can provide a
> >>> patch for that as well.  Or if you have any other idea or suggestion,
> >>> we'd like to hear too.
> >>
> >> I thought this was not needed as Boris initial conversion did not need
> >> it. The goal is to get rid of this workaround.
> >>
> > Boris' initial patch did remove that workaround but it will break the
> > board that uses WP pin because the nand_check_wp run before the exec_op
> > and status returned is write-protected in the erase and write function.
> > I explained that above and you can see the code here:
> > https://elixir.bootlin.com/linux/v6.6-rc4/source/drivers/mtd/nand/raw/nand_base.c#L4599
> >
> > I agree with your goal to remove this workaround and we have suggested
> > one possible fix but we are also open to any other solution.
> >
> We have integrated the destructive operation patch and are ready for the
> v3. If you don't think my proposal on the WP status fix is a good idea,
> can we get this exec_op conversion patch series going first? After all,
> we don't modify the WP status handling behavior in this patch. We can
> fix it in another patch whenever we agree on a solution. Please let me
> know and thanks a lot for all your comments and thoughts.

The NAND core has been a playground for coding horrors sometimes, and
this ->exec_op() conversion is our way to a cleaner and mastered
approach. I am not willing to let something that obvious get in, I'm
sorry. For you it's just a workaround, for me it means any change in
the core risks breaking this controller.

This is of course not against you or your work, perhaps I should
emphasize that I strongly appreciate your efforts and, besides this
workaround, the code is clean.

The problem is that the WP pin can be used in two different ways:
internally and externally. When it's used externally, you expect it
to be deasserted before you start a destructive operation. When you use
it internally, you expect it to be deasserted during the destructive
operation.

The final solution needs to be validated by comparing with
similar drivers which also perform this internal procedure themselves.
Maybe we could add a flag somewhere in the core's controller
structure to tell the core not to perform these checks, as we master the
handling of the WP pin: the controller will handle it
correctly as long as the destructive flag is passed.
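
To make the idea concrete, here is a rough, standalone sketch of what I
have in mind. It is a userspace model only, not kernel code: the
structure, the flag name (handles_wp) and the fake_*() helpers are all
made up for illustration and do not exist in the core. The only thing
taken from reality is the meaning of the status WP bit (set when the
chip is not write protected).

```c
#include <stdbool.h>
#include <stdint.h>

/* Status bit 7: set when the chip is NOT write protected. */
#define STATUS_WP 0x80

/* Hypothetical model, not the real core controller structure. */
struct fake_controller {
	bool handles_wp;	/* assumed flag: controller drives WP itself */
	bool wp_asserted;	/* current state of the WP pin */
};

/* Status as the chip would report it: WP bit clear while the pin is asserted. */
static uint8_t fake_status(const struct fake_controller *c)
{
	return c->wp_asserted ? 0x00 : STATUS_WP;
}

/* Core-side check done before a destructive operation; true means "protected". */
static bool fake_core_check_wp(const struct fake_controller *c)
{
	if (c->handles_wp)
		return false;	/* skip: controller deasserts WP during the op */

	return !(fake_status(c) & STATUS_WP);
}

/* Destructive operation: with the flag set, WP is deasserted only for its duration. */
static int fake_erase(struct fake_controller *c)
{
	int ret;

	if (fake_core_check_wp(c))
		return -1;	/* external usage: WP must already be deasserted */

	if (c->handles_wp)
		c->wp_asserted = false;

	/* The operation only succeeds if the chip is writable right now. */
	ret = (fake_status(c) & STATUS_WP) ? 0 : -1;

	if (c->handles_wp)
		c->wp_asserted = true;

	return ret;
}
```

With the flag clear and WP asserted, the early check fails exactly like
in your case. With the flag set, the check is skipped and the pin is
only released around the operation, so no faked status is needed.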

Thanks, Miquèl