Re: [PATCH v3 4/6] mtd: rawnand: add NVIDIA Tegra NAND Flash controller driver

From: Boris Brezillon
Date: Sat Jun 09 2018 - 02:55:54 EST


On Sat, 9 Jun 2018 08:46:15 +0200
Boris Brezillon <boris.brezillon@xxxxxxxxxxx> wrote:

> On Sat, 9 Jun 2018 08:41:57 +0200
> Boris Brezillon <boris.brezillon@xxxxxxxxxxx> wrote:
>
> > On Sat, 09 Jun 2018 08:23:51 +0200
> > Stefan Agner <stefan@xxxxxxxx> wrote:
> >
> > > On 09.06.2018 07:52, Boris Brezillon wrote:
> > > > On Fri, 08 Jun 2018 23:51:01 +0200
> > > > Stefan Agner <stefan@xxxxxxxx> wrote:
> > > >
> > > >
> > > >> >
> > > > >> > void tegra_nand_controller_reset(struct tegra_nand_controller *ctrl)
> > > > >> > {
> > > > >> > 	int err;
> > > > >> >
> > > > >> > 	disable_irq(ctrl->irq);
> > > > >> >
> > > > >> > 	err = reset_control_reset(ctrl->rst);
> > > > >> > 	if (err) {
> > > > >> > 		dev_err(ctrl->dev, "Failed to reset HW: %d\n", err);
> > > > >> > 		msleep(HW_TIMEOUT);
> > > > >> > 	}
> > > > >> >
> > > > >> > 	writel_relaxed(NAND_CMD_STATUS, ctrl->regs + HWSTATUS_CMD);
> > > > >> > 	writel_relaxed(HWSTATUS_MASK, ctrl->regs + HWSTATUS_MASK);
> > > > >> > 	writel_relaxed(INT_MASK, ctrl->regs + ISR);
> > > >>
> > > >> If we do a controller reset, there is much more state than that which
> > > >> needs to be restored. A lot of it is not readily available currently
> > > >> (timing, ECC settings...)
> > > >
> > > > This is actually a good test to detect what is not properly initialized
> > > > by the driver. Timings should be configured correctly through
> > > > ->setup_data_interface(). ECC engine should be disabled by default and
> > > > only enabled when ->{read,write}_page() is called.
> > > >
> > >
> > > Is setup_data_interface guaranteed to be called after a failed
> > > ->exec_op()/{read,write}_page()?
> >
> > No. Maybe I misunderstood when tegra_nand_controller_reset() was
> > supposed to be called. That's something I would call only once, early
> > in the probe function, so that the controller is placed in a well-known
> > state before we start using it. Definitely not something you should
> > call after each error.
>
> Note that if you really want to reset the controller after an error,
> you should also iterate over all chips and call nand_reset() on them.

And it's clearly not possible to call nand_reset() from ->exec_op():
you might recurse indefinitely in ->exec_op() if it keeps failing,
because nand_reset() relies on ->exec_op() to reset the chip.
So, as you said initially, it's not a good idea to reset the controller
in this case. But maybe you can clear the interrupts, mask them and
cancel the current operation (if any).
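
To make the last suggestion a bit more concrete, here is a rough
user-space model of such an error path. It is not driver code: the
register names (FAKE_COMMAND, FAKE_ISR, FAKE_IER), the bit definitions
and the fake_*() helpers are all made up for illustration, and the
order of the three steps is just one plausible choice. The only point
is that the failure path quiesces the controller without resetting it
and without calling nand_reset(), so nothing can re-enter ->exec_op().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file; the real Tegra layout is not modelled. */
enum fake_reg { FAKE_COMMAND, FAKE_ISR, FAKE_IER, FAKE_REG_COUNT };

#define FAKE_COMMAND_GO	(1u << 31)	/* assumed "operation in flight" bit */
#define FAKE_INT_ALL	0xffu		/* assumed interrupt status/enable bits */

struct fake_ctrl {
	uint32_t regs[FAKE_REG_COUNT];
};

/* Error path: no controller reset, no nand_reset(), just quiesce. */
void fake_ctrl_abort(struct fake_ctrl *c)
{
	assert(c);

	c->regs[FAKE_IER] = 0;				/* mask all interrupts */
	c->regs[FAKE_COMMAND] &= ~FAKE_COMMAND_GO;	/* cancel the operation, if any */
	c->regs[FAKE_ISR] &= ~FAKE_INT_ALL;		/* clear pending interrupts */
}

/* Stand-in for ->exec_op(): returns 0 on success, -1 on a simulated timeout. */
int fake_exec_op(struct fake_ctrl *c, int simulate_timeout)
{
	c->regs[FAKE_IER] = FAKE_INT_ALL;		/* unmask interrupts */
	c->regs[FAKE_COMMAND] |= FAKE_COMMAND_GO;	/* start the operation */

	if (simulate_timeout) {
		c->regs[FAKE_ISR] |= 0x1u;		/* stale status left behind */
		fake_ctrl_abort(c);
		return -1;
	}

	/* Success: the operation completes and interrupts are masked again. */
	c->regs[FAKE_COMMAND] &= ~FAKE_COMMAND_GO;
	c->regs[FAKE_IER] = 0;
	return 0;
}
```

After a simulated timeout, all three fake registers read back as zero,
and a subsequent fake_exec_op() call starts from that clean state. A
real driver would return a proper errno such as -ETIMEDOUT rather than
-1.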