RE: [PATCH v3] cfi: fix deadloop in cfi_cmdset_0002.c do_write_buffer

From: Tokunori Ikegami
Date: Fri Mar 01 2019 - 12:00:12 EST


> [...]
> >>>>> In function do_write_buffer(), in the for loop, there is a case
> >>>>> chip_ready() returns 1 while chip_good() returns 0, so it never break
> >>>>> the loop.
> >>>>> To fix this, chip_good() is enough and it should timeout if it stay
> >>>>> bad for a while.
> >>>>>
> >>>>> Fixes: dfeae1073583("mtd: cfi_cmdset_0002: Change write buffer to
> >>>>> check correct value")
> >>>>> Signed-off-by: Yi Huaijie <yihuaijie@xxxxxxxxxx>
> >>>>> Signed-off-by: Liu Jian <liujian56@xxxxxxxxxx>
> >>>>> Reviewed-by: Tokunori Ikegami <ikegami_to@xxxxxxxxxxx>
> >>>>> ---
> >>>>> v2->v3:
> >>>>> Follow Vignesh's advice:
> >>>>> add one more check for check_good() even when time_after() returns
> >> true.
> >>>>>
> >>>>> drivers/mtd/chips/cfi_cmdset_0002.c | 2 +-
> >>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c
> >>>>> b/drivers/mtd/chips/cfi_cmdset_0002.c
> >>>>> index 72428b6..3da2376 100644
> >>>>> --- a/drivers/mtd/chips/cfi_cmdset_0002.c
> >>>>> +++ b/drivers/mtd/chips/cfi_cmdset_0002.c
> >>>>> @@ -1876,7 +1876,7 @@ static int __xipram do_write_buffer(struct
> >>>>> map_info *map, struct flchip *chip,
> >>>>> continue;
> >>>>> }
> >>>>>
> >>>>> - if (time_after(jiffies, timeo) && !chip_ready(map, adr))
> >>>>> + if (time_after(jiffies, timeo) && !chip_good(map, adr,
> >>>>> datum))
> >>>>
> >>>> Just another idea to understand easily.
> >>>>
> >>>> unsigned long now = jiffies;
> >>>>
> >>>> if (chip_good(map, adr, datum)) {
> >>>> xip_enable(map, chip, adr);
> >>>> goto op_done;
> >>>> }
> >>>>
> >>>> if (time_after(now, timeo) {
> >>>> break;
> >>>> }
> >>>>
> >>>
> >>> Thank you~. It is more easier to understandï
> >>> If there are no other comments, I will send new patch again ):
> >>
> >> Except this version no longer does what Vignesh suggested. See how you
> >> no longer test if chip_good() is true if time_after() returns true. So,
> >> imagine the thread entering this function is preempted just after the
> >> first chip_good() test, and resumed a few ms later. time_after() will
> >> return true, but chip_good() might also return true, and you ignore it.
> >
> > I think that the following 3 versions will be worked for time_after()
> as a same result and follow the Vignesh-san suggestion.
> >
>
> As Boris explained above version 3 does not really follow my
> suggestion... Please see below
>
> > 1. Original Vignesh-san suggestion
> >
> > if (chip_good(map, adr, datum)) {
> > xip_enable(map, chip, adr);
> > goto op_done;
> > }
> >
> > if (time_after(jiffies, timeo)) {
> > /* Test chip_good() if TRUE incorrectly again so write
> failure by time_after() can be avoided. */
> > if (chip_good(map, adr, datum)) {
> > xip_enable(map, chip, adr);
> > goto op_done;
> > }
> > break;
> > }
> >
>
>
> Right, here we check chip_good() once _even when time_after() is true_
> to avoid _spurious_ timeout
>
> > 2. Liujian-san v3 patch
> >
> > /* Test chip_good() if FALSE correctly so write failure by
> time_after() can be avoided. */
> > if (time_after(jiffies, timeo) && !chip_good(map, adr))
> > break;
> >
> > if (chip_good(map, adr, datum)) {
> > xip_enable(map, chip, adr);
> > goto op_done;
> > }
> >
>
> This is a better version of 1
>
> > 3. My idea
> >
> > /* Save current jiffies value before chip_good() to avoid write
> failure by time_after() without testing chip_good() again. */
> > unsigned long now = jiffies;
> >
> > if (chip_good(map, adr, datum)) {
> > xip_enable(map, chip, adr);
> > goto op_done;
> > }
> >
>
> What if thread gets pre-empted at this point and is re-scheduled exactly
> after timeo jiffies have elapsed? Below check would be true and exit loop

I think that the jiffies value now is save before chip_good() so below check would be false and not exit loop.

Regards,
Ikegami

>
> > if (time_after(now, timeo))
> > break;
> >
>
> So, code does not check for check chip_good() after timeoout has
> elapsed. But chip_good() might be true at this point. Therefore leading
> to spurious timeout. So this version is still not right.
>
> > Note: Some brackets have been fixed from the previous mail.
> >
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/