Re: [PATCH] coccinelle: semantic patch to check for inappropriate do_div() calls

From: Julia Lawall
Date: Sat Jan 04 2020 - 08:55:02 EST




--- Please note the new email address ---


On Sat, 4 Jan 2020, Wen Yang wrote:

>
>
> On 2020/1/4 4:55 äå, Julia Lawall wrote:
> > On Sat, 4 Jan 2020, Wen Yang wrote:
> >
> > >
> > >
> > > On 2020/1/4 3:00 äå, Julia Lawall wrote:
> > > > On Sat, 4 Jan 2020, Wen Yang wrote:
> > > >
> > > > > do_div() does a 64-by-32 division.
> > > > > When the divisor is unsigned long, u64, or s64,
> > > > > do_div() truncates it to 32 bits, this means it
> > > > > can test non-zero and be truncated to zero for division.
> > > > > This semantic patch is inspired by Mateusz Guzik's patch:
> > > > > commit b0ab99e7736a ("sched: Fix possible divide by zero in avg_atom()
> > > > > calculation")
> > > > >
> > > > > Signed-off-by: Wen Yang <wenyang@xxxxxxxxxxxxxxxxx>
> > > > > Cc: Julia Lawall <Julia.Lawall@xxxxxxx>
> > > > > Cc: Gilles Muller <Gilles.Muller@xxxxxxx>
> > > > > Cc: Nicolas Palix <nicolas.palix@xxxxxxx>
> > > > > Cc: Michal Marek <michal.lkml@xxxxxxxxxxx>
> > > > > Cc: Matthias Maennich <maennich@xxxxxxxxxx>
> > > > > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > > Cc: Masahiro Yamada <yamada.masahiro@xxxxxxxxxxxxx>
> > > > > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > > > Cc: cocci@xxxxxxxxxxxxxxx
> > > > > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > > > > ---
> > > > > scripts/coccinelle/misc/do_div.cocci | 66
> > > > > ++++++++++++++++++++++++++++++++++++
> > > > > 1 file changed, 66 insertions(+)
> > > > > create mode 100644 scripts/coccinelle/misc/do_div.cocci
> > > > >
> > > > > diff --git a/scripts/coccinelle/misc/do_div.cocci
> > > > > b/scripts/coccinelle/misc/do_div.cocci
> > > > > new file mode 100644
> > > > > index 0000000..f1b72d1
> > > > > --- /dev/null
> > > > > +++ b/scripts/coccinelle/misc/do_div.cocci
> > > > > @@ -0,0 +1,66 @@
> > > > > +// SPDX-License-Identifier: GPL-2.0-only
> > > > > +/// do_div() does a 64-by-32 division.
> > > > > +/// When the divisor is unsigned long, u64, or s64,
> > > > > +/// do_div() truncates it to 32 bits, this means it
> > > > > +/// can test non-zero and be truncated to zero for division.
> > > > > +///
> > > > > +//# This makes an effort to find those inappropriate do_div () calls.
> > > > > +//
> > > > > +// Confidence: Moderate
> > > > > +// Copyright: (C) 2020 Wen Yang, Alibaba.
> > > > > +// Comments:
> > > > > +// Options: --no-includes --include-headers
> > > > > +
> > > > > +virtual context
> > > > > +virtual org
> > > > > +virtual report
> > > > > +
> > > > > +@depends on context@
> > > > > +expression f;
> > > > > +long l;
> > > > > +unsigned long ul;
> > > > > +u64 ul64;
> > > > > +s64 sl64;
> > > > > +
> > > > > +@@
> > > > > +(
> > > > > +* do_div(f, l);
> > > > > +|
> > > > > +* do_div(f, ul);
> > > > > +|
> > > > > +* do_div(f, ul64);
> > > > > +|
> > > > > +* do_div(f, sl64);
> > > > > +)
> > > > > +
> > > > > +@r depends on (org || report)@
> > > > > +expression f;
> > > > > +long l;
> > > > > +unsigned long ul;
> > > > > +position p;
> > > > > +u64 ul64;
> > > > > +s64 sl64;
> > > > > +@@
> > > > > +(
> > > > > +do_div@p(f, l);
> > > > > +|
> > > > > +do_div@p(f, ul);
> > > > > +|
> > > > > +do_div@p(f, ul64);
> > > > > +|
> > > > > +do_div@p(f, sl64);
> > > > > +)
> > > > > +
> > > > > +@script:python depends on org@
> > > > > +p << r.p;
> > > > > +@@
> > > > > +
> > > > > +msg="WARNING: WARNING: do_div() does a 64-by-32 division, which may
> > > > > truncation the divisor to 32-bit"
> > > > > +coccilib.org.print_todo(p[0], msg)
> > > > > +
> > > > > +@script:python depends on report@
> > > > > +p << r.p;
> > > > > +@@
> > > > > +
> > > > > +msg="WARNING: WARNING: do_div() does a 64-by-32 division, which may
> > > > > truncation the divisor to 32-bit"
> > > > > +coccilib.report.print_report(p[0], msg)
> > > >
> > > > A few small issues: You have WARNING: twice in each case, and truncation
> > > > should be truncate.
> > > >
> > >
> > > Thanks for your comments, we will fix it soon.
> > >
> > > > Is there any generic strategy for fixing these issues?
> > > >
> > >
> > > We have done some experiments, such as:
> > > https://lkml.org/lkml/2020/1/2/1354
> >
> > Thanks. Actually, I would appreciate knowing about such experiments when
> > the semantic patch is submitted, since eg in this case I am not really an
> > expert in this issue.
> >
> > >
> > > - avg = rec->time;
> > > - do_div(avg, rec->counter);
> > > + avg = div64_ul(rec->time, rec->counter);
> > >
> > > --> Function replacement was performed here,
> > > and simple code cleanup was also performed.
> > >
> > >
> > > - do_div(stddev, rec->counter * (rec->counter - 1) * 1000);
> > > + stddev = div64_ul(stddev,
> > > + rec->counter * (rec->counter - 1) * 1000);
> > >
> > > --> Only the function replacement is performed here (because the variable
> > > âstddevâ corresponds to a more complicated equation, cleaning it will
> > > reduce
> > > readability).
> >
> > Would it be reasonable to extend the warning to say "consider using
> > div64_ul instead"? Or do you think it is obvious to everyone?
> >
>
> Thank you for your comments.
> We plan to modify it as follows:
> msg="WARNING: do_div() does a 64-by-32 division, please consider using
> div64_ul, div64_long, div64_u64 or div64_s64 instead."

This message seems helpful, thanks.

julia

>
> > > In addition, there are some codes that do not need to be modified:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/net/can/dev.c#n263
> >
> > Would it be worth having a special case for constants and checking whether
> > the value is obviously safe and no warning is needed?
> >
> Thanks.
> This is very valuable in reducing false positives, and we'll try to implement
> it.
>
> --
> Best Wishes,
> Wen
>
> > > So we just print a warning.
> > > As for how to fix it, we need to analyze the code carefully.
> > >
> > > --
> > > Best Wishes,
> > > Wen
> > >
> > >
> > >
> >
>