Re: [PATCH] module: print module name on refcount error

From: Michal Hocko
Date: Tue Jul 04 2023 - 09:05:40 EST


On Tue 04-07-23 14:43:12, Jean Delvare wrote:
> Hi Michal,
>
> On Wed, 28 Jun 2023 12:30:35 +0200, Michal Hocko wrote:
> > On Mon 26-06-23 12:32:52, Jean Delvare wrote:
> > > If module_put() triggers a refcount error, include the culprit
> > > > module name in the warning message, to ease further investigation of
> > > the issue.
> > >
> > > Signed-off-by: Jean Delvare <jdelvare@xxxxxxx>
> > > Suggested-by: Michal Hocko <mhocko@xxxxxxxx>
> > > Cc: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> > > ---
> > > kernel/module/main.c | 4 +++-
> > > 1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > --- linux-6.3.orig/kernel/module/main.c
> > > +++ linux-6.3/kernel/module/main.c
> > > @@ -850,7 +850,9 @@ void module_put(struct module *module)
> > > >  	if (module) {
> > > >  		preempt_disable();
> > > >  		ret = atomic_dec_if_positive(&module->refcnt);
> > > > -		WARN_ON(ret < 0); /* Failed to put refcount */
> > > > +		WARN(ret < 0,
> > > > +		     KERN_WARNING "Failed to put refcount for module %s\n",
> > > > +		     module->name);
> >
> > Would it make sense to also print the refcnt here? In our internal bug
> > report it has turned out that this was an overflow (put missing) rather
> > than an underflow (too many put calls). Seeing the value could give a
> > clue about that. We had to configure panic_on_warn to capture a dump to
> > learn more which is rather impractical.
>
> Well, other calls to module_put() or try_module_get() could happen in
> parallel, so at the time we print refcnt, its value could be different
> from the one which triggered the WARN.

Races with module_put should be impossible because all of them should
fail, right? Races with put are possible but we do not need an exact
value to tell the difference between over and underflow, no?
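
To illustrate the point about not needing an exact value, here is a
minimal userspace sketch using C11 atomics. It is not kernel code and not
a proposed patch. The names put_ref(), classify_refcnt() and
format_put_failure() are hypothetical, and the assumption is that a
counter sitting at zero means an extra put while a negative counter means
the count wrapped past INT_MAX because of missing puts.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical classification of a failed put, based only on the sign. */
enum put_failure {
	PUT_UNDERFLOW,	/* counter is 0: more puts than gets */
	PUT_OVERFLOW,	/* counter is negative: gets wrapped past INT_MAX */
	PUT_RACED,	/* counter is positive again: someone raced with us */
};

/*
 * Userspace stand-in for the decrement in module_put(): drop one
 * reference only if the counter is currently positive. Returning false
 * corresponds to the ret < 0 case that triggers the WARN in the patch.
 */
static bool put_ref(atomic_int *refcnt)
{
	int old = atomic_load(refcnt);

	do {
		if (old <= 0)
			return false;
	} while (!atomic_compare_exchange_weak(refcnt, &old, old - 1));

	return true;
}

/* The value may be stale by the time it is read; only its sign is used. */
static enum put_failure classify_refcnt(int seen)
{
	if (seen == 0)
		return PUT_UNDERFLOW;
	if (seen < 0)
		return PUT_OVERFLOW;
	return PUT_RACED;
}

/* Build a message like the one a WARN with the refcnt added might print. */
static int format_put_failure(char *buf, size_t len, const char *name,
			      atomic_int *refcnt)
{
	static const char *const why[] = {
		[PUT_UNDERFLOW] = "underflow",
		[PUT_OVERFLOW] = "overflow",
		[PUT_RACED] = "raced",
	};
	int seen = atomic_load(refcnt);

	return snprintf(buf, len,
			"Failed to put refcount for module %s (refcnt %d, %s)",
			name, seen, why[classify_refcnt(seen)]);
}
```

Calling put_ref() on a counter that is 0, and on one that was pushed one
past INT_MAX with atomic_fetch_add(), fails in both cases, but the value
read afterwards has a different sign, which is all the warning would need
to show.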
--
Michal Hocko
SUSE Labs