Re: WARNING: syz-executor still has locks held!

From: Oleg Nesterov
Date: Wed Mar 20 2019 - 13:30:49 EST


On 03/20, Michal Hocko wrote:
>
> [Cc Ingo and Chanho Min - the thread starts here
> http://lkml.kernel.org/r/0000000000004cdec6058485b2ce@xxxxxxxxxx]
>
> On Wed 20-03-19 16:00:54, Oleg Nesterov wrote:
> > On 03/20, Michal Hocko wrote:
> > >
> > > On Wed 20-03-19 14:24:11, Oleg Nesterov wrote:
> > > > On 03/20, Michal Hocko wrote:
> > > > >
> > > > > Yes we do hold the cgred mutex while calling freezable_schedule but why
> > > > > are we getting a warning is not really clear to me. The task should be
> > > > > hidden from the freezer so why do we warn at all?
> > > >
> > > > try_to_freeze() calls debug_check_no_locks_held() and this makes sense.
> > >
> > > Yes it does. But it already ignores PF_NOFREEZE tasks and I fail to see
> > > why is PF_FREEZER_SKIP any different.
> >
> > But they differ. PF_NOFREEZE is a "sticky" flag for kthreads. Set by default,
> > cleared by set_freezable() if you want a freezable kthread.
> >
> > PF_FREEZER_SKIP means that a sleeping freezable task will call try_to_freeze()
> > right after schedule() returns, so try_to_freeze_tasks() can safely count it as
> > "already frozen".
>
> But the fundamental semantic is the same right? Both might be sitting on
> locks that might interfere with other tasks and we should be _extra_
> careful when using them. In an ideal world, none of them is really
> needed.

Ah, it seems that we misunderstood each other... see below.

> So my question remains. Can we drop the warning for PF_FREEZER_SKIP
> tasks as well?

But why? It is obviously wrong to call try_to_freeze() with a lock held.

Probably you meant the

	if (!(current->flags & PF_NOFREEZE))

check in try_to_freeze() when you said "already ignores PF_NOFREEZE tasks".

I am not sure we actually need this check; a PF_NOFREEZE kthread shouldn't
call try_to_freeze(), at least not directly.

However, note that freezing() will return false if PF_NOFREEZE is set, so
try_to_freeze() is a nop in this case. Probably this is why PF_NOFREEZE is
also checked before debug_check_no_locks_held().
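
To make the difference concrete, here is a rough userspace model of that
logic. It is only a sketch, not the kernel source: struct task, the flag
values, the lock_warnings counter, the system_freezing variable and
sketch_self_check() are all made up for illustration, and the real
try_to_freeze() sleeps in the refrigerator rather than setting a field.
What it is meant to show is that PF_NOFREEZE skips both the lock check and
the freeze, while PF_FREEZER_SKIP skips neither.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values only; these are NOT the kernel's definitions. */
#define PF_NOFREEZE      0x1u
#define PF_FREEZER_SKIP  0x2u

/* Hypothetical stand-in for the few task_struct fields the model needs. */
struct task {
	unsigned int flags;
	int lockdep_depth;   /* number of locks the task currently holds */
	int lock_warnings;   /* how often the "locks held" check fired */
	bool frozen;
};

/* Assumption: a single global models "the system is freezing tasks". */
static bool system_freezing;

/* A PF_NOFREEZE task is never asked to freeze. */
static bool freezing(const struct task *t)
{
	if (t->flags & PF_NOFREEZE)
		return false;
	return system_freezing;
}

/* Models the debug check: count a warning if any lock is held. */
static void debug_check_no_locks_held(struct task *t)
{
	if (t->lockdep_depth > 0)
		t->lock_warnings++;
}

/*
 * Models try_to_freeze(): the lock check is skipped only for PF_NOFREEZE,
 * which is exactly the case where the function does nothing anyway.
 * PF_FREEZER_SKIP is not consulted here at all.
 */
static bool try_to_freeze(struct task *t)
{
	if (!(t->flags & PF_NOFREEZE))
		debug_check_no_locks_held(t);
	if (!freezing(t))
		return false;
	t->frozen = true;    /* the real code would sleep here */
	return true;
}

/* Small self-check exercising both kinds of task with one lock held. */
static void sketch_self_check(void)
{
	struct task kthread = { .flags = PF_NOFREEZE, .lockdep_depth = 1 };
	struct task user = { .flags = PF_FREEZER_SKIP, .lockdep_depth = 1 };

	system_freezing = true;

	/* PF_NOFREEZE: nop, and no warning despite the held lock. */
	assert(!try_to_freeze(&kthread));
	assert(kthread.lock_warnings == 0);

	/* PF_FREEZER_SKIP: really freezes, so the held lock is reported. */
	assert(try_to_freeze(&user));
	assert(user.lock_warnings == 1);
}
```

In this model the PF_FREEZER_SKIP task goes into the "refrigerator" with the
lock still held, which is the situation the warning exists to catch.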

> > > as removing the cgred is way way too complicated.
> >
> > We need to do this anyway, this leads to other more serious problems...
>
> Yes but this is far away and it doesn't really seem like a stable tree
> material

strace -f can hang ;) so this is stable material.

Oleg.