Re: [PATCH 2/3] mm, notifier: Catch sleeping/blocking for !blockable

From: Daniel Vetter
Date: Fri Nov 23 2018 - 07:38:46 EST


On Fri, Nov 23, 2018 at 12:12:37PM +0100, Michal Hocko wrote:
> On Thu 22-11-18 17:51:05, Daniel Vetter wrote:
> > We need to make sure implementations don't cheat and don't have a
> > possible schedule/blocking point deeply burried where review can't
> > catch it.
> >
> > I'm not sure whether this is the best way to make sure all the
> > might_sleep() callsites trigger, and it's a bit ugly in the code flow.
> > But it gets the job done.
>
> Yeah, it is quite ugly. Especially because it makes DEBUG config
> bahavior much different. So is this really worth it? Has this already
> discovered any existing bug?

Given that we need an oom trigger to hit this we're not hitting this in CI
(oom is just way to unpredictable to even try). I'd kinda like to also add
some debug interface so I can provoke an oom kill of a specially prepared
process, to make sure we can reliably exercise this path without killing
the kernel accidentally. We do similar tricks for our shrinker already.

There's been patches floating with this kind of bug I think, and the call
chains we're dealing with a fairly deep. I don't trust review to reliably
catch this kind of fail, that's why I'm looking into tools to better
validat this stuff to augment review.

And yes it's ugly :-/

Wrt the behavior difference: I guess we could put another counter into the
task struct, and change might_sleep() to check it. All under
CONFIG_DEBUG_ATOMIC_SLEEP only ofc. That would avoid the preempt-disable
sideeffect. My worry with that is that people will spot it, and abuse it
in creative ways that do affect semantics. See horrors like
drm_can_sleep() (and I'm sure gfx folks are not the only ones who
seriously lacked taste here).

Up to the experts really how to best paint this shed I think.

Thanks, Daniel

>
> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > Cc: Michal Hocko <mhocko@xxxxxxxx>
> > Cc: David Rientjes <rientjes@xxxxxxxxxx>
> > Cc: "Christian König" <christian.koenig@xxxxxxx>
> > Cc: Daniel Vetter <daniel.vetter@xxxxxxxx>
> > Cc: "Jérôme Glisse" <jglisse@xxxxxxxxxx>
> > Cc: linux-mm@xxxxxxxxx
> > Signed-off-by: Daniel Vetter <daniel.vetter@xxxxxxxxx>
> > ---
> > mm/mmu_notifier.c | 8 +++++++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> > index 59e102589a25..4d282cfb296e 100644
> > --- a/mm/mmu_notifier.c
> > +++ b/mm/mmu_notifier.c
> > @@ -185,7 +185,13 @@ int __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
> > id = srcu_read_lock(&srcu);
> > hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
> > if (mn->ops->invalidate_range_start) {
> > - int _ret = mn->ops->invalidate_range_start(mn, mm, start, end, blockable);
> > + int _ret;
> > +
> > + if (IS_ENABLED(CONFIG_DEBUG_ATOMIC_SLEEP) && !blockable)
> > + preempt_disable();
> > + _ret = mn->ops->invalidate_range_start(mn, mm, start, end, blockable);
> > + if (IS_ENABLED(CONFIG_DEBUG_ATOMIC_SLEEP) && !blockable)
> > + preempt_enable();
> > if (_ret) {
> > pr_info("%pS callback failed with %d in %sblockable context.\n",
> > mn->ops->invalidate_range_start, _ret,
> > --
> > 2.19.1
> >
>
> --
> Michal Hocko
> SUSE Labs

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch