Re: [RFC] rcu: use killable versions of swait

From: Paul E. McKenney
Date: Thu Jun 15 2017 - 12:22:13 EST


On Thu, Jun 15, 2017 at 05:50:39PM +0200, Luis R. Rodriguez wrote:
> On Wed, Jun 14, 2017 at 04:43:45PM -0700, Paul E. McKenney wrote:
> > On Wed, Jun 14, 2017 at 04:06:39PM -0700, Luis R. Rodriguez wrote:
> > > These waits don't even check for the return value for interruption,
> > > using the non-killable variants means we could be killed by other
> > > signals than SIGKILL, this is fragile.
> >
> > A number of people asserted that kthreads could never catch signals.
> > I checked this at the time, and it seemed like this was the case, and
> > the call to ignore_signals() seems to confirm this.
> >
> > So it looks to me like RCU doesn't care about this change (give or
> > take any affect on the load average, anyway), but there is much that I
> > don't know about Linux-kernel signal handling, so please feel free to
> > educate me.
>
> Thanks, I had seen the kthread but figured best to ask, just got into parnaoia
> mode. If we were to do a sanity check for usage we'd then have to white list
> when kthreads are used, however since we don't care to be interrupted why not
> use a wait which is also explicit about our current uninterruptible state?

I do appreciate any and all inspection, actually, so thank you!

I used to have it uninterruptable, but got complaints on the effect on
the load average -- sysadms didn't like the "D" state and the fact that
the load average was always greater than 2 (or 3 on PREEMPT=y kernels)
when the system was completely idle.

> > > Signed-off-by: Luis R. Rodriguez <mcgrof@xxxxxxxxxx>
> > > ---
> > >
> > > The killable swaits were just posted [1] as part of a series where SIGCHLD
> > > was detected as interrupting and killing kernel calls waiting using
> > > non-killable swaits [1]. The fragility here made curious about other callers
> > > and seeing if they really meant to use such broad wait which captures a lot
> > > of signals.
> > >
> > > I can't see why we'd want to have these killed by other signals, specialy
> > > since it seems we don't even check for the return value... Granted to abort
> > > properly we'd have to check for the return value for -ERESTARTSYS, but yeah,
> > > none of this is done, so it would seem we don't want fragile signals
> > > interrupting these ?
> >
> > The later WARN_ON(signal_pending(current)) complains if a signal somehow
> > makes it to this task. Assuming that the signal is nonfatal, anyway.
>
> I see, how about just using swait_event_timeout() and removing the WARN_ON()?
> Is there a reason for having the interruptible ?

If sleeping-uninterruptible kthreads are now excluded from the load average,
no reason. But if sleeping-uninterruptible kthreads are still included in
the load average, it must stay interruptible.

> > > Also can someone confirm if the original change of to swait_event_timeout()
> > > from wait_event_interruptible_timeout() was actually intentional on
> > > synchronize_sched_expedited_wait() on commit abedf8e2419fb ("rcu: Use simple
> > > wait queues where possible in rcutree") ? I can't easily confirm.
> >
> > This is also called from a workqueue (at least once the core_initcall()s
> > get done), which again should mean no signals. A WARN_ON(ret < 0)
> > complains if this ever changes. So it should be OK that this is
> > swait_event_timeout(). And expedited grace periods are supposed to
> > get done quickly, so effect on the load average should be negligible.
>
> Great thanks.
>
> > Or am I missing something?
>
> No, just got into parnoia mode and better to ask than be sorry later!

Indeed, better safe that sorry!

Thanx, Paul