Re: WARNING in timer_wait_running

From: Marco Elver
Date: Tue Apr 11 2023 - 10:32:25 EST


On Fri, 7 Apr 2023 at 21:27, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> On Fri, Apr 07 2023 at 20:36, Frederic Weisbecker wrote:
>
> > On Fri, Apr 07, 2023 at 07:47:40PM +0200, Thomas Gleixner wrote:
> >> On Fri, Apr 07 2023 at 13:50, Frederic Weisbecker wrote:
> >> > On Fri, Apr 07, 2023 at 10:44:22AM +0200, Thomas Gleixner wrote:
> >> >> Now memory came back. The problem with posix CPU timers is that it is
> >> >> not really known to the other side which task is actually doing the
> >> >> expiry. For process wide timers this could be any task in the process.
> >> >>
> >> >> For hrtimers this works because the expiring context is known.
> >> >
> >> > So if posix_cpu_timer_del() were to clear ctmr->pid to NULL and then
> >> > delay put_pid() with RCU, we could retrieve that information without
> >> > holding the timer lock (with appropriate RCU accesses all around).
> >>
> >> No, you can't. This only gives you the process, but the expiry might run
> >> on any task of that. To make that work you need a mutex in sighand.
> >
> > Duh right missed that. Ok will try.
>
> But that's nasty as well as this can race against exec/exit and you can't
> hold sighand lock when acquiring the mutex.
>
> The mutex needs to be per task and held by the task which runs the
> expiry task work.
>
> Something like the completely untested below. You get the idea though.

I threw the existing reproducer at this, and it seems happy enough -
no warnings and no lockdep splats either.

Thanks,
-- Marco