Re: [Query] Preemption (hogging) of the work handler

From: Viresh Kumar
Date: Tue Jul 12 2016 - 09:12:12 EST


+Rafael and linux-pm to this thread :)

On 12-07-16, 14:52, Petr Mladek wrote:
> On Tue 2016-07-12 18:38:05, Sergey Senozhatsky wrote:
> > Hello,
> >
> > On (07/11/16 15:35), Viresh Kumar wrote:
> > [..]
> > > Sometimes, the platform doesn't come back after suspend. I have tried
> > > enabling no-console-suspend and the last line it prints is:
> > >
> > > Disabling non-boot CPUs
>
> I guess that the printk() kthread is no longer scheduled when there
> is only one CPU left.

Yeah, so I tried debugging this more and I am able to get prints out
up to just before arch_suspend_disable_irqs() in suspend.c, and then
they stop because of the async nature of printk.

I get to this point both in the successful suspend/resume case (where
the system resumes fine) and in the bad case (where the system just
hangs/crashes).

FWIW, I also tried commenting out the following in suspend_enter():

error = suspend_ops->enter(state);

so that the system doesn't go into suspend at all and just resumes
immediately (similar to TEST_CORE), and I still saw the hang/crash on
one of the runs.
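
For reference, a roughly similar "don't really suspend" cycle can be
driven from userspace without patching suspend_enter(). This is only a
sketch, assuming a kernel built with CONFIG_PM_DEBUG that exposes
/sys/power/pm_test; the helper names (run, pm_core_test) and the DRY_RUN
switch are made up for illustration, and the "core" test level is not
guaranteed to match the hack above step for step.

```shell
#!/bin/sh
# Sketch: one suspend cycle that stops at the "core" pm_test level.
# Assumptions: CONFIG_PM_DEBUG=y, root privileges, "mem" is supported.
# DRY_RUN=1 (the default) only prints the commands, so this is safe to
# run anywhere; set DRY_RUN=0 on the target board to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        sh -c "$*"
    fi
}

# Hypothetical helper: select the test level, trigger suspend, restore.
pm_core_test() {
    run "echo core > /sys/power/pm_test"   # stop short of really suspending
    run "echo mem > /sys/power/state"      # start the suspend sequence
    run "echo none > /sys/power/pm_test"   # back to normal suspend behaviour
}

pm_core_test
```

Booting with no_console_suspend on the kernel command line keeps the
console alive during such a run, as in the tests described above.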

> We might try to explicitly flush the consoles in suspend_console().

That wouldn't happen in my case, as I have disabled console suspend
(no_console_suspend).

> But I am not sure if we always want to do so because it might take
> a while. Also it need not help if someone already owns the
> console_sem. Note that console_unlock() calls cond_resched()
> when in a safe context.
>
> Well, we might do the best effort when no_console_suspend is enabled.

Hmm.. I have no explanation yet for why the system comes to a complete
stop and only a forced reboot gets it working again :(

--
viresh