Re: Suspend resume problem (WAS Re: [ANNOUNCE] 3.8.10-rt6)

From: Sebastian Andrzej Siewior
Date: Fri May 03 2013 - 05:59:41 EST


On 04/30/2013 08:08 PM, Steven Rostedt wrote:
>> This NMI related deadlock is a problem which should also trigger in
>> mainline, right?
>
> Well, yeah, as sending out a NMI stack dump is sorta the last resort,
> and it is dangerous to do printks from NMI context.

So we were already in a bad state and we upgrade it to bad and dangerous.

>>
>> Now, the time jump on the other hand is the real issue here and is
>> RT-only. It looks like we get a big number of timer updates via
>> tick_do_update_jiffies64() because according to ktime_get() that much
>> time really passed by.
>
> As the NMI dump only happens because of the time jump, which as you
> said, is -rt only, I wouldn't say that the NMI deadlock is a mainline
> bug.

The reason for the NMI was a bug in the -RT tree, but if something else
triggers that NMI, we have a good chance of deadlocking.

What about using a try_lock() in the in_nmi() case and bailing out after
50 usecs of trying without getting the lock?
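
To illustrate the idea, here is a rough userspace model, not kernel code
and not a patch. Every name in it (demo_lock, demo_trylock(),
demo_lock_bounded(), demo_dump()) is made up for the example, the
in_nmi_ctx flag stands in for in_nmi(), and the time source is
clock_gettime(), which a real NMI handler could not use. It only shows the
control flow: in the NMI case, spin on a trylock for a bounded time and
give up instead of deadlocking.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the lock taken by the dump path. */
struct demo_lock {
	atomic_bool locked;
};

/* Single acquisition attempt; never blocks. */
static bool demo_trylock(struct demo_lock *l)
{
	bool expected = false;

	return atomic_compare_exchange_strong_explicit(&l->locked, &expected,
			true, memory_order_acquire, memory_order_relaxed);
}

static void demo_unlock(struct demo_lock *l)
{
	/* Releasing a lock that is not held is a caller bug. */
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
	atomic_store_explicit(&l->locked, false, memory_order_release);
}

/* Monotonic time in nanoseconds (userspace assumption). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Retry the trylock until it succeeds or timeout_us has elapsed. */
static bool demo_lock_bounded(struct demo_lock *l, unsigned int timeout_us)
{
	uint64_t deadline = now_ns() + (uint64_t)timeout_us * 1000ull;

	do {
		if (demo_trylock(l))
			return true;
	} while (now_ns() < deadline);

	return false;
}

/*
 * Model of the dump path. Returns false if the dump was skipped because
 * the lock could not be taken within 50 usecs in the NMI case.
 */
static bool demo_dump(struct demo_lock *l, bool in_nmi_ctx)
{
	if (in_nmi_ctx) {
		if (!demo_lock_bounded(l, 50))
			return false;	/* give up instead of deadlocking */
	} else {
		while (!demo_trylock(l))
			;		/* normal context: spin until acquired */
	}

	/* ... the actual output would happen here ... */

	demo_unlock(l);
	return true;
}
```

The price is that the dump may be lost if the lock holder is the CPU that
got interrupted by the NMI, but losing the output is probably better than
hanging the box.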

> -- Steve

Sebastian