Re: [PATCH v2] PM / Sleep: Timer quiesce in freeze state

From: Li, Aubrey
Date: Thu Nov 13 2014 - 05:52:23 EST


On 2014/11/13 17:19, Thomas Gleixner wrote:
> On Thu, 13 Nov 2014, Li, Aubrey wrote:
>> On 2014/11/13 9:37, Peter Zijlstra wrote:
>>> On Wed, Nov 12, 2014 at 10:09:47PM +0100, Thomas Gleixner wrote:
>>>> On Thu, 30 Oct 2014, Li, Aubrey wrote:
>>>>
>>>>> Freeze is a general power saving state in which processes are frozen,
>>>>> devices are suspended and CPUs are in idle state. However, when the
>>>>> system enters freeze state, a few timers keep ticking and hence consume
>>>>> power unnecessarily. The observed timer events in freeze state are:
>>>>> - tick_sched_timer
>>>>> - watchdog lockup detector
>>>>> - realtime scheduler period timer
>>>>>
>>>>> The system power consumption in freeze state will be reduced significantly
>>>>> if we quiesce these timers.
>>>>
>>>> So the obvious question is why don't we quiesce these timers by telling
>>>> the subsystems which manage these timers to shut them down?
>>>>
>>>> I really want a proper answer for this in the first place, but let me
>>>> look at the proposed "solution" as well.
>>>
>>> Two arguments here:
>>>
>>> 1) the current suspend modes don't care, so if this suspend mode starts
>>> to care, it's likely to 'break' in the future simply because people
>>> never cared about timers.
>>>
>>> 2) there could be userland timers, userland is frozen but they'll still
>>> have their timers set and those can and will fire.
>>>
>>> But sure, we can add suspend notifiers to stuff to shut down timers; I
>>> should have a patch for at least one of the offenders somewhere. But I
>>> really think that we should not be looking at the individual timers for
>>> this, none of the other suspend modes care about active timers.
>>>
>>>> But before we do that we want a proper explanation why the interrupt
>>>> fires at all. The lack of explanation clearly documents that this is a
>>>> 'hacked it into submission' approach.
>>>
>>> From what I remember it's the waking interrupt that ends up in the
>>> timekeeping code, Li should have a backtrace somewhere.
>>
>> There are two race conditions:
>>
>> The first one occurs after interrupts are disabled and before we
>> suspend the lapic. If the apic timer interrupt fires in this window, it
>> stays pending because interrupts are disabled. We then suspend
>> timekeeping, enter idle, and exit idle with interrupts re-enabled, so
>> the pending timer interrupt is handled while timekeeping is suspended.
>>
>> The other occurs after timekeeping_suspended = 1 and before we suspend
>> the lapic. If the apic timer interrupt fires in this window, the timer
>> interrupt handler runs while timekeeping is suspended, as above.
>
> And that race exists for every implementation and is not at all apic
> timer specific. So we fix it at the core and not at some random place
> in the architecture code.
>
You're right, will refine this in the next patch version.
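
To make the two windows concrete, here is a rough user-space sketch of the
ordering. It is not kernel code: every sim_* name is made up for
illustration, and it assumes that interrupts stay disabled across both
windows and that shutting the timer down does not clear an interrupt that
is already latched. Nothing in it is lapic specific, which is the point
made above.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU; none of these names are kernel APIs. */
struct sim_cpu {
    bool irqs_enabled;            /* local interrupt flag */
    bool timer_armed;             /* per-CPU timer hardware still running */
    bool timer_irq_pending;       /* interrupt latched but not yet delivered */
    bool timekeeping_suspended;   /* mirrors timekeeping_suspended = 1 */
    int  handled_while_suspended; /* handler runs seen with timekeeping off */
};

enum sim_window {
    SIM_EXPIRE_NONE,              /* timer never fires during freeze entry */
    SIM_EXPIRE_BEFORE_TK_SUSPEND, /* first race window */
    SIM_EXPIRE_AFTER_TK_SUSPEND   /* second race window */
};

/* Timer interrupt handler: record when it runs with timekeeping suspended. */
static void sim_timer_handler(struct sim_cpu *c)
{
    if (c->timekeeping_suspended)
        c->handled_while_suspended++;
}

/* The timer expires: deliver immediately, or latch if interrupts are off. */
static void sim_timer_expires(struct sim_cpu *c)
{
    if (!c->timer_armed)
        return;
    if (c->irqs_enabled)
        sim_timer_handler(c);
    else
        c->timer_irq_pending = true;
}

static void sim_irq_disable(struct sim_cpu *c)
{
    c->irqs_enabled = false;
}

/* Re-enabling interrupts delivers anything that was latched meanwhile. */
static void sim_irq_enable(struct sim_cpu *c)
{
    c->irqs_enabled = true;
    if (c->timer_irq_pending) {
        c->timer_irq_pending = false;
        sim_timer_handler(c);
    }
}

/* Assumption: stopping the timer does not clear an already-latched irq. */
static void sim_timer_shutdown(struct sim_cpu *c)
{
    c->timer_armed = false;
}

/* Walk the freeze entry ordering; return handler runs while suspended. */
static int sim_freeze_entry(enum sim_window w)
{
    struct sim_cpu c = { .irqs_enabled = true, .timer_armed = true };

    sim_irq_disable(&c);
    if (w == SIM_EXPIRE_BEFORE_TK_SUSPEND)
        sim_timer_expires(&c);        /* first window */
    c.timekeeping_suspended = true;
    if (w == SIM_EXPIRE_AFTER_TK_SUSPEND)
        sim_timer_expires(&c);        /* second window */
    sim_timer_shutdown(&c);           /* too late for a latched irq */
    sim_irq_enable(&c);               /* idle exit with irqs re-enabled */
    return c.handled_while_suspended;
}
```

In this model sim_freeze_entry() reports one handler run with timekeeping
suspended for either window, and none when the timer does not fire during
entry.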

Thanks,
-Aubrey