Re: [tip:timers/urgent] tick: Fix oneshot broadcast setup really

From: Suresh Siddha
Date: Wed Apr 18 2012 - 22:25:17 EST


On Wed, 2012-04-18 at 17:31 +0200, Thomas Gleixner wrote:
> On Wed, 18 Apr 2012, Santosh Shilimkar wrote:
> > >> if (was_periodic && !cpumask_empty(to_cpumask(tmpmask))) {
> > >> + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
> >
> > For some reason above if() check fails in my case, so broadcast device
> > never set to ONESHOT mode. That explains the problem I am seeing on
> > OMAP with the $subject patch. At this point of time bc->mode is
> > CLOCK_EVT_MODE_UNUSED.
>
> Darn, crap. I wonder how that works on x86
>
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index bf57abd..e8f5479 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -373,6 +373,9 @@ static int tick_broadcast_set_event(ktime_t expires, int force)
> {
> struct clock_event_device *bc = tick_broadcast_device.evtdev;
>
> + if (bc->mode != CLOCK_EVT_MODE_ONESHOT)
> + clockevents_set_mode(bc, CLOCK_EVT_MODE_ONESHOT);
> +
> return clockevents_program_event(bc, expires, force);
> }
>

So Here is my understanding of the issue why only Sven saw the resume
issue and what Santosh has seen with the first fix that Thomas tried.

>From the review, most likely Sven's system we are force enabling the
hpet using the pci quirk's method very late. And in this case,
hpet_clockevent (which will be global_clock_event) handler can be null,
specifically as this platform might not be using deeper c-states and
using the reliable APIC timer.

Prior to commit 'fa4da365bc7772c', that handler will be set to
'tick_handle_oneshot_broadcast' when we switch the broadcast timer to
oneshot mode, even though we don't use it. Post commit
'fa4da365bc7772c', we stopped switching the broadcast mode to oneshot as
this is not really needed and his platform's global_clock_event's
handler will remain null. While on my SNB laptop, same is set to
'clockevents_handle_noop' because hpet gets enabled very early. (noop
handler on my platform set when the early enabled hpet timer gets
replaced by the lapic timer).

But the commit 'fa4da365bc7772c' tracked the broadcast timer mode in the
SW as oneshot, even though it didn't touch the HW timer. During resume
however, tick_resume_broadcast() saw the SW broadcast mode as oneshot
and actually programmed the broadcast device also into oneshot mode. So
this triggered the null pointer de-reference after the hpet wraps around
and depending on what the hpet counter is set to. On the normal
platforms where hpet gets enabled early we should be seeing a spurious
interrupt (in my SNB laptop I see one spurious interrupt after around 5
minutes ;) which is 32-bit hpet counter wraparound time).

So thomas even with your current proposed fix, we should address this
spurious interrupt once in 5 minutes after resume!

And now coming to the Santosh's bc mode in not used mode, Thomas in
tick_check_broadcast_device() we do this.

if (!cpumask_empty(tick_get_broadcast_mask()))
tick_broadcast_start_periodic(dev);

Typically during boot on a regular platform, broadcast mask is NULL,
resulting in the bc timer mode as 'CLOCK_EVT_MODE_UNUSED'. This is the
case even on regular x86.

In essence, feel free to add my
"Acked-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>" to your updated
fix.

Also, please review and consider the appended patch which addresses the
spurious timer interrupt every 5 minutes up on resume in the platforms
where broadcast timer is really not used. Thanks.
---
From: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
Subject: tick: Fix the spurious broadcast timer ticks

During resume, tick_resume_broadcast() programs the broadcast timer
in oneshot mode unconditionally. On the platforms where broadcast timer
is not really required, this will generate spurious broadcast timer ticks
upon resume. For example, on the always running apic timer platforms with
HPET, I see spurious hpet tick once every ~5minutes (which is the 32-bit
hpet counter wraparound time).

Similar to boot time, during resume make the oneshot mode setting of
the broadcast clock event device conditional on the state of active broadcast
users.

Signed-off-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
---
kernel/time/tick-broadcast.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index bf57abd..766cd82 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -346,7 +346,8 @@ int tick_resume_broadcast(void)
tick_get_broadcast_mask());
break;
case TICKDEV_MODE_ONESHOT:
- broadcast = tick_resume_broadcast_oneshot(bc);
+ if (!cpumask_empty(tick_get_broadcast_mask()))
+ broadcast = tick_resume_broadcast_oneshot(bc);
break;
}
}


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/