Re: kerneloops.org report for the week of June 14 2009

From: Ingo Molnar
Date: Tue Jun 23 2009 - 07:55:47 EST



* Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> On Sun, 14 Jun 2009, Arjan van de Ven wrote:
> > Rank 3: getnstimeofday (warning)
> > Reported 309 times (2446 total reports)
> > [suspend resume] getnstimeofday() is called before timekeeping is
> > resumed
>
> > Rank 6: hres_timers_resume (warning)
> > Reported 188 times (1024 total reports)
> > [suspend resume] hres_timers_resume() is incorrectly called with
> > interrupts on
>
> Both have the same root cause. Something enables interrupts in the
> early resume path. IIRC, there was a culprit identified recently.
> Rafael ?

This can be debugged automatically today with lockdep, by using a
'helper lock':

static DEFINE_PER_CPU(struct lockdep_map, helper_lock);

Then mark the lock irq-safe by doing something like:

static void mark_lock_irqsafe(void)
{
	unsigned long flags;
	int cpu;

	local_irq_save(flags);
	irq_enter();	/* takes no argument; fakes hardirq context */

	/* Taking each lock once in hardirq context marks it irq-safe. */
	for_each_online_cpu(cpu) {
		lock_acquire(&per_cpu(helper_lock, cpu), 0, 0, 0, 0, NULL, 0);
		/* lock_release() takes (lock, nested, ip) */
		lock_release(&per_cpu(helper_lock, cpu), 0, 0);
	}

	irq_exit();
	local_irq_restore(flags);
}

Then, in the resume path, at the point where it disables irqs, you
can disallow irq-enable via:

local_irq_disable();
lock_acquire(&__get_cpu_var(helper_lock), 0, 0, 0, 0, NULL, 0);
...
<extensive suspend or resume codepaths, callbacks>
...
lock_release(&__get_cpu_var(helper_lock), 0, 0);
local_irq_enable();

And lockdep will warn if any function in between enables IRQs, by
emitting a splat about incorrectly enabled hardirqs. It will point
at the specific place and emit a relevant backtrace, not just at
the handler in general.
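
To make the rule concrete, here is a small userspace model of the
check. It is not lockdep and shares no code with it: every name in
it (model_lock, model_irq_enable(), run_model(), ...) is made up
for illustration. It only mimics the one rule relied on above: once
a lock has been taken in hardirq context, enabling irqs while that
lock is held gets reported, together with the offending caller.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy userspace model, NOT the kernel's lockdep. All names are hypothetical. */

struct model_lock {
	const char *name;
	bool used_in_hardirq;	/* was once acquired in hardirq context */
};

#define MAX_HELD 8

static struct model_lock *held_locks[MAX_HELD];
static int nr_held;
static bool irqs_enabled = true;
static bool in_hardirq;
static int violations;

static void model_irq_enter(void) { in_hardirq = true; }
static void model_irq_exit(void) { in_hardirq = false; }
static void model_irq_disable(void) { irqs_enabled = false; }

static void model_lock_acquire(struct model_lock *l)
{
	if (in_hardirq)
		l->used_in_hardirq = true;	/* now "irq-safe" */
	if (nr_held < MAX_HELD)
		held_locks[nr_held++] = l;
}

static void model_lock_release(struct model_lock *l)
{
	for (int i = 0; i < nr_held; i++) {
		if (held_locks[i] == l) {
			held_locks[i] = held_locks[--nr_held];
			return;
		}
	}
}

/* Report the caller if irqs get enabled while an irq-safe lock is held. */
static void model_irq_enable(const char *caller)
{
	if (irqs_enabled)
		return;		/* already on, nothing changes */

	for (int i = 0; i < nr_held; i++) {
		if (held_locks[i]->used_in_hardirq) {
			violations++;
			fprintf(stderr, "model: %s() enabled irqs while holding %s\n",
				caller, held_locks[i]->name);
		}
	}
	irqs_enabled = true;
}

static struct model_lock helper_lock = { .name = "helper_lock" };

/* Stand-in for a resume callback that wrongly turns irqs back on. */
static void buggy_resume_callback(void)
{
	model_irq_enable(__func__);
	model_irq_disable();
}

/* Runs both steps and returns the number of reported violations. */
int run_model(void)
{
	/* Step 1: take the helper lock once in (fake) hardirq context. */
	model_irq_disable();
	model_irq_enter();
	model_lock_acquire(&helper_lock);
	model_lock_release(&helper_lock);
	model_irq_exit();
	model_irq_enable(__func__);

	/* Step 2: hold it across the irqs-off resume section. */
	model_irq_disable();
	model_lock_acquire(&helper_lock);
	buggy_resume_callback();
	model_lock_release(&helper_lock);
	model_irq_enable(__func__);

	return violations;
}
```

Calling run_model() from a main() reports exactly one violation,
and the message names buggy_resume_callback rather than the outer
resume path. That is the property that makes the helper-lock trick
useful. The two legitimate enables in run_model() stay silent
because no lock is held at those points.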

This should work just fine with current lockdep facilities.

Rafael?

Ingo