intel_pstate_timer_func divide by zero oops

From: Parag Warudkar
Date: Wed Mar 27 2013 - 21:49:28 EST


I get this same oops occasionally - the machine freezes and there doesn't
seem to be any record of the oops on disk.

I captured it on camera -
https://lh3.googleusercontent.com/-K0lNbJrZBMQ/UVOU1vv1vvI/AAAAAAAANqI/pY92mWm3caE/s800/20130327_205245.jpg

If I am reading this right, it dies on this instruction -

0xffffffff8145792d <+349>: divq 0x18(%rcx)

From the lst file that *seems* to be this inline function -

static inline void intel_pstate_calc_busy(struct cpudata *cpu,
                                          struct sample *sample)
{
        u64 core_pct;
        sample->pstate_pct_busy = 100 - div64_u64(
ffffffff8145791d: 48 8b 41 20 mov 0x20(%rcx),%rax
ffffffff81457921: 48 8d 04 80 lea (%rax,%rax,4),%rax
ffffffff81457925: 48 8d 04 80 lea (%rax,%rax,4),%rax
ffffffff81457929: 48 c1 e0 02 shl $0x2,%rax
ffffffff8145792d: 48 f7 71 18 divq 0x18(%rcx)


That is -
        sample->pstate_pct_busy = 100 - div64_u64(
                                        sample->idletime_us * 100,
                                        sample->duration_us);
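
As a sanity check on that mapping (my own sketch, not part of the lst
output): each lea multiplies by 5 and the shl multiplies by 4, so the value
loaded from 0x20(%rcx) is scaled by 100 before the divq. That would make
0x20(%rcx) sample->idletime_us and 0x18(%rcx) sample->duration_us, assuming
the compiler laid out the struct that way. The helper name below is made up
for illustration.

```c
#include <stdint.h>

/* Mirrors the three instructions before the divq, one step per line.
 * times_100_like_asm is a hypothetical name, not a kernel function. */
static uint64_t times_100_like_asm(uint64_t rax)
{
	rax = rax + rax * 4;	/* lea (%rax,%rax,4),%rax : rax *= 5 */
	rax = rax + rax * 4;	/* lea (%rax,%rax,4),%rax : rax *= 5 */
	rax <<= 2;		/* shl $0x2,%rax          : rax *= 4 */
	return rax;		/* net effect: 5 * 5 * 4 = 100 */
}
```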

So it looks like sample->duration_us is 0? If so, that implies that
ktime_us_delta(now, cpu->prev_sample) is zero. I am not entirely sure how
to handle this case - either return early if we are sampling too soon, or
there is some other bug making the delta calculation go poof.
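
A minimal userspace sketch of the first option, returning early when no
time has elapsed. The struct is a hypothetical stand-in with only the
fields mentioned above, calc_busy_guarded is an invented name, and plain
division stands in for div64_u64. This is not a tested kernel patch, and
it does nothing about whatever makes the delta zero in the first place.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's struct sample; only the fields
 * named in this mail are modeled. */
struct sample {
	uint64_t idletime_us;
	uint64_t duration_us;
	uint64_t pstate_pct_busy;
};

/* Returns 0 on success, -1 if the sample covers no elapsed time.
 * On -1 the sample is left untouched. */
static int calc_busy_guarded(struct sample *sample)
{
	if (sample->duration_us == 0)
		return -1;	/* sampled too early: skip instead of dividing */

	sample->pstate_pct_busy =
		100 - (sample->idletime_us * 100) / sample->duration_us;
	return 0;
}
```

The caller would then have to decide what to do with a skipped sample,
e.g. keep the previous busy value until the next timer tick.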


Thanks,

Parag