Re: [MIPS]clocks_calc_mult_shift() may gen a too big mult value

From: John Stultz
Date: Mon Oct 31 2011 - 14:13:06 EST


On Mon, 2011-10-31 at 21:59 +0800, zhangfx wrote:
> > Could you annotate clocks_calc_mult_shift() a little bit to see where
> > things might be going wrong?
> Let me give some real world data:
> in one machine with 500MHz freq,
> the calculated freq = 500084016, and clocks_calc_mult_shift() give
> mult = 4294245725
> shift = 30
>
> but in the 1785th call to update_wall_time, due to the error correction
> algorithm, mult becomes 4293964632.
> In the next update_wall_time, ntp_error is 0x301c93b7927c, which leads to
> an adj of 20, and mult then overflows:
> mult = 4293964632 + (1<<20) = 45912
> With this mult, anyone calling timekeeping_get_ns or other users of mult
> gets an extremely wrong notion of time, so some sleeps will
> (almost) never return => virtual hang
>
> We are not absolutely sure that the error source is normal, but anyway
> it is possible for the code to overflow, and that will cause a hang.
>
> For this case, the timekeeping_bigadjust should be able to control adj
> to a maximum of around 20 with the lookahead for any error. So if the
> mult is chosen at shift = 29, then mult becomes 4294245725/2, it will
> not be possible to be overflowed.
>
> In short, choosing a mult close to 2^32 is dangerous. But I don't know
> what's the best way to avoid it for general cases, because I don't know
> how big error can be and the adj can be for different systems.

Ah. Ok, sorry for being so slow to understand.

So yea, I think you're right that the issue seems to be that for certain
freq values, the resulting mult,shift pair is optimized a little too far,
and we don't leave enough room for ntp adjustments to mult without the
possibility of overflowing mult.
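
As a rough illustration of the wrap reported above, here is a small
userspace sketch (not kernel code). The helper name adjust_mult() is made
up for this example, and it assumes the adjustment is applied as
mult += 1 << adj in 32-bit unsigned arithmetic, as in the quoted numbers.

```c
#include <stdint.h>

/*
 * Hypothetical helper: apply an adjustment of 1 << adj to a 32-bit mult.
 * Unsigned arithmetic wraps modulo 2^32, which is the failure mode
 * described in the report. Assumes 0 <= adj < 32.
 */
static uint32_t adjust_mult(uint32_t mult, unsigned int adj)
{
	return mult + ((uint32_t)1 << adj);
}
```

With the reported values, adjust_mult(4293964632u, 20) wraps around and
returns 45912, while the same adjustment applied to the halved mult stays
well below 2^32.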

The following patch should handle it (although I'm at a conf right now,
so I couldn't test it), although I might be trying to be too smart here.
Rather than just checking if mult is larger than 0xf0000000, we try to
quantify the largest valid mult adjustment, and then make sure mult +
that value doesn't overflow. NTP limits adjustments to 500ppm, but the
kernel might have to deal with larger internal adjustments. Probably we
could be safe ruling out larger than 5% adjustments.

So then it's just a matter of 1/2^4 (5% is roughly 1/16). So the largest
mult adjustment should be 1 << (cs->shift - 4)
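
To make the check concrete, here is a userspace sketch of the same idea
using plain integers instead of struct clocksource. The function names
(max_adj, mult_may_wrap, add_safety_margin) are invented for this example,
and it assumes shift is at least 4 and below 36 so the shift is well
defined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest mult adjustment we assume: 1 << (shift - 4). Needs 4 <= shift < 36. */
static uint32_t max_adj(uint32_t shift)
{
	return (uint32_t)1 << (shift - 4);
}

/* True if mult +/- max_adj(shift) would wrap a 32-bit unsigned value. */
static bool mult_may_wrap(uint32_t mult, uint32_t shift)
{
	uint32_t maxadj = max_adj(shift);

	return (uint32_t)(mult + maxadj) < mult ||
	       (uint32_t)(mult - maxadj) > mult;
}

/* If there is not enough headroom, give up one bit of precision. */
static void add_safety_margin(uint32_t *mult, uint32_t *shift)
{
	if (mult_may_wrap(*mult, *shift)) {
		*mult >>= 1;
		(*shift)--;
	}
}
```

For the reported pair (mult = 4294245725, shift = 30) the check fires and
the pair becomes (2147122862, 29), which no longer wraps for an adjustment
of that size.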

Does this seem reasonable? Let me know if this seems to work for you.

Thomas: does this fix look like it's in a reasonable spot? I don't want
to clutter up the calc_mult_shift() code since this really only
applies to clocksources and not clockevents.

NOT TESTED & NOT FOR INCLUSION (YET)
Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx>
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index cf52fda..73518d2 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -640,7 +640,7 @@ static void clocksource_enqueue(struct clocksource *cs)
void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)
{
u64 sec;
-
+ u32 maxadj;
/*
* Calc the maximum number of seconds which we can run before
* wrapping around. For clocksources which have a mask > 32bit
@@ -661,6 +661,22 @@ void __clocksource_updatefreq_scale(struct clocksource *cs, u32 scale, u32 freq)

clocks_calc_mult_shift(&cs->mult, &cs->shift, freq,
NSEC_PER_SEC / scale, sec * scale);
+
+ /*
+ * Since mult may be adjusted by ntp, add an extra safety margin
+ * for clocksources that have large mults, to avoid overflow.
+ *
+ * Assume we won't try to correct for more than 5% adjustments
+ * (50,000 ppm), which approximates to 1/16 or 1/2^4.
+ * Thus 1 << (shift - 4) is the largest mult adjustment we'll
+ * support.
+ */
+ maxadj = 1 << (cs->shift-4);
+ if ((cs->mult + maxadj < cs->mult) || (cs->mult - maxadj > cs->mult)) {
+ cs->mult >>= 1;
+ cs->shift--;
+ }
+
cs->max_idle_ns = clocksource_max_deferment(cs);
}
EXPORT_SYMBOL_GPL(__clocksource_updatefreq_scale);

