Re: [PATCH] x86, tsc: Skip refined tsc calibration on systems with reliable TSC.

From: john stultz
Date: Tue Mar 06 2012 - 22:26:34 EST


On Tue, 2012-03-06 at 18:05 -0800, Alok Kataria wrote:
> Hi John,
>
> On Tue, 2012-03-06 at 17:32 -0800, john stultz wrote:
> > On Tue, 2012-02-21 at 18:19 -0800, Alok Kataria wrote:
> > > [Oops forgot to copy LKML, now it is, sorry for the duplicates]
> > >
> > > While running the latest Linux as guest under VMware in highly
> > > over-committed situations, we have seen cases when the refined TSC
> > > algorithm fails to get a valid tsc_start value in
> > > tsc_refine_calibration_work from multiple attempts. As a result the
> > > kernel keeps on scheduling the tsc_irqwork task for later. Subsequently
> > > after several attempts when it gets a valid start value it goes through
> > > the refined calibration and either bails out or uses the new results.
> > > Given that the kernel originally read the TSC frequency from the
> > > platform, which is the best it can get, I don't think there is much
> > > value in refining it.
> > >
> > > So IMO, for systems which get the TSC frequency from the platform we
> > > should skip the refined tsc algorithm.
> > >
> > > We can use the TSC_RELIABLE cpu cap flag to detect this; right now it is
> > > set only on VMware and on Moorestown Penwell, both of which have their
> > > own TSC calibration methods.
> >
> > So this looks ok to me, only one nit below...
> >
> > >
> > > Index: linux-2.6/arch/x86/kernel/tsc.c
> > > ===================================================================
> > > --- linux-2.6.orig/arch/x86/kernel/tsc.c 2012-02-21 17:31:01.000000000 -0800
> > > +++ linux-2.6/arch/x86/kernel/tsc.c 2012-02-21 17:39:05.000000000 -0800
> > > @@ -874,6 +874,13 @@ static void tsc_refine_calibration_work(
> > >  		goto out;
> > >
> > >  	/*
> > > +	 * Trust the results of the earlier calibration on systems
> > > +	 * exporting a reliable TSC.
> > > +	 */
> > > +	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
> > > +		goto out;
> > > +
> > > +	/*
> >
> > Instead of dropping out in the function called by the work-queue, why
> > not just avoid scheduling the work-queue to begin with?
> >
> > The FEATURE_TSC_RELIABLE isn't something that is set late, and needs the
> > delay, right?
>
> Right, but the reason I did it as part of work-queue was because on the
> "out" path it still registered the tsc clocksource for us, the patch you
> suggested doesn't do that. Please find below a patch on similar lines,
> which registers the clocksource on RELIABLE_TSC systems, instead of
> relying on irq_work to do that.

Ah, yes. Thanks for pointing that out!

I'll go ahead and queue your revision.
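
For anyone following along, the control flow under discussion can be
sketched roughly as below. This is a userspace model only, not the queued
patch: struct tsc_sim and the sim_* functions are hypothetical stand-ins,
with the tsc_reliable field playing the role of
boot_cpu_has(X86_FEATURE_TSC_RELIABLE) and the two result flags standing
in for registering the clocksource and scheduling tsc_irqwork.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the state involved; not kernel API. */
struct tsc_sim {
	bool tsc_reliable;           /* models boot_cpu_has(X86_FEATURE_TSC_RELIABLE) */
	bool clocksource_registered; /* models the tsc clocksource being registered */
	bool refine_work_scheduled;  /* models tsc_irqwork being scheduled */
};

/*
 * Models the init path: on reliable-TSC systems, trust the earlier
 * calibration and register the clocksource right away. Otherwise
 * defer registration to the refinement work.
 */
static void sim_init_tsc_clocksource(struct tsc_sim *s)
{
	if (s->tsc_reliable) {
		s->clocksource_registered = true;
		return;
	}
	s->refine_work_scheduled = true;
}

/*
 * Models the delayed work: whatever the outcome of the refinement
 * (elided here), its "out" path registers the clocksource.
 */
static void sim_refine_calibration_work(struct tsc_sim *s)
{
	assert(s->refine_work_scheduled); /* only runs if it was scheduled */
	s->refine_work_scheduled = false;
	s->clocksource_registered = true;
}
```

The point of the model is that the early return has to do the registration
itself, since the work that would otherwise do it on its "out" path never
runs.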

thanks
-john

