Re: [patch 0/2] tsc/adjust: Cure suspend/resume issues and prevent TSC deadline timer irq storm

From: Thomas Gleixner
Date: Tue Dec 13 2016 - 11:50:29 EST


On Tue, 13 Dec 2016, Roland Scheidegger wrote:

> Am 13.12.2016 um 14:14 schrieb Thomas Gleixner:
> > Roland reported interesting TSC ADJUST register wreckage on his DELL
> > machine, which seems to populate that MSR with a random number generator.
>
> FWIW, I thought about the actual values some more and I don't actually
> think they are all that random any more: the behavior is consistent with
> the bios trying to zero the TSC of all cpus. If I understand this right,
> writing a zero to TSC would cause somewhat small negative values in the
> TSC_ADJ register at boot time, and larger negative values at suspend
> time (at least if the TSC just stops when suspended and isn't reset) -
> exactly what I'm seeing.
> (And of course the different TSC_ADJ values would be because the bios is
> writing TSC without any thoughts of synchronization, just one cpu after
> another).

Yeah, that might be. Still it looks like random nonsense and definitely the
BIOS developers did not follow the secrit boot protocol.

> > Deeper investagation into fixing this wreckage unearthed another special
> > feature which is designed by Intel: Negative TSC adjuste values cause
> > interrupt storms on the TSC deadline timer. Further details in patch 2/2
>
> This actually looks like quite a serious hw bug to me, shouldn't there
> be an errata for such a bug?
>
> And I still don't quite understand why the lockup doesn't happen after a
> warm boot, there must be something different there...

What are the adjust values after a warm boot?

Thanks,

tglx