Regression: unable to boot after commit bd9240a18edf ("x86/apic: Add TSC_DEADLINE quirk due to errata") - Surface Pro 4 SKL

From: Zhang Rui
Date: Mon Nov 27 2017 - 21:09:02 EST


Hi, All,

My Surface Pro 4 is unable to boot any kernel after 4.12. The symptom is
that the kernel freezes during boot, and the last message on the screen
is the one about loading the initrd image. I have bisected it to this commit:

commit bd9240a18edfbfa72e957fc2ba831cf1f13ea073 (refs/bisect/bad)
Author:     Peter Zijlstra <peterz@xxxxxxxxxxxxx>
AuthorDate: Wed May 31 17:52:03 2017 +0200
Commit:     Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CommitDate: Sun Jun 4 21:55:53 2017 +0200

    x86/apic: Add TSC_DEADLINE quirk due to errata

    Due to errata it is possible for the TSC_DEADLINE timer to misbehave
    after using TSC_ADJUST. A microcode update is available to fix this
    situation.

    Avoid using the TSC_DEADLINE timer if it is affected by this issue and
    report the required microcode version.

    [ tglx: Renamed function to apic_check_deadline_errata() ]

    Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
    Cc: kevin.b.stanton@xxxxxxxxx
    Link: http://lkml.kernel.org/r/20170531155306.050849877@xxxxxxxxxxxrg
    Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

Currently, I'm running a v4.14 kernel with the following workaround on top.

---
 arch/x86/kernel/apic/apic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index ff89177..cd419d4 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -596,7 +596,7 @@ static const struct x86_cpu_id deadline_match[] = {
 DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_BROADWELL_CORE, 0x25),
 DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_BROADWELL_GT3E, 0x17),
 
- DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_SKYLAKE_MOBILE, 0xb2),
+// DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_SKYLAKE_MOBILE, 0xb2),
 DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_SKYLAKE_DESKTOP, 0xb2),
 
 DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_KABYLAKE_MOBILE, 0x52),
-- 
2.7.4
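
For reference, here is a rough user-space model of the check described in
the commit message, as far as I understand it. It is only a sketch, not the
kernel code: struct deadline_quirk, quirks[] and tsc_deadline_usable() are
hypothetical names, the table is a simplified stand-in for the part of
deadline_match[] visible in the hunk above, and the model numbers are
written out by hand as assumptions. The only point is that a CPU model
without a table entry is never affected by the quirk, which is what the
workaround relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one deadline_match[] entry. */
struct deadline_quirk {
	uint8_t  model;    /* family 6 model number (assumed values) */
	uint32_t min_rev;  /* first microcode revision carrying the fix */
};

/* Only the entries visible in the hunk above; revisions are from the patch. */
static const struct deadline_quirk quirks[] = {
	{ 0x3d, 0x25 },  /* BROADWELL_CORE */
	{ 0x47, 0x17 },  /* BROADWELL_GT3E */
	{ 0x4e, 0xb2 },  /* SKYLAKE_MOBILE (the entry the workaround disables) */
	{ 0x5e, 0xb2 },  /* SKYLAKE_DESKTOP */
	{ 0x8e, 0x52 },  /* KABYLAKE_MOBILE */
};

/*
 * Return 1 if the TSC_DEADLINE timer would be kept, 0 if the quirk would
 * disable it. A model with no table entry is always kept.
 */
static int tsc_deadline_usable(const struct deadline_quirk *table, size_t n,
			       uint8_t model, uint32_t microcode)
{
	size_t i;

	assert(table != NULL);

	for (i = 0; i < n; i++) {
		if (table[i].model != model)
			continue;
		/* Affected model: usable only with new enough microcode. */
		return microcode >= table[i].min_rev;
	}
	return 1;
}
```

With the SKYLAKE_MOBILE entry commented out, the lookup finds no match for
that model, so TSC_DEADLINE stays enabled whatever the microcode revision is.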

I'm reading the related code but have not figured out the root cause yet.
Any suggestions?

thanks,
rui