Re: RFC: Fix kdump failed with 'notsc'

From: Wei, Jiangang
Date: Mon Jun 27 2016 - 02:45:53 EST


Hi Alok,

Thanks for your reply.

CC Ingo and fenghua.yu

I found the culprit by bisecting.

[root@localhost linux]# git show 522e6646
commit 522e66464467543c0d88d023336eec4df03ad40b
Author: Fenghua Yu <fenghua.yu@xxxxxxxxx>
Date: Wed Oct 23 18:30:12 2013 -0700

x86/apic: Disable I/O APIC before shutdown of the local APIC

In reboot and crash path, when we shut down the local APIC, the I/O APIC is
still active. This may cause issues because external interrupts
can still come in and disturb the local APIC during shutdown process.

To quiet external interrupts, disable I/O APIC before shutdown local APIC.

Signed-off-by: Fenghua Yu <fenghua.yu@xxxxxxxxx>
Link:
http://lkml.kernel.org/r/1382578212-4677-1-git-send-email-fenghua.yu@xxxxxxxxx
Cc: <stable@xxxxxxxxxx>
[ I suppose the 'issue' is a hang during shutdown. It's a fine
change nevertheless. ]
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>

When I revert it, so that the local APIC is disabled before the IO-APIC in
native_machine_crash_shutdown(), kdump finishes without any error.
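
To make the two orderings concrete, here is a small user-space model, not
kernel code. The names record(), crash_shutdown_current() and
crash_shutdown_reverted() are made up for illustration; the comments name
the kernel routines I believe are involved (disable_IO_APIC() and
lapic_shutdown()), and everything else in the crash path is left out.

```c
#include <stddef.h>

/* Hypothetical stand-ins: each step only records that it ran. */
enum step { STEP_IOAPIC, STEP_LAPIC };

struct trace {
    enum step steps[2];
    size_t n;
};

static void record(struct trace *t, enum step s)
{
    if (t->n < 2)
        t->steps[t->n++] = s;
}

/* Order since 522e6646: quiet the I/O APIC first, then the local APIC. */
static void crash_shutdown_current(struct trace *t)
{
    record(t, STEP_IOAPIC);   /* models disable_IO_APIC() */
    record(t, STEP_LAPIC);    /* models lapic_shutdown() */
}

/* Order with the commit reverted: local APIC first, then the I/O APIC. */
static void crash_shutdown_reverted(struct trace *t)
{
    record(t, STEP_LAPIC);    /* models lapic_shutdown() */
    record(t, STEP_IOAPIC);   /* models disable_IO_APIC() */
}
```

Only the second ordering lets kdump finish on my machine. The model shows
which calls are swapped; it does not explain why the swap matters.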

Commit 2885432 says 'it still makes sense to quiet IO APIC before
disabling Local APIC', but I cannot find any description of the required
order for disabling the IO-APIC and the Local APIC in the "Intel 64 and
IA-32 Architectures Software Developer's Manual, Volume 3A".
The only reference I found is erratum AVR31 in the "Intel Atom Processor
C2000 Product Family Specification Update".

IMO, it doesn't make sense to change the order of disabling the IO APIC and
the Local APIC just for one particular model (C2000).

Do you have any suggestions for fixing it?
Thanks in advance.

PS:
My machine is a Lenovo QiTianM4340,
and the CPU is an Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz, 4 cores.

Thanks,
wei

On Fri, 2016-06-24 at 10:41 +0000, Alok Kataria wrote:
> Hi Wei,
>
> On Tue, 2016-06-14 at 09:56 +0000, Wei, Jiangang wrote:
> > Hi,
> >
> > When I trigger a kernel crash and specify 'notsc' for the capture kernel,
> > kdump blocks in calibrate_delay_converge().
> >
> > 	/* wait for "start of" clock tick */
> > 	ticks = jiffies;
> > 	while (ticks == jiffies)
> > 		; /* nothing */
> >
> > The reason is that jiffies never changes.
> >
> > The serial console log is as follows:
> > ............
> > [ 0.000000] Linux version 4.7.0-rc2+ (root@xxxxxxxxxxxxxxxxxxxxx)
> > (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #2 SMP Wed Jun
> > 156
> > [ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc2+
> > root=/dev/mapper/centos-root ro rd.lvm.lv=centos/swap
> > vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=256M
> > vconsole.keymap=us console=tty0 console=ttyS0,115200n8 LANG=en_US.UTF-8
> > irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off
> > panic=10 rootflags=nofail acpi_no_memhotplug notsc
> > ............
> > [ 0.000000] tsc: Kernel compiled with CONFIG_X86_TSC, cannot disable
> > TSC completely
> > ............
> > [ 0.000000] clocksource: hpet: mask: 0xffffffff max_cycles:
> > 0xffffffff, max_idle_ns: 133484882848 ns
> > [ 0.000000] tsc: Fast TSC calibration using PIT
> > [ 0.000000] tsc: Detected 3192.714 MHz processor
> > [ 0.000000] Calibrating delay loop...
> >
> > # The last log is raised by calibrate_delay(), which calls
> > calibrate_delay_converge() to compute the lpj value.
> >
> > # So far, I don't know why jiffies stays the same.
> > # But I found two methods that avoid this problem:
> >
> > 1) specify 'lpj=<n>' together with 'notsc'.
> >
> > 2) revert 70de9a9.
> >
> > commit 70de9a97049e0ba79dc040868564408d5ce697f9
> > Author: Alok Kataria <akataria@xxxxxxxxxx>
> > Date: Mon Nov 3 11:18:47 2008 -0800
> >
> > x86: don't use tsc_khz to calculate lpj if notsc is passed
> >
> > Impact: fix udelay when "notsc" boot parameter is passed
> >
> > With notsc passed on commandline, tsc may not be used for
> > udelays, make sure that we do not use tsc_khz to calculate
> > the lpj value in such cases.
> >
> > IMO,
> > the flow for obtaining tsc_khz is as follows:
> > tsc_init()->x86_platform.calibrate_tsc()->native_calibrate_tsc()->quick_pit_calibrate().
> > No code there uses or calls 'rdtsc'.
>
> The intent of that change was to skip calculating the lpj value based on
> the tsc_khz value if notsc is specified. Note that it has nothing to do
> with using rdtsc for tsc frequency calibration; instead, we use the lpj
> value derived from the tsc frequency (tsc_khz) for udelay (see delay_tsc).
>
> If notsc is passed, we skip assigning a value to lpj_fine since tsc is
> no longer used for implementing delay. Instead, we now calibrate the lpj
> value in calibrate_delay, which calls calibrate_delay_converge. Looking
> at calibrate_delay_converge, it expects jiffies to advance. Otherwise
> you will wait endlessly there:
>
> static unsigned long calibrate_delay_converge(void)
> {
> 	...
> 	/* wait for "start of" clock tick */
> 	ticks = jiffies;
> 	while (ticks == jiffies)
> 		; /* nothing */
>
> You should really look at why jiffies is not incrementing.
>
> >
> > Even if 'notsc' is passed, the tsc_khz is credible,
> > and we can get lpj from it.
> >
> > So I want to push a patch to revert 70de9a9.
> > Any comments or suggestions are appreciated.
>
> As mentioned above, reverting change 70de9a9 is wrong and would just be
> papering over the actual issue.
>
> Thanks,
> Alok
>
> >
> > Thanks,
> > wei
> >
> >
> >
> >
> >
>
>
>
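
For reference, regarding the 'lpj=<n>' workaround mentioned in the quoted
mail: a rough value for <n> can be derived from the TSC frequency in the
log (3192.714 MHz, i.e. tsc_khz = 3192714). The sketch below is a user-space
helper with a made-up name, lpj_from_tsc_khz(). It follows the
tsc_khz * 1000 / HZ relation that, as far as I understand, tsc_init() uses
for lpj_fine. The HZ value is an assumption, since CONFIG_HZ does not
appear anywhere in this thread.

```c
#include <stdint.h>

/*
 * Loops per jiffy derived from the TSC frequency:
 * tsc_khz * 1000 gives cycles per second, dividing by hz (ticks per
 * second) gives cycles per tick. hz stands for CONFIG_HZ and is an
 * assumption here. Returns 0 for hz == 0 instead of dividing by zero.
 */
static uint64_t lpj_from_tsc_khz(uint64_t tsc_khz, unsigned int hz)
{
    if (hz == 0)
        return 0;
    return tsc_khz * 1000ULL / hz;
}
```

With tsc_khz = 3192714, this gives 3192714 for HZ=1000 and 12770856 for
HZ=250. Passing such a value only skips the delay calibration; it does not
explain why jiffies stops advancing.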