Re: perf sched record hangs machine

From: Ingo Molnar
Date: Wed Sep 23 2009 - 05:20:46 EST



* Chris Malley <mail@xxxxxxxxxxxxxxxxx> wrote:

> 2009/9/23 Cyrill Gorcunov <gorcunov@xxxxxxxxx>:
> >
> > Btw, meanwhile Chris may try passing the lapic boot option in an
> > attempt to re-enable the apic via msr registers. Also (iirc) I feel we
> > may be hiding errors if a complete noop apic is used, since I believe
> > we need to check under which conditions a particular operation is
> > called. When the apic is disabled it means we've switched to UP mode,
> > and inter-cpu interrupts are under suspicion too. Will take a look in
> > ~6 hours ;)
> >
>
> Hi Cyrill
>
> Heh, yes that just occurred to me as well. With the lapic boot option
> I can't reproduce the problem, and get a good recording every time.
> Don't know why the BIOS had disabled it (can't see any specific
> option).

It would still be important to fix the crash - there are boxes where
lapics are disabled permanently and cannot be re-enabled. (Plus most
people don't touch their defaults and don't add funky boot options - so
crashing is not an option.)

I have such a test-box:

[ 0.000000] Using APIC driver default
[ 0.000000] ACPI: PM-Timer IO Port: 0x8008
[ 0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
[ 0.000000] Local APIC disabled by BIOS -- reenabling.
[ 0.000000] Could not enable APIC!
[ 0.000000] APIC: disable apic facility

Btw., perf events can work even without a lapic (albeit without
NMI-driven sampling):

[ 0.052051] Performance Events:
[ 0.055138] no APIC, boot with the "lapic" boot parameter to force-enable it.
[ 0.056014] no hardware sampling interrupt available.
[ 0.060014] p6 PMU driver.
[ 0.062955] ... version: 0
[ 0.064014] ... bit width: 32
[ 0.068014] ... generic registers: 2
[ 0.072015] ... value mask: 00000000ffffffff
[ 0.076014] ... max period: 000000007fffffff
[ 0.080014] ... fixed-purpose events: 0
[ 0.084014] ... event mask: 0000000000000003

That's what it did on your box too:

[ 0.013679] Performance Events:
[ 0.013705] no APIC, boot with the "lapic" boot parameter to force-enable it.
[ 0.013783] no hardware sampling interrupt available.
[ 0.013826] p6 PMU driver.
[ 0.013882] ... version: 0
[ 0.013922] ... bit width: 32
[ 0.013962] ... generic registers: 2
[ 0.014002] ... value mask: 00000000ffffffff
[ 0.014045] ... max period: 000000007fffffff
[ 0.014088] ... fixed-purpose events: 0
[ 0.014128] ... event mask: 0000000000000003
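
The boot messages above can also be checked mechanically. Here is a minimal Python sketch that scans a saved boot log for the two "no APIC" / "no hardware sampling interrupt" lines and the PMU driver name. The function name and the dict keys are made up for illustration; the marker strings are copied from the logs quoted in this mail and may well differ on other kernel versions:

```python
import re

# Marker strings copied from the boot logs quoted in this mail.
NO_APIC_MARKER = 'no APIC, boot with the "lapic" boot parameter'
NO_SAMPLING_MARKER = "no hardware sampling interrupt available"
# Captures the token right before "PMU driver.", e.g. "p6".
PMU_DRIVER_RE = re.compile(r"(\S+) PMU driver\.")


def summarize_perf_boot_log(log_text):
    """Report which of the perf boot messages appear in log_text.

    Hypothetical helper: it only looks for the literal messages above
    and makes no claim about what their absence means.
    """
    summary = {
        "no_apic_reported": False,
        "no_sampling_irq_reported": False,
        "pmu_driver": None,
    }
    for line in log_text.splitlines():
        if NO_APIC_MARKER in line:
            summary["no_apic_reported"] = True
        if NO_SAMPLING_MARKER in line:
            summary["no_sampling_irq_reported"] = True
        match = PMU_DRIVER_RE.search(line)
        if match:
            summary["pmu_driver"] = match.group(1)
    return summary
```

Feeding it the text of a saved dmesg dump (for example the contents of a file written by "dmesg > boot.log") returns the dict; on both boxes above all three fields would be set.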

Unfortunately I cannot reproduce the crash you've been seeing (but I'm
quite sure it's due to self-IPI not working properly with the dummy lapic).
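
To illustrate why a silently ignored self-IPI would be a problem, here is a toy Python model. It is not kernel code, and the class and method names are invented for illustration. It assumes that deferred work is only ever processed from the IPI handler, which is an assumption and not something confirmed in this thread:

```python
from collections import deque


class ToyCpu:
    """Toy model: work is queued, then a self-IPI is expected to drain it."""

    def __init__(self, apic_delivers_ipi):
        # True models a working lapic, False models a no-op (dummy) one.
        self.apic_delivers_ipi = apic_delivers_ipi
        self.pending = deque()
        self.completed = []

    def _ipi_handler(self):
        # Runs only when the self-IPI is actually delivered.
        while self.pending:
            self.completed.append(self.pending.popleft())

    def send_self_ipi(self):
        # A no-op apic drops the request without reporting any error.
        if self.apic_delivers_ipi:
            self._ipi_handler()

    def defer(self, item):
        # Queue the work and rely on the self-IPI to process it.
        self.pending.append(item)
        self.send_self_ipi()
```

With ToyCpu(True) a deferred item ends up in completed right away; with ToyCpu(False) it stays in pending forever and nothing signals the failure.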

Ingo