Re: 2.6.20-git13 kernel BUG at /mnt/md0/devel/linux-git/kernel/time/tick-sched.c:168

From: Michal Piotrowski
Date: Wed Feb 21 2007 - 07:19:37 EST


Michal Piotrowski napisaÅ(a):
> On 17/02/07, Alex Riesen <fork0@xxxxxxxxxxxxxxxxxxxxx> wrote:
>> Thomas Gleixner, Sat, Feb 17, 2007 16:14:17 +0100:
>> > On Sat, 2007-02-17 at 15:47 +0100, Alex Riesen wrote:
>> > > > > 164 if (need_resched())
>> > > > > 165 goto end;
>> > > > > 166
>> > > > > 167 cpu = smp_processor_id();
>> > > > > 168 BUG_ON(local_softirq_pending());
>> > > >
>> > > > Hmm, the BUG_ON is inside of an interrupt disabled region, so we
>> should
>> > > > have bailed out early in the need_resched() check above (because
>> we are
>> > > > in the idle task context according to the stack trace).
>> > > >
>> > > > Is this reproducible ?
>> > >
>> > > Seen this too (Ubuntu, P4/ht-SMT, SATA, typed from screen):
>> >
>> > Can you please apply the patch below, so we can at least see, which
>> > softirq is pending. This should trigger independently of hrtimers and
>> > dynticks. You can keep it compiled in and disable it at the kernel
>> > commandline with "nohz=off" and / or "highres=off"
>>
>> It did, only one time:
>>
>> Idle: local softirq pending: 0020<6>USB Universal Host Controller
>> Interface driver v3.0
>>
>
> sudo cat /var/log/messages | grep Idle
> Feb 17 17:35:34 bitis-gabonica kernel: Idle: local softirq pending:
> 0020<6>hdd: ATAPI 48X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache,
> UDMA(33)
> Feb 17 19:20:01 bitis-gabonica kernel: Idle: local softirq pending:
> 0020<3>Idle: local softirq pending: 0020<3>Idle: local softirq
> pending: 0020<7>PM: Removing info for No Bus:vcs7
>
> cat /proc/interrupts
> CPU0 CPU1
> 0: 232838 0 IO-APIC-edge timer
> 1: 401 0 IO-APIC-edge i8042
> 7: 0 0 IO-APIC-edge parport0
> 8: 1 0 IO-APIC-edge rtc
> 9: 1 0 IO-APIC-fasteoi acpi
> 12: 4 0 IO-APIC-edge i8042
> 14: 319 0 IO-APIC-edge ide0
> 15: 1612 0 IO-APIC-edge ide1
> 16: 16494 0 IO-APIC-fasteoi uhci_hcd:usb3, libata
> 17: 4670 0 IO-APIC-fasteoi uhci_hcd:usb1, uhci_hcd:usb4
> 18: 0 0 IO-APIC-fasteoi uhci_hcd:usb2
> 19: 2 0 IO-APIC-fasteoi ehci_hcd:usb5
> 20: 2822 0 IO-APIC-fasteoi Intel ICH5
> 21: 2881 0 IO-APIC-fasteoi eth1
> 22: 58 0 IO-APIC-fasteoi eth0
> NMI: 0 0
> LOC: 232562 232846
> ERR: 0
> MIS: 0
>
> I can confirm that this is ICH5 SATA controller problem.

Here is something interesting

cat /var/log/messages | tail -n 300 | grep NOHZ
Feb 20 20:09:39 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 20 20:09:39 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 20 21:10:01 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 20 23:20:01 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:10:28 bitis-gabonica kernel: NOHZ: local_softirq_pending 02
Feb 21 05:10:46 bitis-gabonica kernel: NOHZ: local_softirq_pending 02
^^^^^^^^^

Feb 21 05:10:57 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:11:48 bitis-gabonica kernel: NOHZ: local_softirq_pending 02
Feb 21 05:12:02 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:12:02 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:12:23 bitis-gabonica kernel: NOHZ: local_softirq_pending 02
Feb 21 05:12:29 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:13:27 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:13:49 bitis-gabonica kernel: NOHZ: local_softirq_pending 02
Feb 21 05:13:54 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:14:48 bitis-gabonica kernel: NOHZ: local_softirq_pending 02
Feb 21 05:15:11 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:15:13 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:15:16 bitis-gabonica kernel: NOHZ: local_softirq_pending 02
Feb 21 05:16:18 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:16:31 bitis-gabonica kernel: NOHZ: local_softirq_pending 02
Feb 21 05:17:03 bitis-gabonica kernel: NOHZ: local_softirq_pending 02
Feb 21 05:17:06 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:17:09 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:17:12 bitis-gabonica kernel: NOHZ: local_softirq_pending 02
Feb 21 05:17:17 bitis-gabonica kernel: NOHZ: local_softirq_pending 20
Feb 21 05:17:18 bitis-gabonica kernel: NOHZ: local_softirq_pending 02
^^^^^^^^
23 times in 7 minutes.


Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/