Re: Kernel Panic on KVM Guests: "Scheduling while atomic: swapper"

From: Gleb Natapov
Date: Tue Aug 23 2011 - 02:39:59 EST


On Wed, Aug 17, 2011 at 10:40:15PM -0500, Iggy Iggy wrote:
> I've started seeing kernel panics on a few of our virtual machines
> after moving them (qemu-kvm, libvirt) off of a box with two Intel Xeon
> X5650 processors (12 cores total) onto one with four AMD Opteron 6174
> processors (48 cores total).
>
> What is odd is that I feel like the panic is moving around on these
> virtual machines. It was only happening on one for a bit and then it
> stopped but started happening on another virtual machine. It also
> doesn't happen all the time but it can also happen frequently. Two
> days of not happening vs every four to six hours. The machine still
> functions to an extent but over time it crawls and needs to be
> destroyed and started back up.
>
> This is the panic:
> Jul 20 06:35:47 test-db kernel: [10881.413875] BUG: scheduling while
> atomic: swapper/0/0x00010000
> Jul 20 06:35:47 test-db kernel: [10881.414184] Modules linked in:
> nf_conntrack_ftp i2c_piix4 i2c_core joydev virtio_net virtio_balloon
> virtio_blk virtio_pci virtio_ring virtio [last unloaded:
> scsi_wait_scan]
> Jul 20 06:35:47 test-db kernel: [10881.414196] Pid: 0, comm: swapper
> Not tainted 2.6.35.11-83.fc14.x86_64 #1
> Jul 20 06:35:47 test-db kernel: [10881.414198] Call Trace:
> Jul 20 06:35:47 test-db kernel: [10881.414205] [<ffffffff8103ffbe>]
> __schedule_bug+0x5f/0x64
> Jul 20 06:35:47 test-db kernel: [10881.414208] [<ffffffff8146845e>]
> schedule+0xd9/0x5cb
> Jul 20 06:35:47 test-db kernel: [10881.414214] [<ffffffff81072e20>] ?
> hrtimer_start_expires.clone.5+0x1e/0x20
> Jul 20 06:35:47 test-db kernel: [10881.414219] [<ffffffff81008345>]
> cpu_idle+0xca/0xcc
> Jul 20 06:35:47 test-db kernel: [10881.414223] [<ffffffff81451c66>]
> rest_init+0x8a/0x8c
> Jul 20 06:35:47 test-db kernel: [10881.414227] [<ffffffff81ba1c49>]
> start_kernel+0x40b/0x416
> Jul 20 06:35:47 test-db kernel: [10881.414231] [<ffffffff81ba12c6>]
> x86_64_start_reservations+0xb1/0xb5
> Jul 20 06:35:47 test-db kernel: [10881.414234] [<ffffffff81ba13c2>]
> x86_64_start_kernel+0xf8/0x107
>
> The new server is running Scientific Linux 6.0 with kernel
> 2.6.32-131.6.1.el6.x86_64. One of the guests I see this on is running
> Fedora Core 14, kernel 2.6.35.13-92.fc14.x86_64 and the other is
> running Fedora Core 12, kernel 2.6.32.26-175.fc12.x86_64.
>
This is a RHEL bug [1], not an upstream one, and should be reported elsewhere.
Just for the record, the bug is fixed in the latest RHEL kernel.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=683658
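
For anyone reading the quoted trace: the last field of the "scheduling
while atomic" line is the preempt_count value at the time schedule() was
called, printed as comm/pid/preempt_count. Below is a rough sketch of how
that value could be decoded. The bit widths are an assumption taken from
include/linux/hardirq.h as it looked in 2.6.3x-era kernels (8 preempt bits,
8 softirq bits, 10 hardirq bits, 1 NMI bit) and may differ in other trees.
The helper names are made up for illustration, and the log line is joined
back onto one line first.

```python
# Assumed bit layout (2.6.3x-era include/linux/hardirq.h); verify against
# the kernel tree you are actually debugging.
PREEMPT_BITS = 8
SOFTIRQ_BITS = 8
HARDIRQ_BITS = 10
NMI_BITS = 1

PREEMPT_SHIFT = 0
SOFTIRQ_SHIFT = PREEMPT_SHIFT + PREEMPT_BITS
HARDIRQ_SHIFT = SOFTIRQ_SHIFT + SOFTIRQ_BITS
NMI_SHIFT = HARDIRQ_SHIFT + HARDIRQ_BITS


def field(value, shift, bits):
    """Extract a bit field of the given width starting at shift."""
    return (value >> shift) & ((1 << bits) - 1)


def decode_preempt_count(value):
    """Split a preempt_count value into its nesting counters."""
    return {
        "preempt": field(value, PREEMPT_SHIFT, PREEMPT_BITS),
        "softirq": field(value, SOFTIRQ_SHIFT, SOFTIRQ_BITS),
        "hardirq": field(value, HARDIRQ_SHIFT, HARDIRQ_BITS),
        "nmi": field(value, NMI_SHIFT, NMI_BITS),
    }


def parse_bug_line(line):
    """Return (comm, pid, preempt_count) from a 'scheduling while atomic' line.

    Hypothetical helper: splits from the right because comm may itself
    contain a '/' character.
    """
    tail = line.split("scheduling while atomic:", 1)[1].strip()
    comm, pid, count = tail.rsplit("/", 2)
    return comm, int(pid), int(count, 16)


if __name__ == "__main__":
    line = "BUG: scheduling while atomic: swapper/0/0x00010000"
    comm, pid, count = parse_bug_line(line)
    print(comm, pid, decode_preempt_count(count))
```

Under that assumed layout, 0x00010000 has only the hardirq counter set
(to 1), i.e. the idle task (swapper, pid 0) reached schedule() while the
kernel still considered itself inside a hard interrupt.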

--
Gleb.