Re: divide by zero error: find busiest group on kernel 2.6.38.4

From: Urban Loesch
Date: Mon Dec 19 2011 - 03:02:13 EST


Hi,

On 17.12.2011 00:14, Shawn Bohrer wrote:
On Sun, Dec 04, 2011 at 03:00:00PM +0100, Urban Loesch wrote:
I'm running a DELL PE R610 with kernel
2.6.38.4 patched with linux vserver version vs2.3.0.37-rc15 from
http://linux-vserver.org.

The server ran fine for about 220 days without any problems.
But last night there was a kernel panic and the server hung completely.

Thanks to netconsole I got the following error on my syslog server:


2011-12-04 00:32:16 divide error: 0000 [#1]
2011-12-04 00:32:16 SMP
<snip>
2011-12-04 00:32:16 Pid: 0, comm: kworker/0:1 Not tainted
2.6.38.4-vs2.3.0.37-rc15-rol-em64t #1
2011-12-04 00:32:16
2011-12-04 00:32:16 Dell Inc. PowerEdge R610
2011-12-04 00:32:16 /
2011-12-04 00:32:16 0F0XJ6
2011-12-04 00:32:16
2011-12-04 00:32:16 RIP: 0010:[<ffffffff8103abb8>]
2011-12-04 00:32:16 [<ffffffff8103abb8>]
find_busiest_group+0x428/0xdd0

This looks like the same issue as:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=636797
and
https://bugs.launchpad.net/linux/+bug/614853

In theory there is also a bugzilla.kernel.org ticket on this issue,
though bugzilla.kernel.org is still down:

https://bugzilla.kernel.org/show_bug.cgi?id=16991

Debian and Ubuntu have papered over this bug by skipping the divide if
cpu_power is 0.
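
For illustration, that workaround amounts to a guard in front of the
division. The following is only a minimal Python model of the idea, not
the distributions' actual patch: the function name avg_load, the value
of SCHED_LOAD_SCALE and the choice to return 0 when the divide is
skipped are assumptions made for this sketch.

```python
# Assumed scale constant, for illustration only.
SCHED_LOAD_SCALE = 1024


def avg_load(group_load: int, cpu_power: int) -> int:
    """Return the scaled load per unit of cpu_power.

    Hypothetical helper: skips the divide when cpu_power is 0, which
    avoids the divide error but does not explain why cpu_power is 0.
    """
    if cpu_power == 0:
        # Guard: skip the division entirely (returning 0 is an assumption).
        return 0
    return (group_load * SCHED_LOAD_SCALE) // cpu_power
```

A guard like this hides the symptom only; whatever drives cpu_power to
0 is still there.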

I searched the archives but didn't find any related information.
Do you have any idea what this error could be, and whether it is fixed
in kernel 3.1?

To my knowledge the cause of this bug is still unknown. It is
possible it is fixed in newer kernels, but it is hard to tell since it
doesn't seem to occur until you have reached 200+ days of uptime.


I'm not sure whether this describes exactly the same problem:

http://comments.gmane.org/gmane.linux.kernel/1132515

Patch:
http://git.kernel.org/?p=linux/kernel/git/tip/tip.git;a=commitdiff;h=4cecf6d401a01d054afc1e5f605bcbfe553cb9b9

The issue described there was fixed in 3.1.5:
http://www.kernel.org/pub/linux/kernel/v3.0/ChangeLog-3.1.5
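
As far as I understand the linked patch, it is about a 64-bit overflow
when converting TSC cycles to nanoseconds, which would fit the 200+ days
of uptime. Below is a small Python model of that arithmetic, not the
kernel code itself: the 10-bit shift, the 2 GHz clock and all function
names are assumptions made for the sketch, and the masking only imitates
64-bit unsigned wraparound.

```python
MASK64 = (1 << 64) - 1   # imitate 64-bit unsigned wraparound
SHIFT = 10               # assumed fixed-point shift of the scale factor


def scale_for(cpu_khz: int) -> int:
    """Fixed-point ns-per-cycle factor for a given clock (hypothetical helper)."""
    return (1_000_000 << SHIFT) // cpu_khz


def cycles_to_ns_naive(cyc: int, scale: int) -> int:
    """Multiply first, then shift: the product can wrap at 64 bits."""
    return ((cyc * scale) & MASK64) >> SHIFT


def cycles_to_ns_split(cyc: int, scale: int) -> int:
    """Split cyc into quotient and remainder of 2**SHIFT before scaling,
    so no intermediate product needs more than 64 bits here."""
    quot = cyc >> SHIFT
    rem = cyc & ((1 << SHIFT) - 1)
    return (quot * scale + ((rem * scale) >> SHIFT)) & MASK64


if __name__ == "__main__":
    scale = scale_for(2_000_000)          # assumed 2 GHz TSC
    for days in (100, 220):
        cyc = days * 86400 * 2_000_000_000
        print(days, cycles_to_ns_naive(cyc, scale), cycles_to_ns_split(cyc, scale))
```

With a 10-bit shift the naive product wraps once the result passes
2^54 ns, which is roughly 208 days. After that the naive value is far
too small, so the clock appears to jump backwards, while the split
version keeps counting.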

--
Shawn


Thanks
Urban