Re: kernel BUG at mm/rmap.c:483!

From: Ammar T. Al-Sayegh
Date: Wed Feb 23 2005 - 17:51:17 EST


----- Original Message ----- From: "Hugh Dickins" <hugh@xxxxxxxxxxx>
To: "Ammar T. Al-Sayegh" <ammar@xxxxxxxxx>
Cc: <linux-kernel@xxxxxxxxxxxxxxx>
Sent: Wednesday, February 23, 2005 4:31 PM
Subject: Re: kernel BUG at mm/rmap.c:483!


On Wed, 23 Feb 2005, Ammar T. Al-Sayegh wrote:

Any suggestion on what else I can do to mitigate this
problem?

The first thing to do is to give memtest86 a good (say
overnight) run. Many of the rmap.c BUG reporters have
subsequently found memtest86 failures, and we believe those
instances are accounted for by bad memory. And if that's so
in your case, you don't really want to be running 2.4 on it.

But not all cases could be accounted in that way. If you
report back that memtest86 ran cleanly, then I'll have to
rework a debug patch against your Fedora RC3 kernel to try
to give us more info - though quite possibly you cannot afford
such experiments on this server, and will revert to 2.4 for now.

The problem is that my server is already in production
mode. I'm running great portion of my business on it,
where there is very little tolerance for downtime.
Because the server is located in a remote datacenter,
every time it goes down it takes several hours to have
someone sent up there to manually reboot it for a hefty
emergency fee. So this bug has already cost me a lot of
money, and I'm worried that it will cost me a lot of my
clients as well if it persists.

Remote hands are rather expensive, so it will cost me
$100/hr to have someone runs memtest86 on my server
since I can't perform it remotely. I'll do it though
since that's your recommendation for the time being.
Hope it will not take more than an hour to run the
test, and hope it turns out as bad memory modules as
you expect because I hate to downgrade after all the
time and money I expended on the upgrade.


-ammar

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/