Re: [DRBD-user] Hanging kernel

From: Steve Thompson
Date: Thu Sep 23 2010 - 08:53:38 EST


On Wed, 22 Sep 2010, Alex Adriaanse wrote:

> I'm administering a server that has frozen three times over the past two days. During these times, it seemed that most processes would all of a sudden start hanging, and I couldn't SSH into the server or even log into the console. I would start seeing messages like "INFO: task kswapd0:28 blocked for more than 120 seconds" on the console shortly after the processes hung. The only way I could get the server to respond again was by resetting it.

This may not apply in your situation, but the only times I have ever seen this (and I've seen it several times), it was due to VM parameters that were inappropriate for the workload. Usually, if you wait long enough (sometimes as much as 20 minutes), the system will recover and continue. What are your values of:

sysctl vm.dirty_ratio
sysctl vm.dirty_background_ratio

I have had success in these situations by setting vm.dirty_ratio to 50 and vm.dirty_background_ratio to 5, but the optimum values are sensitive to your peak load.
I don't believe that this has any direct connection with DRBD.
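
To get a feel for what those percentages mean in practice, here is a minimal sketch that turns a ratio into an approximate amount of dirty page cache. The helper name dirty_limit_mb and the 16384 MB figure are made up for illustration. The calculation is also only a rough upper bound: the kernel applies these percentages to "dirtyable" memory (roughly free plus reclaimable pages), which is smaller than total RAM.

```shell
#!/bin/sh
# Rough estimate of how much dirty (not yet written back) page cache
# a given vm.dirty_* ratio permits.
# Assumption: the memory figure passed in stands in for the kernel's
# "dirtyable" memory, so the result is an approximation, not an exact limit.
dirty_limit_mb() {
    # $1 = memory in MB, $2 = ratio in percent; integer arithmetic only
    echo $(( $1 * $2 / 100 ))
}

# Hypothetical machine with 16384 MB of memory:
dirty_limit_mb 16384 50   # vm.dirty_ratio=50: writers are throttled past this point
dirty_limit_mb 16384 5    # vm.dirty_background_ratio=5: background writeback starts here

# To apply the values at runtime (needs root, lost on reboot):
#   sysctl -w vm.dirty_ratio=50
#   sysctl -w vm.dirty_background_ratio=5
# To keep them across reboots, add the two settings to /etc/sysctl.conf.
```

The wide gap between the two values is the point: background writeback starts early, while processes that are writing are not forced to block until much more dirty data has accumulated.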

Steve
----------------------------------------------------------------------------
Steve Thompson E-mail: smt AT vgersoft DOT com
Voyager Software LLC Web: http://www DOT vgersoft DOT com
39 Smugglers Path VSW Support: support AT vgersoft DOT com
Ithaca, NY 14850
"186,300 miles per second: it's not just a good idea, it's the law"
----------------------------------------------------------------------------