Re: Local DoS (was: Strange 'zombie' problem both in 2.4 and 2.6)
From: David Lang
Date: Mon Jun 14 2004 - 20:34:38 EST
On Mon, 14 Jun 2004, Marcelo Tosatti wrote:
> On Mon, Jun 14, 2004 at 10:01:53AM -0700, David Lang wrote:
>> I think I may be running into the same (or a similar) issue with a
>> workload that forks heavily (~3500 forks/sec). What can I do to let the
>> system survive this sort of load?
> Hi David,
> The v2.6.7-mm tree contains a fix for this, adding an rlimit for
> pending signals.
I'll have to give this a try.
> Can you describe the problem you are seeing in more detail?
I have a stress test I am running on a dual Opteron 1.4GHz box that
receives a network connection, forks a new process, does a little bit of
network traffic, and then the child exits. When I hammer this I get ~3500
connections/sec (with a significant amount of spare CPU; I'm limited by
my load boxes), but after a few seconds (8-10) something happens and the
parent stops receiving the SIGCHLD signals. If I attach strace to the
parent process, the signals start arriving again and everything works for
a little bit longer before the cycle repeats.
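
For anyone who wants to reproduce the general shape of this workload, here
is a minimal sketch of a parent that forks short-lived children and reaps
them on SIGCHLD. It is not the actual test program: the names on_sigchld,
reap_exited and run_burst are made up for illustration, there is no network
traffic, and it is written in Python rather than C to keep it short. It
relies on one thing that is well established: standard signals such as
SIGCHLD do not queue, so several child exits can collapse into a single
delivery, and the parent has to loop on waitpid() with WNOHANG instead of
reaping one child per signal.

```python
import os
import signal
import time

child_exited = False  # set by the handler, cleared by the main loop


def on_sigchld(signum, frame):
    """Only record that at least one child exited; reap in the main loop."""
    global child_exited
    child_exited = True


def reap_exited():
    """Collect every child that has already exited; return how many."""
    count = 0
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        count += 1
    return count


def run_burst(children, timeout=10.0):
    """Fork `children` processes that exit at once; return how many were reaped."""
    global child_exited
    child_exited = False
    signal.signal(signal.SIGCHLD, on_sigchld)
    try:
        for _ in range(children):
            if os.fork() == 0:
                os._exit(0)  # child: exit immediately, no cleanup
        reaped = 0
        deadline = time.monotonic() + timeout
        while reaped < children and time.monotonic() < deadline:
            if child_exited:
                # Clear the flag before reaping so a later exit sets it again.
                child_exited = False
                reaped += reap_exited()
            else:
                time.sleep(0.01)
        return reaped
    finally:
        signal.signal(signal.SIGCHLD, signal.SIG_DFL)


if __name__ == "__main__":
    print("reaped", run_burst(200), "of 200 children")
```

If a loop like this still leaves zombies behind at high fork rates, the
parent's reaping logic is probably not at fault, which would fit the
observation that attaching strace gets signals flowing again.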
If I only hit it with ~10,000 connections and then pause, the box survives
indefinitely.
Running the same test on a dual Athlon MP 2200+ I get ~2500 connections/sec
and it has no problems. I just compiled a 32-bit kernel for the
Opteron and get ~3300 connections/sec (with no idle CPU time) and the box
doesn't lock up.
I don't know if this is because it's just below the threshold of the
problem or if there is a bug in the 64-bit kernel (or both).
I'm currently trying to tweak the 32-bit Opteron kernel to get a smidge
more speed out of it, to see if getting back up to the same speed starts
triggering the problem again.
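
On the pending-signal rlimit mentioned at the top of this message: a small
sketch for checking what limit a process is running under. This assumes the
limit is the one mainline kernels later expose as RLIMIT_SIGPENDING (the
-mm patch may differ in detail), and the helper names sigpending_limit and
describe are made up for illustration. On a kernel or Python build without
that constant the lookup returns None instead of failing.

```python
import resource


def sigpending_limit():
    """Return (soft, hard) for the queued-signal limit, or None if unsupported."""
    # Assumption: the limit is exposed as RLIMIT_SIGPENDING on this system.
    which = getattr(resource, "RLIMIT_SIGPENDING", None)
    if which is None:
        return None
    return resource.getrlimit(which)


def describe(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    limits = sigpending_limit()
    if limits is None:
        print("RLIMIT_SIGPENDING is not available here")
    else:
        soft, hard = limits
        print("pending signals: soft=%s hard=%s" % (describe(soft), describe(hard)))
```

Comparing the value on the 64-bit and 32-bit kernels would be one cheap
data point when retesting.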
David Lang
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan