Re: Xterm Hangs - Possible scheduler defect?

From: Chad N. Tindel
Date: Thu Feb 24 2005 - 12:36:33 EST


> Why would anyone write a thread than uses exactly 100% of one cpu?
> It seems wrong to me. It is unsafe if they really need that much
> processing power, what if an interrupt happens? Then they get slightly less.
> If they got a 10% faster cpu, would this thread suddenly drop to 90%
> usage (and no problems with kernel threads)?
> If it stays at 100% then that suggests they are using some
> sort of busy waiting which is bad programming. Get a better hw
> device that will provide an interrupt at the right time, and write a
> driver for
> that. Or find some spot in the code where a small delay in acceptable,
> and set a short timer and sleep on it. It doesn't take much to get this
> kernel thread going.

I would make the following assertion for any kernel:

No single userspace thread of execution running on an SMP system should be
able to hose a box by going CPU-bound, bug in the software or no bug. Any
kernel should be able to handle this case and shift general work over to
other processors.

While I can't speak for all commercial Unixes, I know that HP-UX handles this
case just fine. I'd be extremely surprised if Solaris and AIX didn't handle
it fine too. What I can't understand is why you want to cop-out and say
"Oh well this is just a bug in the application... the programmer shouldn't
shoot himself in the foot." If that were the attitude that kernel programmers
had, why have the kernel send SIGSEGV when applications reference invalid
memory? Why not just let them corrupt the memory of other apps and possibly
bring the whole system down?

It is the kernel's job to protect itself and userspace applications from
runaway applications whenever possible. While this might not be possible for
this case on a UP box, it certainly is for an SMP box.

Chad
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/