Re: [BK PATCH] USB changes for 2.5.34

From: Linus Torvalds (torvalds@transmeta.com)
Date: Mon Sep 09 2002 - 21:49:11 EST


Short and sweet:

  If I have a BUG() that I need to remove and replace with a printk()
  (which I did) in order to even be able to debug the dang thing, then
  that BUG() is a mistake. No ifs, buts or maybe's accepted. The one and
  _only_ point of BUG() is to help debugging, and if it doesn't, then it
  should not be there.

Longer:

On Mon, 9 Sep 2002, Matthew Dharm wrote:
>
> (3) Given that we think we've fixed all of these bad initiators, the BUG()
> really is something that should never happen.

You can't have it both ways.

Either it really never happens.

Or it _does_ happen, in which case you want sane debuggable output.

In the latter case, BUG() is the wrong thing to do, exactly because it
tends to bring the system down. In the first case it obviously doesn't
matter and isn't even relevant, and the best thing to do would be to
remove it altogether. In neither case is BUG() useful.

I'm personally in X 99% of the time except for the reasonably rare case
when I'm chasing down some bug I know I can reproduce and I want the
kernel to have access to the console.

And I doubt I'm alone in that. I suspect most people who use Linux in any
interesting situation (and no, I don't think servers are very interesting
from most standpoints) tend to do this. Agreed?

That means that the BUG() output won't even be _visible_ to 99% of all
cases. And if it causes the machine to hang, it is now not only invisible,
it cannot even be gotten hold of some other way.

Which basically means that a driver that uses BUG() for something that
isn't certain to be fatal anyway is a BUGGY driver.

If my X session is hung, that's already bad (most people will give up at
that point). If I can't even log in remotely, that's _disastrous_, since
now the BUG() causes me to not to be able to debug it at all. The BUG()
basically made itself irrlevant.

The fact is, BUG() is almost always the wrong thing to do. And it's
almost _guaranteed_ to be the wrong thing to do in a driver, since a
driver won't know what locks the rest of the kernel is holding, _and_
since it's almost always possible for a driver to try to return an error
code instead.

In short:
 
  Either you want debugging (in which case BUG() is the wrong thing to
  do), or you don't want debugging (in which case BUG() is the wrong thing
  to do). You can choose either, but in neither case is BUG() acceptable.

  BUG() is fine for _fundamental_ problems, where you don't have any other
  choice, and where the machine really is effectively dead anyway. If the
  VM notices that it's lists are corrupt, that's a BUG() thing. We don't
  have much choice. If the scheduler notices that it's running on another
  CPU than it thought it was running on, that's a BUG() thing.

Now, the kernel problem may be that BUG() is a bit too easy to use, and
the alternatives are not. We should fix that. But we shouldn't fix it by
using BUG() in places where it definitely doesn't belong.

                Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Sep 15 2002 - 22:00:20 EST