Re: Linux 2.6.25-rc4

From: Bartlomiej Zolnierkiewicz
Date: Wed Mar 19 2008 - 16:41:04 EST


On Wednesday 19 March 2008, Linus Torvalds wrote:
>
> On Wed, 19 Mar 2008, Bartlomiej Zolnierkiewicz wrote:
> >
> > Please take a closer look - my "real fix" _only_ affects WIN_SMART command
> > and _not_ vendor special ones (no, there are none vendor special commands
> > using the same opcode).
>
> Oh, trust me, I understand your fix.
>
> But the point is that your fix
>
> - doesn't fix any other commands (and there are tons of other commands
> people may be using)
>
> - and is totally pointless and isn't *needed* if we just fix this
> properly (ie we've never cared before, so why should we start caring
> now?).
>
> See?
>
> Fixing this properly is exactly what my patch did - it made the whole core
> engine not really even care. Just like it used to.
>
> > Call taskfile crap all you want [ ... ]
>
> You're totally not listening or understanding.
>
> I'm not calling taskfile crap as a concept.
>
> I'm calling fragile code that fails unnecessarily crap.
>
> See the difference?
>
> The old "drive_cmd_intr()" code (that you deleted) was fundamentally more
> robust than the taskfile code you replaced it with.
>
> And that's not just an opinion. It's a *fact*. It's why this bug showed up
> in the first place. Code that used to work because it didn't even care all
> that deeply about every single detail being set up just the way it wanted
> got replaced by code that was fragile.
>
> Do you see the difference?
>
> If we have a choice between robust code that just "does the right thing"
> even in the face of problems, and code that "stops working when you look
> at it wrong", which one should we choose? Which one is the better code?
> Which one is crap, and which one isn't?
>
> The fact is, the old "drive_cmd_intr()" code was simply more robust.
>
> So this is why I feel so strongly about this. Robust code is just about
> the most important thing we can have in the kernel. Bugs will always
> happen, but when they happen, we shouldn't just fall over dead. And that's
> exactly what code robustness is all about.
>
> So we should make the DRQ/ERR status bit handling robust in the face of
> hardware and software that does odd things. Because we definitely have
> seen cases of both. Sometimes it is hw that doesn't work the way the spec
> requires, sometimes it's software that doesn't follow the spec 100%. It
> doesn't matter.
>
> And once the status bit handling is robust (like it _used_ to be!), only
> *then* should we even ask ourself if we even care about having some random
> tfargs.data_phase value for specific SMART command sub-cases.
>
> My personal opinion is obviously that we simply shouldn't even care (since
> we should now know that the driver doesn't care one way or the other), but
> if you want to apply your patch despite it having absolutely no meaning,
> that's your choice.
>
> But the absolute first thing we should do is to make the code at least as
> robust as it used to be, and preferably aim _higher_ in robustness rather
> than lower!

OK, I got the point.

Your patch is more robust and we should go with it
(and thanks for fixing this bug!).

Bart
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/