Re: Linux 2.6.29

From: Alan Cox
Date: Tue Mar 24 2009 - 08:24:34 EST


> The proposed tweak to kjournald is a bad fix - partly because it will
> elevate the priority of vast amounts of IO whose priority we don't _want_
> elevated.

It's a huge improvement in practice because it both fixes the stupid
stalls and smooths out the rest of the I/O traffic. I spend a lot of my
time looking at what the disk driver is getting fed and it's not a good
mix. Even more revealing is the noop scheduler and the fact that it
frequently outperforms all the fancy I/O scheduling we do, even on
relatively dumb hardware (as well as showing how mixed up our I/O
patterns currently are).

> But mainly because the problem lies elsewhere - in an area of contention
> between the committing and running transactions which we knowingly and
> reluctantly added to fix a bug in

The problem emerged around 2007, not 2002, so it's not that simple.

> The number of people who can be looked at to do serious ext3/JBD work is
> pretty small now. Ted, Stephen and I got old and died. Jan does good work
> but is spread thinly.

Which is all the more reason to use a temporary fix in the meantime so
the OS is usable. I think it's pretty poor that for over a year those in
the know who need a well-performing system have had to apply trivial
out-of-tree patches, rejected on the basis that "eventually like maybe
whenever perhaps we'll possibly some day you know consider fixing this,
but don't hold your breath".

There is a second reason to do this: if ext4 is the future then it is far
better to fix this stuff in ext4 properly and leave ext3 clear of
extremely invasive, high-risk fixes when a quick bandaid will do just fine
for the remaining lifetime of fs/jbd.

Also note that kjournald is only one of the afflicted threads - the same
is true of the crypto threads and of the VM writeback. Also note the other
point about the disk scheduler defaults being terrible for some streaming
I/O patterns; the patch for that is also stuck in bugzilla.

If picking "no-op" speeds up my generic x86 box with random onboard SATA,
we are doing something very non-optimal.
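
For anyone wanting to repeat the noop comparison, here is a rough sketch of
inspecting and switching the per-device scheduler through sysfs. It assumes
the usual /sys/block/<dev>/queue/scheduler file, where the active scheduler
is shown in square brackets. The helper names and the "sda" device are
illustrative only, and writing the file needs root.

```python
def scheduler_path(device):
    """Build the sysfs path for a block device's scheduler file."""
    return "/sys/block/%s/queue/scheduler" % device


def parse_scheduler_line(line):
    """Split a line such as 'noop anticipatory deadline [cfq]' into
    (active, available). The bracketed entry is the active scheduler."""
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def current_scheduler(device):
    """Read the active and available schedulers for a device."""
    with open(scheduler_path(device)) as f:
        return parse_scheduler_line(f.read())


def set_scheduler(device, name):
    """Select a scheduler by writing its name to sysfs (requires root)."""
    with open(scheduler_path(device), "w") as f:
        f.write(name)


if __name__ == "__main__":
    # "sda" is an assumed device name; adjust for the machine under test.
    print(current_scheduler("sda"))
```

Calling set_scheduler("sda", "noop") before a run and restoring the
previous value afterwards gives a like-for-like comparison on the same
disk, provided the workload is otherwise unchanged.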