Re: Linux 2.6.29

From: Andrew Morton
Date: Thu Mar 26 2009 - 21:28:30 EST


On Thu, 26 Mar 2009 18:03:15 -0700 (PDT) Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

>
>
> On Thu, 26 Mar 2009, Andrew Morton wrote:
> >
> > userspace can get closer than the kernel can.
>
> Andrew, that's SIMPLY NOT TRUE.
>
> You state that without any amount of data to back it up, as if it was some
> kind of truism. It's not.

I've seen you repeatedly fiddle the in-kernel defaults based on
in-field experience. That could just as easily have been done in
initscripts by distros, and much more effectively because it doesn't
need a new kernel. That's data.

The fact that this hasn't even been _attempted_ (afaik) is deplorable.

Why does everyone just sit around waiting for the kernel to put a new
value into two magic numbers which userspace scripts could have set?

My /etc/rc.local has been tweaking dirty_ratio, dirty_background_ratio
and swappiness for many years. I guess I'm just incredibly advanced.
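
For illustration, a boot-time tweak of the kind described above could be sketched as follows. This is only a sketch: the numeric values are hypothetical (the message does not say which values were used), and the helper names `apply_vm_settings` and `read_vm_settings` are made up for this example. Writing to the real /proc/sys/vm requires root, so the demo at the bottom runs against a scratch directory instead.

```python
import os
import tempfile

# Hypothetical values, chosen only for illustration; the message does not
# state which numbers were actually used.
EXAMPLE_SETTINGS = {
    "dirty_ratio": 10,
    "dirty_background_ratio": 5,
    "swappiness": 60,
}


def apply_vm_settings(settings, sysctl_dir="/proc/sys/vm"):
    """Write each value to <sysctl_dir>/<name> and return the names written."""
    written = []
    for name, value in settings.items():
        path = os.path.join(sysctl_dir, name)
        with open(path, "w") as f:
            f.write(f"{value}\n")
        written.append(name)
    return written


def read_vm_settings(names, sysctl_dir="/proc/sys/vm"):
    """Read the named tunables back as integers."""
    result = {}
    for name in names:
        with open(os.path.join(sysctl_dir, name)) as f:
            result[name] = int(f.read().strip())
    return result


if __name__ == "__main__":
    # Dry run against a temporary directory so the sketch is safe to execute
    # without root and without touching the running system.
    with tempfile.TemporaryDirectory() as scratch:
        apply_vm_settings(EXAMPLE_SETTINGS, scratch)
        print(read_vm_settings(EXAMPLE_SETTINGS, scratch))
```

Run as root from an initscript with the default `sysctl_dir`, the same call would change the live values without needing a new kernel, which is the point being argued.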

> Everybody accepts that if you've written a 20MB file and then call
> "fsync()" on it, it's going to take a while. But when you've written a 2kB
> file, and "fsync()" takes 20 seconds, because somebody else is just
> writing normally, _that_ is a bug. And it is actually almost totally
> unrelated to the whole 'dirty_limit' thing.
>
> At least it _should_ be.

That's different. It's inherent JBD/ext3-ordered brain damage.
Unfixable without turning the fs into something which just isn't jbd/ext3
any more. data=writeback is a workaround, with the obvious integrity
issues.
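
The small-file fsync() stall under discussion can be observed with a probe along these lines. It is a minimal sketch, not a benchmark: the function name `time_small_fsync` and the probe file name are invented for the example, and the 2kB size simply mirrors the figure quoted above. To see the effect, it would have to run on an ext3 data=ordered filesystem while another process is writing heavily.

```python
import os
import tempfile
import time


def time_small_fsync(directory, size=2048):
    """Write `size` bytes to a scratch file in `directory`, then return
    how many seconds the fsync() call alone took."""
    path = os.path.join(directory, "fsync_probe.tmp")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"\0" * size)
        start = time.monotonic()
        os.fsync(fd)  # blocks until the data has been pushed to storage
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)  # remove the probe file again
    return elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(f"fsync of 2kB took {time_small_fsync(scratch):.3f}s")
```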

The JBD journal is a massive designed-in contention point. It's why
for several years I've been telling anyone who will listen that we need
a new fs. Hopefully our response to all these problems will soon be
"did you try btrfs?".
