Re: [PATCH] barrier patch set

From: Chris Mason
Date: Wed Mar 31 2004 - 09:27:28 EST


On Wed, 2004-03-31 at 09:03, Stephen C. Tweedie wrote:
> Hi,
>
> On Tue, 2004-03-30 at 23:13, Chris Mason wrote:
>
> > Most database benchmarks are done on scsi, and the blkdev_flush should
> > be a noop there. For IDE based database and mail server benchmarks, the
> > results won't be pretty.
>
> Yep. I'm really not too worried about big database benchmarks -- those
> are very much special cases, using rather specialised storage setup
> (SCSI or FC, striped over lots of small disks rather than fewer large
> ones.) I'm much more concerned about your average LAMP user's mysql
> database, and how to keep performance sane on that.
>
In some cases, it's going to be so much slower that it will look like
the old code wasn't writing the data at all. I don't think there's much
we can do about that.

> > The reiserfs fsync code tries hard to only flush once, so if a commit is
> > done then blkdev_flush isn't called. We might have to do a few other
> > tricks to queue up multiple synchronous ios and only flush once.
>
> Batching is really helpful when you've got lots of threads that can be
> coalesced, yes. ext3 does that for things like mail servers. I'm not
> sure whether the same tricks will apply to the various databases out
> there, though.

We can do better in general when there's more then one process doing an
fsync. reiserfs and ext3 both try to be smart about batching log
commits, but I think we could do more to streamline the data writes.

I'm playing with a few ideas, I'll post more when I've got real code to
back things up.

If there's only one process doing fsyncs, there's not much the kernel
can do except provide an aio fsync call.

-chris




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/