Re: 2.4.18pre4aa1

From: Daniel Phillips (phillips@bonn-fries.net)
Date: Mon Jan 28 2002 - 04:53:25 EST


On January 25, 2002 01:09 am, Andrea Arcangeli wrote:
> On Thu, Jan 24, 2002 at 07:27:43AM +0100, Daniel Phillips wrote:
> > On January 24, 2002 06:23 am, rwhron@earthlink.net wrote:
> > > Benchmarks on 2.4.18pre4aa1 and lots of other kernels at:
> > > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html
> >
> > "dbench 64, 128, 192 on ext2fs. dbench may not be the best I/O
> > benchmark, but it does create a high load, and may put some pressure on
> > the cpu and i/o schedulers. Each dbench process creates about 21
> > megabytes worth of files, so disk usage is 1.3 GB, 2.6 GB and 4.0 GB
> > for the dbench runs. Big enough so the tests cannot run from the
> > buffer/page caches on this box."
> >
> > Thanks kindly for the testing, but please don't use dbench any more for
> > benchmarks. If you are testing stability, fine, but dbench throughput
> > numbers are not good for much more than wild goose chases.
> >
> > Even when mostly uncached, dbench still produces flaky results.
>
> this is not entirely true. dbench has a value.

Yes, but not for benchmarks. It has value only as a stability test - while
it may in some cases provide some general indication of performance, its
variance is far too large, even under controlled conditions, for it to have
any value as a benchmark. I'm surprised you'd even suggest this.

Andrea, please, if we want good benchmarks let's at least be clear on what
tools benchmarkers should/should not be using.

> the only problem with
> dbench is that you can trivially cheat and change the kernel in a broken
> way, but optimal _only_ for dbench, just to get stellar dbench numbers,

No, this is not the only problem: dbench is just plain *flaky*. You don't
appear to be clear on why. In short, dbench has two main flaws:

  - It's extremely sensitive to scheduling. If one process happens to get
    ahead, more of its files stay cached and it pulls even further ahead.
    The benchmark completes much more quickly in this case, whereas if all
    processes progress at nearly the same rate (by chance) it runs more
    slowly.

  - It can happen (again by chance) that dbench files get deleted while still
    in cache, and this process completes in a fraction of the time that real
    disk IO would require.
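
As a rough illustration of the first point, here is a toy model. It is not
dbench and not the kernel's page cache: it assumes a plain LRU cache of fixed
size, and every name and number in it (count_misses, serial_schedule,
lockstep_schedule, the sizes) is made up for the sketch. Each process
re-reads its own set of blocks several times, and only the order in which
the processes get to run changes.

```python
from collections import OrderedDict


def count_misses(accesses, cache_blocks):
    """Replay block accesses through an LRU cache and count the misses."""
    cache = OrderedDict()
    misses = 0
    for key in accesses:
        if key in cache:
            cache.move_to_end(key)         # hit: mark as most recently used
        else:
            misses += 1                    # miss: stands in for a disk read
            cache[key] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used
    return misses


def serial_schedule(nproc, blocks, rounds):
    """One process runs to completion before the next starts (one gets ahead)."""
    for p in range(nproc):
        for _ in range(rounds):
            for b in range(blocks):
                yield (p, b)


def lockstep_schedule(nproc, blocks, rounds):
    """All processes advance at the same rate, one block at a time."""
    for _ in range(rounds):
        for b in range(blocks):
            for p in range(nproc):
                yield (p, b)


# Hypothetical sizes: one process's blocks fit in cache, all four do not.
NPROC, BLOCKS, ROUNDS, CACHE = 4, 100, 5, 150

serial_misses = count_misses(serial_schedule(NPROC, BLOCKS, ROUNDS), CACHE)
lockstep_misses = count_misses(lockstep_schedule(NPROC, BLOCKS, ROUNDS), CACHE)

print("serial  :", serial_misses)    # 400: only the first pass of each process misses
print("lockstep:", lockstep_misses)  # 2000: every access misses
```

The same total work costs five times as many simulated disk reads when the
processes happen to stay in step. Real schedules fall somewhere between the
two extremes, which is one way run-to-run variance of this kind can arise.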

I've seen successive runs of dbench *under identical conditions* (that is,
from a clean reboot etc.) vary by as much as 30%. Others report even greater
variance. Can we please agree that dbench is useless for benchmarks?
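
For anyone who wants to check the spread on their own box, a minimal sketch
of the arithmetic follows. The function name spread and the throughput
figures are invented for the example; they are not measurements from any
kernel. It reports the max-to-min range and the sample standard deviation,
each as a percentage of the mean.

```python
from statistics import mean, stdev


def spread(throughputs):
    """Summarize run-to-run variation of repeated benchmark results."""
    avg = mean(throughputs)
    return {
        "mean": avg,
        # Distance between best and worst run, relative to the mean.
        "range_pct": 100.0 * (max(throughputs) - min(throughputs)) / avg,
        # Coefficient of variation (sample standard deviation / mean).
        "cv_pct": 100.0 * stdev(throughputs) / avg,
    }


# Made-up throughput numbers (MB/s) from three identical runs.
runs = [40.0, 50.0, 60.0]
result = spread(runs)
print(result)  # mean 50.0, range_pct 40.0, cv_pct 20.0
```

If the range between identical runs is larger than the difference between
two kernels, the comparison says nothing about the kernels.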

-- 
Daniel