Re: [BENCHMARK] Lmbench 2.5.54-mm2 (impressive improvements)

From: Andrew Morton (akpm@digeo.com)
Date: Fri Jan 03 2003 - 05:22:54 EST


"David S. Miller" wrote:
>
> On Fri, 2003-01-03 at 01:33, Andrew Morton wrote:
> > I'm sorry, but all you are doing with these tests is discrediting
> > lmbench, AIM9, tiobench and unixbench.
> ...
> > Possibly, it is all caused by cache colouring effects - the physical
> > addresses at which critical kernel and userspace text and data
> > happen to end up.
> ...
> > The teeny little microbenchmarks are telling us that the rmap overhead
> > hurts, that the uninlining of copy_*_user may have been a bad idea, that
> > the addition of AIO has cost a little and that the complexity which
> > yielded large improvements in readv(), writev() and SMP throughput were
> > not free. All of this is already known.
>
> I think if anything, you are stating the true value of the
> microbenchmarks. They are showing us how the kernel is getting
> more and more complex, causing basic operations to take longer
> and longer. That's bad. :-)

Yup. But these things are already known about.

> Last time I brought up an issue like this (a "nobody but weirdos use
> feature which is costing us cycles everywhere"), it got redone until
> it did cost nothing for people who don't use the feature. See the
> whole security layer fiasco for example.

There would be some small benefit in disabling the per-cpu-pages
pools on uniprocessor, and probably the deferred lru-addition queues.

That's fairly simple to do but I didn't do it because it would mean
that SMP and UP are running significantly different codepaths. Benching
this is on my todo list somewhere.
 
> I truly wish I could config out AIO for example, the overhead is just
> stupid. I know that if some thought is put into it, the cost could
> be consumed completely.

hm. Its cost in filesystem/VFS land is quite small. I assume you're
referring to networking here?

> People who don't see the true value of researching even minor jitters
> in lmbench results (and fixing the causes or backing out the guilty
> patch) aren't kernel developers in my opinion. :-)

But the statistically significant differences _are_ researched, and are
well understood.
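
For illustration only: one simple way to decide whether two sets of
benchmark runs differ by more than noise is to compare the difference
of their means against the combined standard error. The function name
differs_by_k_se, the threshold k and the idea of using this particular
check are assumptions made for this sketch. It is not how lmbench or
the other suites report their results.

```c
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double mean(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Unbiased sample variance of n samples around their mean m (n >= 2). */
static double sample_var(const double *x, size_t n, double m)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += (x[i] - m) * (x[i] - m);
    return acc / (double)(n - 1);
}

/*
 * Hypothetical helper: returns 1 if the means of a[] and b[] differ by
 * more than k standard errors (unequal-variance standard error),
 * 0 if they do not, and -1 if either set has fewer than two samples.
 * Squared quantities are compared so that no sqrt() is needed.
 */
int differs_by_k_se(const double *a, size_t na,
                    const double *b, size_t nb, double k)
{
    if (na < 2 || nb < 2)
        return -1;

    double ma = mean(a, na);
    double mb = mean(b, nb);
    double se2 = sample_var(a, na, ma) / (double)na +
                 sample_var(b, nb, mb) / (double)nb;
    double d = ma - mb;

    return d * d > k * k * se2;
}
```

With k around 2 or 3 this is only a crude filter. It says nothing
about why two kernels differ, merely whether the gap is larger than
the run-to-run scatter.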

We shouldn't lose sight of large optimisations which happen not to be
covered by these tests, e.g. SMP scalability.

To cite an extreme case, the readv/writev changes sped up O_SYNC and
O_DIRECT writev() by up to 300x and buffered writev() by 3x. But it cost
us a few percent on write(fd, buf, 1).

quad:/usr/src> grep -r writev lmbench
quad:/usr/src> grep -r writev aim9
quad:/usr/src> grep -r writev tiobench
quad:/usr/src> grep -r writev unixbench-4.1.0-971022
quad:/usr/src>
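
As a rough sketch of what a test covering that path could look like:
the helpers below time write(fd, buf, 1) and a multi-segment writev()
against the same target. The names bench_write1 and bench_writev, the
MAX_VEC and MAX_SEG limits and the choice of target are assumptions
made for illustration. None of this is code from lmbench, AIM9,
tiobench or unixbench.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <time.h>
#include <unistd.h>

enum { MAX_VEC = 16, MAX_SEG = 4096 };  /* arbitrary limits for the sketch */

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: average nanoseconds per write(fd, buf, 1) over
 * iters calls on path. Returns -1.0 on bad arguments or I/O failure.
 */
double bench_write1(const char *path, long iters)
{
    char byte = 'x';

    if (iters < 1)
        return -1.0;
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1.0;

    uint64_t start = now_ns();
    for (long i = 0; i < iters; i++) {
        if (write(fd, &byte, 1) != 1) {
            close(fd);
            return -1.0;
        }
    }
    uint64_t elapsed = now_ns() - start;

    close(fd);
    return (double)elapsed / (double)iters;
}

/*
 * Hypothetical helper: average nanoseconds per writev() of nvec segments
 * of seglen bytes each. Returns -1.0 on bad arguments, I/O failure or a
 * short write.
 */
double bench_writev(const char *path, int nvec, size_t seglen, long iters)
{
    static char buf[MAX_SEG];
    struct iovec iov[MAX_VEC];

    if (nvec < 1 || nvec > MAX_VEC || seglen < 1 || seglen > MAX_SEG ||
        iters < 1)
        return -1.0;

    memset(buf, 'x', sizeof buf);
    for (int i = 0; i < nvec; i++) {
        iov[i].iov_base = buf;      /* every segment reuses one buffer */
        iov[i].iov_len = seglen;
    }

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1.0;

    ssize_t want = (ssize_t)((size_t)nvec * seglen);
    uint64_t start = now_ns();
    for (long i = 0; i < iters; i++) {
        if (writev(fd, iov, nvec) != want) {
            close(fd);
            return -1.0;
        }
    }
    uint64_t elapsed = now_ns() - start;

    close(fd);
    return (double)elapsed / (double)iters;
}
```

Pointed at /dev/null, e.g. bench_writev("/dev/null", 8, 512, 100000),
this measures little more than syscall entry and iovec handling. The
O_SYNC and O_DIRECT cases would need a real file opened with those
flags, which this sketch deliberately leaves out.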

The big, big one here is the reverse map. I still don't believe that
its benefit has been shown to exceed its speed and space costs.


