Re: [RFC] AutoNUMA alpha6

From: Andrea Arcangeli
Date: Wed Mar 21 2012 - 08:13:44 EST


On Wed, Mar 21, 2012 at 08:12:58AM +0100, Ingo Molnar wrote:
>
> * Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote:
>
> > [...]
> >
> > So give me a break... you must have made a real mess in your
> > benchmarking. numasched is always doing worse than upstream
> > here, in fact two times massively worse. Almost as bad as the
> > inverse binds.
>
> Andrea, please stop attacking the messenger.

I am simply informing him. Why should I not inform him that the way he
performed the benchmark wasn't the best way?

I informed him because it wasn't entirely documented how to properly
run my benchmark set. I would have expected people to read the pdf I
posted 2 months ago, which already explains it:

http://www.kernel.org/pub/linux/kernel/people/andrea/autonuma/
http://www.kernel.org/pub/linux/kernel/people/andrea/autonuma/autonuma_bench-20120126.pdf

Jump to page 7.

Two modes:

numa01 -DNO_BIND_FORCE_SAME_NODE
numa01 -DTHREAD_ALLOC

I recommend that Dan now, as a last thing, repeat the numasched
benchmark with numa01 built with -DNO_BIND_FORCE_SAME_NODE.

For me neither -DNO_BIND_FORCE_SAME_NODE nor -DTHREAD_ALLOC nor numa02
performs well; in fact numa01 tends to hang and the runs never end.

> We wanted and needed more testing, and I'm glad that we got it.

Yes, I also posted the specjbb results and I did a kernel build as a
measurement of the worst-case overhead of the numa hinting page fault.

You can see it here:

http://www.kernel.org/pub/linux/kernel/people/andrea/autonuma/autonuma_bench-20120321.pdf

> Can we please figure out all the details *without* accusing
> anyone of having made a mess? It is quite possible as well that
> *you* made a mess of it somewhere, either at the conceptual
> stage or at the implementational stage, right?

I didn't make a mess. I also repeated it without lockdep: still the
same thing, in fact now it never ends. I'll have to reboot a few more
times to see if I can get at least some numbers out.

Maybe it takes -DNO_BIND_FORCE_SAME_NODE to show the brokenness. I'll
wait for Dan to repeat the numasched test with either
-DNO_BIND_FORCE_SAME_NODE or -DTHREAD_ALLOC.

Or maybe the larger amount of RAM (24G vs my 16G) could have played a role.