Re: hackbench regression due to commit 9dfc6e68bfe6e

From: Christoph Lameter
Date: Thu Mar 25 2010 - 10:49:52 EST


On Thu, 25 Mar 2010, Alex Shi wrote:

> SLUB: Use this_cpu operations in slub
>
> The hackbench is prepared hundreds pair of processes/threads. And each
> of pair of processes consists of a receiver and a sender. After all
> pairs created and ready with a few memory block (by malloc), hackbench
> let the sender do appointed times sending to receiver via socket, then
> wait all pairs finished. The total sending running time is the indicator
> of this benchmark. The less the better.

> The socket send/receiver generate lots of slub alloc/free. slabinfo
> command show the following slub get huge increase from about 81412344 to
> 141412497, after command "backbench 150 thread 1000" running.

The number of frees is different? From 81 mio to 141 mio? Are you sure it
was the same load?

> Name Objects Alloc Free %Fast Fallb O
> :t-0001024 870 141412497 141412132 94 1 0 3
> :t-0000256 1607 141225312 141224177 94 1 0 1
>
>
> Via perf tool I collected the L1 data cache miss info of comamnd:
> "./hackbench 150 thread 100"
>
> On 33-rc1, about 1303976612 time L1 Dcache missing
>
> On 9dfc6, about 1360574760 times L1 Dcache missing

I hope this is the same load?

What debugging options did you use? We are now using per cpu operations in
the hot paths. Enabling debugging for per cpu ops could decrease your
performance now. Have a look at a dissassembly of kfree() to verify that
there is no instrumentation.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/