On Fri, May 31, 2002 at 01:27:44PM +0530, BALBIR SINGH wrote:
> |
> |      CPU #0          CPU #1
> |
> |    ---------       ---------     Start of cache line
> |     *ctrp1          *ctrp1
> |     *ctrp2          *ctrp2
> |
> |        .               .
> |        .               .
> |        .               .
> |        .               .
> |        .               .
> |
> |    ---------       ---------     End of cache line
>
>
> Won't this result in a lot of false sharing? If any of the CPUs
> tried to access any of the counters, the entire cache line would be
> moved from the current CPU to that CPU. Isn't this a very bad thing, or
> am I missing something? Do all your counters fit into one cache line?
Yes, it could result in false sharing. You could probably avoid
that by imposing classes of allocation, say STRICTLY_LOCAL and
ALMOST_LOCAL, so that strictly local objects are not penalized
by occasionally non-local objects. If your code frequently accesses
other CPUs' copies of the object, then you should not be using this
per-cpu allocator in the first place; it would be meaningless.
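
To make the layout concrete, here is a rough user-space sketch of
the idea, not the allocator patch itself. Every name in it is
hypothetical, and the 64-byte line size and 4-CPU count are
assumptions. Each CPU gets one cache line per allocation class,
packed with its copies of several counters, so a remote access to
an almost-local object cannot bounce the line that holds the
strictly local ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* assumed; the real size is CPU-specific */
#define NR_CPUS_SKETCH  4   /* hypothetical CPU count for the sketch */
#define SLOTS_PER_LINE  (CACHE_LINE_SIZE / sizeof(long))

/* Hypothetical allocation classes, one pool of lines per class. */
enum alloc_class { STRICTLY_LOCAL, ALMOST_LOCAL, NR_CLASSES };

/* One cache line holding a single CPU's copies of many counters. */
struct percpu_line {
	_Alignas(CACHE_LINE_SIZE) long slot[SLOTS_PER_LINE];
};

static_assert(sizeof(struct percpu_line) == CACHE_LINE_SIZE,
	      "each per-CPU block must be exactly one cache line");

/* Zero-initialized: percpu_area[class][cpu] is that CPU's line. */
static struct percpu_line percpu_area[NR_CLASSES][NR_CPUS_SKETCH];

/* Address of CPU `cpu`'s copy of the counter in slot `idx`. */
static long *percpu_ctr(enum alloc_class cls, unsigned int cpu,
			unsigned int idx)
{
	return &percpu_area[cls][cpu].slot[idx];
}

/* Index of the cache line that an address falls in. */
static uintptr_t line_of(const void *p)
{
	return (uintptr_t)p / CACHE_LINE_SIZE;
}

/* Sum one counter over all CPUs: the rare cross-CPU read. */
static long percpu_ctr_sum(enum alloc_class cls, unsigned int idx)
{
	long sum = 0;

	for (unsigned int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += *percpu_ctr(cls, cpu, idx);
	return sum;
}
```

With this layout, percpu_ctr(STRICTLY_LOCAL, 0, 0) and
percpu_ctr(STRICTLY_LOCAL, 0, 1) share a line, which is harmless
as long as only CPU 0 touches them. The same slot on CPU 1, or in
the ALMOST_LOCAL pool, lives in a different line.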
>
> For some time now, I have been thinking of implementing/supporting
> PMEs (Performance Monitoring Events and Counters), so that we can
> get real values (at least on x86) as compared to our guesses about
> cacheline bouncing, etc. Do you know if somebody is already doing
> this?
You can use SGI kernprof to measure PMCs. See the SGI OSS
website for details. You can count the L2_LINES_IN event to
get a measure of cache line bouncing.
Thanks
--
Dipankar Sarma <dipankar@in.ibm.com> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.