RE: [Lse-tech] Re: [RFC] Dynamic percpu data allocator

From: BALBIR SINGH (balbir.singh@wipro.com)
Date: Fri May 31 2002 - 02:57:44 EST


|Actually I don't know for sure what plans are afoot to fix the
|slab allocator
|for per-cpu. One plan I heard about was allocating from per-cpu pools
|rather than per-cpu copies. My requirements are similar to
|the hot list skbs. I want to do this -
|
| int *ctrp1, *ctrp2;
|
| ctrp1 = kmalloc_percpu(sizeof(*ctrp1), GFP_ATOMIC);
| if (ctrp1 == NULL) {
| /* recover */
| }
| ctrp2 = kmalloc_percpu(sizeof(*ctrp2), GFP_ATOMIC);
| if (ctrp2 == NULL) {
| /* recover */
| }
|
| *per_cpu_ptr(ctrp1, smp_processor_id())++;
| this_cpu_ptr(ctrp2)++;
|
|Now I can allocate by making ctrp1/ctrp2 point to an array
|of NR_CPUS and kmalloc() memory for each CPU's copy of the
|int. This is simple and will work.
|
| void **ptrs = kmalloc(sizeof(*ptrs) * NR_CPUS, flags);
|
| if (!ptrs) return NULL;
| for (i = 0; i < NR_CPUS; i++) {
| ptrs[i] = kmalloc(size, flags);
| if (!ptrs[i])
| goto unwind_oom;
| }
|
|
|However I would like to use kmalloc_percpu() for allocating very
|small objects - typlically integer counters or small structures
|to be used as per-cpu counters for things like dst entries and
|dentries.
|kmalloc will waste the rest of the cache line for such small objects.
|The alternative is to use a layer of code to interleave small objects
|and save on space.
|
|
| CPU #0 CPU#1
|
| --------- --------- Start of cache line
| *ctrp1 *ctrp1
| *ctrp2 *ctrp2
|
| . .
| . .
| . .
| . .
| . .
|
| --------- ---------- End of cache line

Won't this result in a lot of false sharing, if any of the CPUs
tried to access any of the counters, the entire cache line would be
moved from the current CPU to that CPU. Isn't this a very bad thing or
am I missing something? Do all your counters fit into one cache line.

For sometime now, I have been thinking of implementing/supporting
PME's (Peformance Monitoring Events and Counters), so that we can
get real values (atleast on x86) as compared to our guesses about
cacheline bouncing, etc. Do you know if somebody is already doing
this?

Regards,
Balbir

|
|I have an allocator that interleaves objects like this if they
|can be fitted
|into size that is a factor of SMP_CACHE_BYTES.
|
|I hope someone can tell me that I don't even have to do this. Otherwise
|I will go ahead and do my thing.
|
|Thanks
|--
|Dipankar Sarma <dipankar@in.ibm.com> http://lse.sourceforge.net
|Linux Technology Center, IBM Software Lab, Bangalore, India.
|
|_______________________________________________________________
|
|Don't miss the 2002 Sprint PCS Application Developer's Conference
|August 25-28 in Las Vegas -- http://devcon.sprintpcs.com/adp/index.cfm
|
|_______________________________________________
|Lse-tech mailing list
|Lse-tech@lists.sourceforge.net
|https://lists.sourceforge.net/lists/listinfo/lse-tech
|



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri May 31 2002 - 22:00:30 EST