Re: Regression in reading /proc/stat in the newer kernels with large SMP and NUMA configurations

From: Eric Dumazet
Date: Fri Oct 14 2011 - 06:34:47 EST


On Friday 14 October 2011 at 03:08 -0700, David Rientjes wrote:

> The overhead is probably in kstat_irqs_cpu() which is called for each
> possible irq for each of the 32 possible cpus, and /proc/stat actually
> does the sum twice. You would see the same type of overhead with
> /proc/interrupts if it wasn't masked by the locking that it requires to
> safely read irq_desc. "dmesg | grep nr_irqs" will show how many percpu
> variables are being read for every cpu twice.
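To put a rough number on the cost described above, here is a minimal sketch of the arithmetic: one per-cpu counter read per irq, per possible cpu, per pass. The function name and the nr_irqs value are hypothetical; the real nr_irqs comes from "dmesg | grep nr_irqs", and this is a simple model, not the kernel code.

```python
def percpu_counter_reads(nr_irqs: int, possible_cpus: int, passes: int = 2) -> int:
    """Per-cpu counter reads for one read of /proc/stat under this simple model.

    passes=2 reflects the remark that /proc/stat does the sum twice.
    """
    return nr_irqs * possible_cpus * passes


# Hypothetical nr_irqs, chosen only for illustration.
example_nr_irqs = 1024

print("32 possible cpus:", percpu_counter_reads(example_nr_irqs, 32))
print("16 possible cpus:", percpu_counter_reads(example_nr_irqs, 16))
```

Under this model the count scales linearly with the number of possible cpus, so halving that number halves the reads.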

One annoying thing with most HP servers is that they report more possible
cpus than are actually present. So each time we have to loop over possible
cpus, we waste time and touch extra memory cache lines.
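One way to check for this mismatch on a running system is to compare the cpu lists the kernel exports in sysfs. The sketch below assumes the usual /sys/devices/system/cpu/possible and /sys/devices/system/cpu/online files, which hold lists such as "0-31"; the helper names are made up for illustration.

```python
from pathlib import Path
from typing import List, Optional


def parse_cpulist(text: str) -> List[int]:
    """Expand a kernel cpu list such as "0-3,8-11" into individual cpu numbers."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. an empty file
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def count_cpus(kind: str, root: str = "/sys/devices/system/cpu") -> Optional[int]:
    """Count the cpus listed in <root>/<kind>, or return None if it cannot be read."""
    try:
        return len(parse_cpulist((Path(root) / kind).read_text()))
    except OSError:
        return None


if __name__ == "__main__":
    print("possible:", count_cpus("possible"), "online:", count_cpus("online"))
```

If "possible" is larger than "online", every loop over possible cpus visits entries that have no running cpu behind them.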

For example, on this ProLiant BL460c G6 with two quad-core cpus
(2x4x2 = 16 logical cpus), 32 'possible' cpus are found instead of 16:

setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:32 nr_node_ids:2
PERCPU: Embedded 26 pages/cpu @ffff88007dc00000 s76288 r8192 d22016 u131072
pcpu-alloc: s76288 r8192 d22016 u131072 alloc=1*2097152
pcpu-alloc: [0] 00 02 04 06 08 10 12 14 17 19 21 23 25 27 29 31
pcpu-alloc: [1] 01 03 05 07 09 11 13 15 16 18 20 22 24 26 28 30
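The numbers in the log above can be cross-checked mechanically. The following is a small sketch that pulls nr_cpu_ids out of the setup_percpu line and counts the cpus listed in each pcpu-alloc group. The function names are hypothetical, and the regular expressions only assume the line format shown in this excerpt.

```python
import re
from typing import Dict, List

# Lines copied from the boot log quoted above.
BOOT_LOG = """\
setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:32 nr_node_ids:2
pcpu-alloc: s76288 r8192 d22016 u131072 alloc=1*2097152
pcpu-alloc: [0] 00 02 04 06 08 10 12 14 17 19 21 23 25 27 29 31
pcpu-alloc: [1] 01 03 05 07 09 11 13 15 16 18 20 22 24 26 28 30
"""


def parse_setup_percpu(log: str) -> Dict[str, int]:
    """Return the key:value pairs of the setup_percpu line as integers."""
    m = re.search(r"setup_percpu:(.*)", log)
    if m is None:
        return {}
    return {key: int(val) for key, val in re.findall(r"(\w+):(\d+)", m.group(1))}


def parse_pcpu_groups(log: str) -> Dict[int, List[int]]:
    """Map each pcpu-alloc group number to the cpu numbers listed for it."""
    groups: Dict[int, List[int]] = {}
    pattern = r"^pcpu-alloc: \[(\d+)\]((?: \d+)+)[ \t]*$"
    for m in re.finditer(pattern, log, re.MULTILINE):
        cpus = [int(c) for c in m.group(2).split()]
        groups.setdefault(int(m.group(1)), []).extend(cpus)
    return groups


info = parse_setup_percpu(BOOT_LOG)
groups = parse_pcpu_groups(BOOT_LOG)
total = sum(len(cpus) for cpus in groups.values())
print("nr_cpu_ids:", info.get("nr_cpu_ids"), "cpus in pcpu-alloc groups:", total)
```

Both counts come out at 32 for this log, i.e. per-cpu areas are set up for twice the 16 logical cpus the machine really has.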


