Re: [RFC] AutoNUMA alpha6

From: Andrea Arcangeli
Date: Thu Mar 22 2012 - 14:50:18 EST


Hi Dan,

On Thu, Mar 22, 2012 at 03:27:35PM +0100, Andrea Arcangeli wrote:
> current code would optimally perform, if all nodes are busy and there
> aren't idle cores (or only idle siblings). I guess I'll leave the HT
> optimizations for later. I probably shall measure this again with HT off.

I added the latest virt measurements with KVM for kernel build and
memhog. I also measured how much I'd save by increasing the
knuma_scand pass interval (scan_sleep_pass_millisecs) from the 10sec
default (5000 value) to 30sec. I also tried 1min but it was within
the error range of 30sec. 10sec -> 30sec is also almost within the
error range, showing the cost is really tiny. Luckily the numbers were
totally stable when running a -j16 loop on both VMs (each VM had 12
vcpus on a host with 24 CPUs), and the error was less than 1sec for
each kernel build (on tmpfs obviously, with a totally stripped down
userland in both guest and host).
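
To make "within error range" concrete, here is a minimal sketch of the
comparison. The timings are made up for illustration and are not taken
from the slides; within_error() is a hypothetical helper, and its 1sec
threshold just mirrors the per-build error mentioned above.

```python
from statistics import mean

def within_error(a, b, error_sec=1.0):
    """Return True if the mean build times differ by at most error_sec."""
    return abs(mean(a) - mean(b)) <= error_sec

# Hypothetical kernel build times in seconds (assumed values, not measured).
builds_default = [100.2, 100.6, 100.4]   # default scan pass setting
builds_30sec = [100.0, 99.8, 100.1]      # 30sec scan pass setting

print(within_error(builds_default, builds_30sec))
```

If the two settings differ by less than the run-to-run error, the
scanning cost can't be told apart from noise with this benchmark.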

http://www.kernel.org/pub/linux/kernel/people/andrea/autonuma/autonuma_bench-20120322.pdf

See slides 11 and 12.

This is with THP on. With THP off things would likely be different,
but hey, THP off is like 20% slower or more on a kernel build in guest
in the first place.

I'm satisfied with the benchmark results so far and more will come
soon, but now it's time to go back to coding and add THP native
migration. That will benefit everyone, from cpuset in userland to
numa/sched.