Re: Scheduler regression from caffcdd8d27ba78730d5540396ce72ad022aff2c
From: Dietmar Eggemann
Date: Thu Jul 17 2014 - 04:58:13 EST
On 17/07/14 05:09, Bruno Wolff III wrote:
> On Thu, Jul 17, 2014 at 01:18:36 +0200,
>  Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>> So the output of
>>
>> $ cat /proc/sys/kernel/sched_domain/cpu*/domain*/*
>>
>> would be handy too.
Thanks, this was helpful.
I see from the sched domain layout that you have SMT (domain0) and DIE
(domain1) level. So on this system, the MC level gets degenerated
(sd_degenerate() in kernel/sched/core.c).
So far I fail to see how this can have an effect on the memory of the
sched groups, but I can try to fake this situation on one of my platforms.
There is also the possibility that the memory for sched_group sg is not
(completely) zeroed out:
	sg = kzalloc_node(sizeof(struct sched_group) + cpumask_size(),
			  GFP_KERNEL, cpu_to_node(j));
struct sched_group {
	...
	/*
	 * NOTE: this field is variable length. (Allocated dynamically
	 * by attaching extra space to the end of the structure,
	 * depending on how many CPUs the kernel has booted up with)
	 */
	unsigned long cpumask[0];
};
so that the cpumask of a sched group is not all zeros; on this kind of
machine this can only be cured by an explicit
cpumask_clear(sched_group_cpus(sg)) in build_sched_groups().
> Attached and added to the bug.
>
>> Just to make sure, you do have 'CONFIG_X86_32=y' and '# CONFIG_NUMA is
>> not set' in your build?
>
> Yes.
>
> I probably won't be able to get /proc/schedstat on my next test since the
> system will probably crash right away. However, I probably will have a
> much faster rebuild and might still be able to get the info for you
> before I leave tomorrow.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/