Re: [tip: sched/core] sched: Add cluster scheduler level for x86

From: Tom Lendacky
Date: Wed Oct 20 2021 - 16:40:26 EST


On 10/20/21 3:36 PM, Peter Zijlstra wrote:
On Wed, Oct 20, 2021 at 10:25:42PM +0200, Peter Zijlstra wrote:
On Wed, Oct 20, 2021 at 03:08:41PM -0500, Tom Lendacky wrote:
On 10/20/21 2:51 PM, Peter Zijlstra wrote:
On Wed, Oct 20, 2021 at 08:12:51AM -0500, Tom Lendacky wrote:
On 10/15/21 4:44 AM, tip-bot2 for Tim Chen wrote:
The following commit has been merged into the sched/core branch of tip:


If it does boot, what does something like:

for i in /sys/devices/system/cpu/cpu*/topology/*{_id,_list}; do echo -n "${i}: " ; cat $i; done

produce?

The output is about 160K in size, I'll email it to you off-list.

/sys/devices/system/cpu/cpu0/topology/cluster_cpus_list: 0
/sys/devices/system/cpu/cpu0/topology/core_cpus_list: 0,128

/sys/devices/system/cpu/cpu128/topology/cluster_cpus_list: 128
/sys/devices/system/cpu/cpu128/topology/core_cpus_list: 0,128

So for some reason that thing thinks each SMT thread has its own L2,
which seems rather unlikely. Or SMT has started to mean something
radically different than it used to be :-)

Let me continue trying to make sense of cacheinfo.c

OK, I think I see what's happening.

AFAICT cacheinfo.c does *NOT* set l2c_id on AMD/Hygon hardware, this
means it's set to BAD_APICID.

This then results in match_l2c() never matching. And as a direct
consequence set_cpu_sibling_map() will generate cpu_l2c_shared_mask with
just the one CPU set.
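
To make the failure mode concrete, here is a minimal userspace model of the behaviour described above. It is not the kernel code: struct model_cpu, MODEL_BAD_APICID and the model_* helpers are hypothetical names for illustration, and the sentinel value is an assumption. It only shows that when every CPU carries the "bad" id, no pair matches and each shared mask contains a single CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sentinel for this model; stands in for the kernel's BAD_APICID. */
#define MODEL_BAD_APICID 0xFFFFu

/* Hypothetical per-CPU record, not the kernel's struct cpuinfo_x86. */
struct model_cpu {
    uint16_t l2c_id;
};

/* Two CPUs share an L2 only if the id is valid and identical. */
static bool model_match_l2c(const struct model_cpu *c,
                            const struct model_cpu *o)
{
    if (c->l2c_id == MODEL_BAD_APICID)
        return false;               /* unset id: never matches */
    return c->l2c_id == o->l2c_id;
}

/*
 * Build the L2-sharing mask for CPU 'self' as a bitmask (up to 64 CPUs).
 * A CPU is always a member of its own mask.
 */
static uint64_t model_l2c_shared_mask(const struct model_cpu *cpus,
                                      int ncpus, int self)
{
    uint64_t mask = UINT64_C(1) << self;

    for (int i = 0; i < ncpus; i++) {
        if (i != self && model_match_l2c(&cpus[self], &cpus[i]))
            mask |= UINT64_C(1) << i;
    }
    return mask;
}
```

With two SMT siblings whose l2c_id was never set, model_l2c_shared_mask() returns a one-bit mask for each, which mirrors the cluster_cpus_list output quoted earlier.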

And we have the above result and things come unstuck if we assume:
SMT <= L2 <= LLC
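
The nesting assumption can be written as a pair of subset checks on CPU masks. The sketch below is only an illustration in userspace C; mask_subset() and topology_nested() are hypothetical helpers, and the masks are plain 64-bit bitmaps rather than kernel cpumasks.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if every CPU set in 'inner' is also set in 'outer'. */
static bool mask_subset(uint64_t inner, uint64_t outer)
{
    return (inner & ~outer) == 0;
}

/* Checks the assumed nesting SMT <= L2 <= LLC for one CPU's masks. */
static bool topology_nested(uint64_t smt, uint64_t l2, uint64_t llc)
{
    return mask_subset(smt, l2) && mask_subset(l2, llc);
}
```

Using bits 0 and 1 as stand-ins for the two SMT siblings, an SMT mask of 0x3 with an L2 mask of 0x1 fails the check, which is the situation the sysfs output above describes.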

Now, the big question, how to fix this... Does AMD have means of
actually setting l2c_id or should we fall back to using match_smt() for
l2c_id == BAD_APICID ?
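
For discussion, the second option (falling back to the SMT match when l2c_id is unset) might look roughly like the model below. This is a sketch under stated assumptions, not a proposed patch: struct fb_cpu, FB_BAD_APICID and the fb_* helpers are hypothetical, and the real match_smt() compares more than a single core id.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sentinel for this model; stands in for the kernel's BAD_APICID. */
#define FB_BAD_APICID 0xFFFFu

/* Hypothetical per-CPU record for this illustration only. */
struct fb_cpu {
    uint16_t l2c_id;
    uint16_t core_id;
};

/* Simplified stand-in for an SMT match: same core means siblings. */
static bool fb_match_smt(const struct fb_cpu *c, const struct fb_cpu *o)
{
    return c->core_id == o->core_id;
}

/* L2 match that falls back to the SMT match when the L2 id is unset. */
static bool fb_match_l2c(const struct fb_cpu *c, const struct fb_cpu *o)
{
    if (c->l2c_id == FB_BAD_APICID)
        return fb_match_smt(c, o);  /* fallback: at least cover SMT siblings */
    return c->l2c_id == o->l2c_id;
}
```

With this fallback the L2 mask is never smaller than the SMT mask, so the lower half of the nesting assumption holds even when no L2 id is available.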

Let me include Suravee, who has more experience with our topology and cache information.

Thanks,
Tom