Re: [patch V3 27/40] x86/cpu: Provide a sane leaf 0xb/0x1f parser

From: Zhang, Rui
Date: Mon Aug 14 2023 - 04:27:19 EST


On Sun, 2023-08-13 at 17:04 +0200, Thomas Gleixner wrote:
> On Sat, Aug 12 2023 at 22:04, Thomas Gleixner wrote:
> > On Sat, Aug 12 2023 at 08:21, Rui Zhang wrote:
> > > With this, we can guarantee that all the available topology
> > > information
> > > are always valid, even when running on future platforms.
> >
> > I know that it can be made work, but is it worth the extra effort?
> > I
> > don't think so.
>
> So I thought more about it. For intermediate levels, i.e. something
> which is squeezed between two existing levels, this works by some
> definition of works.

This "some definition of works" includes parsing the unknown levels,
right?

>
> I.e. the example where we have UBER_TILE between TILE and DIE, then
> we'd
> set and propagate the UBER_TILE entry into the DIE slot and then
> overwrite it again, if there is a DIE entry too.

Well, not really.

If we have TILE/UBER_TILE/DIE in CPUID but only support TILE/DIE in
the kernel, the UBER_TILE information is overwritten.

But, UBER_TILE tells us the starting bit in APIC ID for die_id.

Say,

level   type        eax.shifts
0       SMT         1
1       CORE        5
2       TILE        7
3       UBER_TILE   8
4       DIE         9

This is a 1-package system with 2 dies, each die has 2 uber_tiles and
each uber_tile has 2 tiles.

If we don't support uber_tile, what we want to see is a platform with 2
dies and each die has 4 tiles.

But topo_shift_apicid() uses x86_topo_system.dom_shifts[TILE], so what
we see is a platform with 4 dies, and each die has 2 tiles. And this is
broken.
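
To make the die count concrete, here is a minimal standalone sketch
(not kernel code). die_id_at() and die_count() are hypothetical helpers
I made up for illustration; the shift values are the ones from the
table above, and the package id is assumed to start at DIE.eax.shifts.

```c
/* eax.shifts values taken from the example table above */
#define SHIFT_TILE       7u   /* what dom_shifts[TILE] would hold */
#define SHIFT_UBER_TILE  8u   /* level the kernel does not know about */
#define SHIFT_DIE        9u   /* assumed start of the package id */

/*
 * Hypothetical helper: die id of @apicid, assuming the die id field
 * starts at bit @die_offset and ends just below bit @pkg_offset.
 */
static unsigned int die_id_at(unsigned int apicid, unsigned int die_offset,
			      unsigned int pkg_offset)
{
	unsigned int width = pkg_offset - die_offset;

	return (apicid >> die_offset) & ((1u << width) - 1);
}

/*
 * Hypothetical helper: number of distinct die ids seen when walking
 * every APIC ID of a single package.
 */
static unsigned int die_count(unsigned int die_offset, unsigned int pkg_offset)
{
	unsigned int apicid, seen = 0, count = 0;

	for (apicid = 0; apicid < (1u << pkg_offset); apicid++) {
		unsigned int bit = 1u << die_id_at(apicid, die_offset, pkg_offset);

		if (!(seen & bit)) {
			seen |= bit;
			count++;
		}
	}
	return count;
}
```

die_count(SHIFT_TILE, SHIFT_DIE) yields 4, which is the broken view,
while die_count(SHIFT_UBER_TILE, SHIFT_DIE) yields the 2 dies we
actually want to see.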

IMO, what we really need for each domain in x86_topo_system is dom_size
and dom_offset (the bit offset of its id in the APIC ID). When parsing
domain A, we can then propagate its eax.shifts to the dom_offset of its
upper level domains.

With this, we set dom_offset[DIE] to 7 first when parsing TILE, then
overwrite it with 8 when parsing UBER_TILE, and set
dom_offset[PACKAGE] to 9 when parsing DIE.

Losing TILE.eax.shifts is okay, because it is only needed for the
UBER_TILE id.
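
A rough sketch of that propagation, under my own assumptions: the enum
is a simplified stand-in for x86_topology_domains (no MODULE or
DIEGRP), and struct leaf_level, DOM_UNKNOWN and propagate_offsets() are
hypothetical names, not anything in the patch set.

```c
#include <stddef.h>

/* Simplified, hypothetical domain list; the real enum has more entries */
enum topo_dom { DOM_SMT, DOM_CORE, DOM_TILE, DOM_DIE, DOM_PKG, DOM_MAX };

#define DOM_UNKNOWN	(-1)	/* CPUID level type the kernel cannot map */

struct leaf_level {
	int		dom;	/* enum topo_dom value or DOM_UNKNOWN */
	unsigned int	shift;	/* eax.shifts of this CPUID level */
};

/* The example table: SMT, CORE, TILE, UBER_TILE (unknown), DIE */
static const struct leaf_level example_levels[] = {
	{ DOM_SMT,     1 },
	{ DOM_CORE,    5 },
	{ DOM_TILE,    7 },
	{ DOM_UNKNOWN, 8 },
	{ DOM_DIE,     9 },
};

/*
 * Walk the CPUID levels bottom up. Each level's shift becomes the id
 * offset of every domain above the last known domain, so an unknown
 * level simply overwrites what the known level below it propagated.
 */
static void propagate_offsets(const struct leaf_level *lv, size_t n,
			      unsigned int dom_offset[DOM_MAX])
{
	int last_known = DOM_UNKNOWN;
	size_t i;
	int d;

	for (d = 0; d < DOM_MAX; d++)
		dom_offset[d] = 0;

	for (i = 0; i < n; i++) {
		if (lv[i].dom != DOM_UNKNOWN)
			last_known = lv[i].dom;

		/* Unknown level below the first known one: nothing to anchor to */
		if (last_known == DOM_UNKNOWN)
			continue;

		for (d = last_known + 1; d < DOM_MAX; d++)
			dom_offset[d] = lv[i].shift;
	}
}
```

Running this over example_levels leaves dom_offset[DOM_DIE] at 8 and
dom_offset[DOM_PKG] at 9, with the intermediate value 7 for DIE being
overwritten along the way.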

>
> Where it becomes interesting is when the unknown level is past
> DIEGRP,
> e.g. DIEGRP_CONGLOMORATE then we'd need to overwrite the DIEGRP
> level,
> right?
>
> It can be done, but I don't know whether it buys us much for the
> purely
> theoretical case of new levels added.
>
>
Similar to the previous case, the DIEGRP_CONGLOMORATE eax.shifts can be
propagated to dom_offset[PACKAGE].

But there is still one case that we cannot handle (and it is the reason
I'm proposing optional die support in Linux).

Say we have a new level FOO, and the CPUID is like this:

level   type        eax.shifts
0       SMT         1
1       CORE        5
2       FOO         8

This can be a system with
1. 1 die and 8 FOOs if DIE is the upper level of FOO
or
2. 8 FOOs with 1 die in each FOO if DIE is the lower level of FOO

Currently, die topology information is mandatory in Linux, so we cannot
get this right without patching enum topo_types/enum
x86_topology_domains/topo_domain_map (which in fact describes the
relationship between DIE and FOO).

thanks,
rui