RE: [PATCH 1/2] x86/CPU/AMD: Present package as die instead of socket

From: Duran, Leo
Date: Tue Jun 27 2017 - 11:49:21 EST


Hi Thomas, et al,

Just a quick comment below.
Leo.


> -----Original Message-----
> From: Thomas Gleixner [mailto:tglx@xxxxxxxxxxxxx]
> Sent: Tuesday, June 27, 2017 9:21 AM
> To: Suthikulpanit, Suravee <Suravee.Suthikulpanit@xxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>; x86@xxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; Duran, Leo <leo.duran@xxxxxxx>; Ghannam,
> Yazen <Yazen.Ghannam@xxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Subject: Re: [PATCH 1/2] x86/CPU/AMD: Present package as die instead of
> socket
>
> On Tue, 27 Jun 2017, Suravee Suthikulpanit wrote:
> > On 6/27/17 17:48, Borislav Petkov wrote:
> > > On Tue, Jun 27, 2017 at 01:40:52AM -0500, Suravee Suthikulpanit wrote:
> > > > However, this is not the case on AMD family17h multi-die processor
> > > > platforms, which can have up to 4 dies per socket as shown in the
> > > > following system topology.
> > >
> > > So what exactly does that mean? A die is a package on ZN and you can
> > > have up to 4 packages on a physical socket?
> >
> > Yes. 4 packages (or 4 dies, or 4 NUMA nodes) in a socket.
>
> And why is this relevant at all?
>
> The kernel does not care about sockets. Sockets are electromechanical
> components and completely irrelevant.
>
> The kernel cares about :
>
> Threads - Single scheduling unit
>
> Cores - Contains one or more threads
>
> Packages - Contains one or more cores. The cores share L3.
>
> NUMA Node - Contains one or more Packages which share a memory
> controller.
>
> I'm not aware of x86 systems which have several Packages
> sharing a memory controller, so Package == NUMA Node
> (but I might be wrong here).
>
> Platform - Contains one or more Numa Nodes
[Duran, Leo]
That is my understanding of the intent as well. However, regarding the L3:

The sentence 'The cores share L3.' under 'Packages' may give the impression that all cores in a package share an L3.
In our case, we define a Package as a group of cores sharing a memory controller, a 'Die' in hardware terms.
Also, it turns out that within a Package we may have separate groups of cores, each with its own L3 (in hardware terms we refer to those as a 'Core Complex').

Basically, in our case a Package may contain more than one L3 (i.e., in hardware terms, there may be more than one 'Core Complex' in a 'Die').
The important point is that all logical processors (threads) that share an L3 have a common "cpu_llc_id".
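For illustration, here is a toy sketch of that mapping. The counts are assumptions for an idealized family 17h layout (2 threads per core, 4 cores per Core Complex, 2 CCXs per die, 4 dies per socket) with contiguous CPU numbering; on real hardware these come from CPUID, not hard-coded constants.

```python
# Toy model of the topology discussed above -- illustrative only.
# Assumed (not read from hardware): 2 threads/core, 4 cores per
# core complex (CCX), 2 CCXs per die, 4 dies per socket, and CPUs
# numbered contiguously with sibling threads adjacent.

THREADS_PER_CORE = 2
CORES_PER_CCX = 4
CCX_PER_DIE = 2
DIES_PER_SOCKET = 4

def llc_id(cpu):
    """All threads sharing one L3 (one CCX) get the same cpu_llc_id."""
    return cpu // (THREADS_PER_CORE * CORES_PER_CCX)

def die_id(cpu):
    """The 'Package' in the scheduler sense: one die / memory controller."""
    return llc_id(cpu) // CCX_PER_DIE

if __name__ == "__main__":
    # Threads 0-7 share CCX 0; thread 8 starts CCX 1 on the same die.
    print(llc_id(0), llc_id(7), llc_id(8))   # -> 0 0 1
    print(die_id(7), die_id(8), die_id(16))  # -> 0 0 1
```

Note how two distinct cpu_llc_id values map onto each die_id, which is exactly the "more than one L3 per Package" point above.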

>
> All the kernel is interested in is the above and the NUMA Node distance so it
> knows about memory access latencies. No sockets, no MCMs, that's all
> completely useless for the scheduler.
>
> So if the current CPUID stuff gives you the same physical package ID for all
> packages in a MCM, then this needs to be fixed at the CPUID/ACPI/BIOS
> level and not hacked around in the kernel.
>
> The only reason why a MCM might need its own ID is when it contains
> infrastructure which is shared between the packages, but again that's
> irrelevant for the scheduler. That'd be only relevant to implement a driver for
> that shared infrastructure.
>
> Thanks,
>
> tglx