RE: [PATCH] selftests/resctrl: Skip MBM&CMT tests when Intel Sub-NUMA

From: tan.shaopeng@xxxxxxxxxxx
Date: Mon Nov 15 2021 - 02:19:39 EST


Hi Reinette,

> On 11/10/2021 12:27 AM, Shaopeng Tan wrote:
> > From: "Tan, Shaopeng" <tan.shaopeng@xxxxxxxxxxxxxx>
> >
> > When the Intel Sub-NUMA Clustering(SNC) feature is enabled,
> > the CMT and MBM counters may not be accurate.
> > In this case, skip MBM&CMT tests.
> >
> > Signed-off-by: Shaopeng Tan <tan.shaopeng@xxxxxxxxxxxxxx>
> > ---
> > Hello,
> >
> > According to the Intel RDT reference Manual,
> > when the sub-numa clustering feature is enabled, the CMT and MBM
> > counters may not be accurate.
> > When running CMT tests and MBM tests on Intel processor, the result is
> > "not ok".
> > So, fix it to skip the CMT & MBM test When the Intel Sub-NUMA
> > Clustering(SNC) feature is enabled.
> >
>
> It is not clear to me which exact document you refer to but I did find a
> RDT reference manual at the link below that describes the problem you
> mention:
> https://www.intel.com/content/dam/develop/external/us/en/documents/180115-intel-rdtcascadelake-serverreferencemanual-806717.pdf

Yes, I referred to this manual.

> What is not mentioned in your description is that this is a hardware
> errata so the test is expected to fail on these systems and I find that
> disabling the test for all systems based on this hardware errata is too
> drastic.

Understood. It is not reasonable to disable the test for all systems
based on this hardware erratum.
When I ran resctrl_tests on an Intel(R) Xeon(R) Gold 6254 CPU,
the CMT & MBM results were "not ok", and it took me some time to debug.
So that other people can run the tests smoothly, I'd like to update the
patch to disable the tests only on 2nd Generation Intel Xeon Scalable processors.
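
As a rough sketch of the kind of check this could use (not the actual
patch), the test could read the CPU family, model and stepping from
/proc/cpuinfo and skip only on matching parts. The helper names below are
hypothetical, and the family/model/stepping values for 2nd Generation
Xeon Scalable (Cascade Lake) are assumptions that would need to be
confirmed against Intel documentation before use.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * ASSUMPTION (not stated in this thread): Cascade Lake-SP reports
 * family 6, model 85 and stepping 5..7 in /proc/cpuinfo.
 */
#define ASSUMED_CLX_FAMILY		6
#define ASSUMED_CLX_MODEL		85
#define ASSUMED_CLX_STEPPING_MIN	5
#define ASSUMED_CLX_STEPPING_MAX	7

/*
 * Hypothetical helper: return the integer value of the first
 * "key : value" line in cpuinfo-formatted text, or -1 if absent.
 * The key must be followed only by blanks and a colon, so "model"
 * does not match the "model name" line.
 */
int cpuinfo_field(const char *text, const char *key)
{
	size_t klen = strlen(key);
	const char *line = text;

	while (line && *line) {
		if (!strncmp(line, key, klen)) {
			const char *p = line + klen;

			while (*p == ' ' || *p == '\t')
				p++;
			if (*p == ':')
				return atoi(p + 1);
		}
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Return 1 if the cpuinfo text matches the assumed Cascade Lake IDs. */
int text_is_assumed_cascade_lake(const char *text)
{
	int stepping = cpuinfo_field(text, "stepping");

	return strstr(text, "GenuineIntel") != NULL &&
	       cpuinfo_field(text, "cpu family") == ASSUMED_CLX_FAMILY &&
	       cpuinfo_field(text, "model") == ASSUMED_CLX_MODEL &&
	       stepping >= ASSUMED_CLX_STEPPING_MIN &&
	       stepping <= ASSUMED_CLX_STEPPING_MAX;
}

/* Check the running system using the first CPU entry in /proc/cpuinfo. */
int cpu_is_assumed_cascade_lake(void)
{
	char buf[4096];
	size_t n;
	FILE *fp = fopen("/proc/cpuinfo", "r");

	if (!fp)
		return 0;
	n = fread(buf, 1, sizeof(buf) - 1, fp);
	fclose(fp);
	buf[n] = '\0';
	return text_is_assumed_cascade_lake(buf);
}
```

The CMT and MBM tests could then call cpu_is_assumed_cascade_lake() and
report a skip instead of a failure when it returns 1. Note that this
only identifies the processor; it does not detect whether SNC is
actually enabled.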

Regards,
Shaopeng Tan