Re: [PATCH v15-RFC 6/8] x86/resctrl: Introduce snc_nodes_per_l3_cache

From: Tony Luck
Date: Fri Feb 09 2024 - 14:35:24 EST


On Fri, Feb 09, 2024 at 09:29:16AM -0600, Moger, Babu wrote:
> >
> > +extern unsigned int snc_nodes_per_l3_cache;
>
> I feel this can be part of rdt_resource instead of global.

Mixed emotions about that. It would be another field that appears
in every instance of rdt_resource, but is only used by the
RDT_RESOURCE_L3_MON copy.

>
> > +
> > enum resctrl_res_level {
> > RDT_RESOURCE_L3_MON,
> > RDT_RESOURCE_L3,
> > diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> > index b741cbf61843..dc886d2c9a33 100644
> > --- a/arch/x86/kernel/cpu/resctrl/core.c
> > +++ b/arch/x86/kernel/cpu/resctrl/core.c
> > @@ -48,6 +48,12 @@ int max_name_width, max_data_width;
> > */
> > bool rdt_alloc_capable;
> >
> > +/*
> > + * Number of SNC nodes that share each L3 cache. Default is 1 for
> > + * systems that do not support SNC, or have SNC disabled.
> > + */
> > +unsigned int snc_nodes_per_l3_cache = 1;
> > +
> > static void
> > mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m,
> > struct rdt_resource *r);
> > diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> > index 080cad0d7288..357919bbadbe 100644
> > --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> > +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> > @@ -148,8 +148,18 @@ static inline struct rmid_entry *__rmid_entry(u32 rmid)
> >
> > static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
> > {
> > + struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>
> RDT_RESOURCE_L3_MON?

Second good catch.

>
> > + int cpu = smp_processor_id();
> > + int rmid_offset = 0;
> > u64 msr_val;
> >
> > + /*
> > + * When SNC mode is on, need to compute the offset to read the
> > + * physical RMID counter for the node to which this CPU belongs.
> > + */
> > + if (snc_nodes_per_l3_cache > 1)
> > + rmid_offset = (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->num_rmid;
>
> Not sure if you have tested or not. r->num_rmid is initialized for the
> resource RDT_RESOURCE_L3_MON. For other resource it is always 0.

I haven't had time on the SNC machine to try this out yet. Thanks
for catching this one, I'd have been scratching my head for a
while trying to trace the symptoms back to this mistake.
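
To illustrate why the wrong resource index fails silently, here is a
minimal userspace sketch of the offset arithmetic. It is not the kernel
code: the struct, the enum names, the helper rmid_offset() and the
num_rmid value of 256 are hypothetical stand-ins, and the node number is
passed in directly instead of coming from cpu_to_node().

```c
/* Hypothetical stand-ins for the kernel's resource levels. */
enum res_level { RES_L3_MON, RES_L3, RES_COUNT };

/* Simplified resource: only the field relevant to the offset. */
struct resource {
	unsigned int num_rmid;
};

/*
 * Assumption for illustration: only the monitoring resource has
 * num_rmid initialized (256 is a made-up value). The others stay 0.
 */
static struct resource resources[RES_COUNT] = {
	[RES_L3_MON] = { .num_rmid = 256 },
	[RES_L3]     = { .num_rmid = 0 },
};

/* Pretend SNC is enabled with two nodes sharing each L3 cache. */
static unsigned int snc_nodes_per_l3_cache = 2;

/* Same arithmetic as the patch, with the node supplied by the caller. */
static unsigned int rmid_offset(const struct resource *r, int node)
{
	if (snc_nodes_per_l3_cache > 1)
		return (node % snc_nodes_per_l3_cache) * r->num_rmid;
	return 0;
}
```

With these illustrative values, rmid_offset(&resources[RES_L3], 1) is 0
because num_rmid is 0 there, so every node would read the same physical
counter. rmid_offset(&resources[RES_L3_MON], 1) is 256, which is the
offset the patch intends.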

Thanks

-Tony