Re: [PATCH 3/3] x86/resctrl: Display cache occupancy of busy RMIDs

From: Reinette Chatre
Date: Wed Jan 24 2024 - 17:25:24 EST


(+James)

Hi Haifeng,

On 1/23/2024 1:20 AM, Haifeng Xu wrote:
> If llc_occupancy is enabled, the RMID may not be freed immediately unless
> its llc_occupancy is less than the resctrl_rmid_realloc_threshold.
>
> In our production environment, those unused RMIDs get stuck in the limbo
> list forever because their llc_occupancy is larger than the threshold.
> After turning it up, we can successfully free unused RMIDs and create
> new monitor groups. In order to acquire the llc_occupancy of RMIDs in
> each rdt domain, we use perf tool to track and filter the log manually.
>
> It's not efficient enough. Therefore, we can add a RFTYPE_TOP_INFO file
> 'busy_rmids_info' that tells users the llc_occupancy of busy RMIDs. It
> can also help to guide users how much the resctrl_rmid_realloc_threshold
> should be.

I am addressing both patch 2/3 and patch 3/3 here.

First, please note that resctrl is obtaining support for Arm's Memory
System Resource Partitioning and Monitoring (MPAM) and MPAM's monitoring
is done with a monitoring group that is dependent on the control group,
not independent as on Intel and AMD. Please see [1] for more details.

resctrl is the generic interface that will be used to interact with RDT
on Intel, PQoS on AMD, and also MPAM on Arm. We thus need to ensure that
the interface is appropriate for all. Specifically, Arm has no global
"free RMID list"; on Arm the free RMIDs (PMGs in Arm terminology, but
RMID is the term that made it into resctrl) are per control group.

Second, this addition seems to be purely a debugging aid. I thus don't see
this as something that users may want/need all the time, yet when users do
want/need it, accurate data is preferred. To that end, the limbo
code already walks the busy list once per second. What if there is a
new tracepoint within the limbo code that shares the exact data used during
limbo list management? From what I can tell, this data, combined with the
per-monitor-group "mon_hw_id", should give user space sufficient data to
debug the scenarios mentioned in these patches.
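
To illustrate the idea, here is a rough user-space sketch, not kernel code.
It models a periodic limbo walk that reports the occupancy it just read for
each busy RMID before comparing it against the threshold. All names in it
(struct busy_rmid, limbo_walk(), limbo_trace_record()) are hypothetical and
do not exist in resctrl; in the kernel the reporting hook would be a
tracepoint rather than a callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one limbo entry: an RMID still busy in a domain. */
struct busy_rmid {
	unsigned int rmid;
	int domain_id;
	unsigned long long occupancy;	/* stand-in for the value read from hardware */
	int busy;			/* non-zero while the RMID is in limbo */
};

/* Hypothetical consumer state: what a trace reader could accumulate. */
struct limbo_trace_summary {
	size_t count;
	unsigned long long max_occupancy;
};

/* Stand-in for a tracepoint: called with the exact data the walk uses. */
typedef void (*limbo_trace_fn)(const struct busy_rmid *entry, void *ctx);

void limbo_trace_record(const struct busy_rmid *entry, void *ctx)
{
	struct limbo_trace_summary *summary = ctx;

	summary->count++;
	if (entry->occupancy > summary->max_occupancy)
		summary->max_occupancy = entry->occupancy;
	printf("limbo: domain=%d rmid=%u llc_occupancy=%llu\n",
	       entry->domain_id, entry->rmid, entry->occupancy);
}

/*
 * Walk the busy entries once. Each entry that is still busy is reported
 * through the trace hook, then released if its occupancy is below the
 * threshold. Returns the number of entries released by this walk.
 */
size_t limbo_walk(struct busy_rmid *entries, size_t n,
		  unsigned long long threshold,
		  limbo_trace_fn trace, void *ctx)
{
	size_t released = 0;

	assert(entries != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!entries[i].busy)
			continue;
		if (trace)
			trace(&entries[i], ctx);
		if (entries[i].occupancy < threshold) {
			entries[i].busy = 0;
			released++;
		}
	}
	return released;
}
```

In this sketch the consumer sees the same occupancy values the walk acts on,
so the largest reported value hints at how high the threshold would need to
be for a stuck RMID to be released. Matching the reported rmid against a
group's "mon_hw_id" would then identify which monitor group it belonged to.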

I did add James to this discussion to make him aware of your requirements.
Please do include him in future submissions.

Reinette

[1] https://lore.kernel.org/all/20231215174343.13872-1-james.morse@xxxxxxx/