Re: [PATCH v2 2/2] x86/resctrl: Add tracepoint for llc_occupancy tracking

From: Reinette Chatre
Date: Fri Feb 23 2024 - 14:42:13 EST


(+James)

Hi Haifeng and James,

On 2/21/2024 1:21 AM, Haifeng Xu wrote:
> In our production environment, after removing monitor groups, those unused
> RMIDs get stuck in the limbo list forever because their llc_occupancy are
> always larger than the threshold. But the unused RMIDs can be successfully
> freed by turning up the threshold.
>
> In order to know how much the threshold should be, the following steps can
> be taken to acquire the llc_occupancy of RMIDs in each rdt domain:
>
> 1) perf probe -a '__rmid_read eventid rmid'
> perf probe -a '__rmid_read%return $retval'
> 2) perf record -e probe:__rmid_read -e probe:__rmid_read__return -aR sleep 10
> 3) perf script > __rmid_read.txt
> 4) cat __rmid_read.txt | grep "eventid=0x1 " -A 1 | grep "kworker" > llc_occupnacy.txt
>

The details on how perf can be used were useful during the discussion of this
work but can be omitted from this changelog.

> Instead of using perf tool to track llc_occupancy and filter the log manually,
> it is more convenient for users to use tracepoint to do this work. So add a new
> tracepoint that shows the llc_occupancy of busy RMIDs when scanning the limbo
> list.
>
> Signed-off-by: Haifeng Xu <haifeng.xu@xxxxxxxxxx>
> Suggested-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> ---
> arch/x86/kernel/cpu/resctrl/monitor.c | 2 ++
> arch/x86/kernel/cpu/resctrl/trace.h | 13 +++++++++++++
> 2 files changed, 15 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index f136ac046851..1533b1932b49 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -23,6 +23,7 @@
> #include <asm/resctrl.h>
>
> #include "internal.h"
> +#include "trace.h"
>
> struct rmid_entry {
> u32 rmid;
> @@ -302,6 +303,7 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
> }
> }
> crmid = nrmid + 1;
> + trace_mon_llc_occupancy_limbo(nrmid, d->id, val);

This area recently received some changes (you can find the latest on the
x86/cache branch of the tip repo). Please see [1] for a good
description of the new "index". For this tracing to be useful to MPAM
I thus expect that the tracepoint will need to print the MPAM equivalent
to CLOSID, the PARTID. We can refer to this CLOSID/PARTID value as
"ctrl_hw_id".

This snippet can then change to use the new resctrl_arch_rmid_idx_decode()
to learn the "ctrl_hw_id" and "mon_hw_id" and print them as part of the
tracepoint:
"ctrl_hw_id=%u mon_hw_id=%u domain=%d llc_occupancy=%llu"

This will be filesystem code so it cannot know how an architecture
treats these numbers. Consequently, the output may look strange to x86
users, for whom ctrl_hw_id will always be X86_RESCTRL_EMPTY_CLOSID ... but
it should be clear that it is invalid?
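
To make the above concrete, here is a rough userspace-only sketch (not
kernel code, untested against the tree) of what a decoded event line could
look like. Everything with a "_sketch" suffix is a made-up stand-in: the
decode helper only mimics what I understand the x86 flavour of
resctrl_arch_rmid_idx_decode() to do, and the all-ones value used for the
empty CLOSID is an assumption about X86_RESCTRL_EMPTY_CLOSID.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: stands in for X86_RESCTRL_EMPTY_CLOSID, taken to be all ones. */
#define EMPTY_CLOSID_SKETCH ((uint32_t)~0u)

/*
 * Hypothetical stand-in for the x86 resctrl_arch_rmid_idx_decode():
 * on x86 the index is assumed to be the RMID itself and no CLOSID is
 * encoded in it. An MPAM version would recover the PARTID here instead.
 */
static void rmid_idx_decode_sketch(uint32_t idx, uint32_t *closid,
				   uint32_t *rmid)
{
	*rmid = idx;
	*closid = EMPTY_CLOSID_SKETCH;
}

/*
 * Format one limbo event using the proposed format string. Returns the
 * snprintf() result, that is the number of characters that would have
 * been written excluding the terminating NUL.
 */
static int format_limbo_event_sketch(char *buf, size_t len, uint32_t idx,
				     int domain_id,
				     unsigned long long occupancy)
{
	uint32_t closid, rmid;

	assert(buf != NULL && len > 0);

	rmid_idx_decode_sketch(idx, &closid, &rmid);

	return snprintf(buf, len,
			"ctrl_hw_id=%u mon_hw_id=%u domain=%d llc_occupancy=%llu",
			(unsigned int)closid, (unsigned int)rmid,
			domain_id, occupancy);
}
```

With these assumptions, index 5 in domain 1 with an occupancy of 4096 would
be rendered as
"ctrl_hw_id=4294967295 mon_hw_id=5 domain=1 llc_occupancy=4096", which is
the odd-looking but recognizably invalid ctrl_hw_id mentioned above.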

James, what do you think? Any thoughts on how MPAM will use the limbo handler
to understand what information can be useful to the user here?

Reinette

[1] https://lore.kernel.org/lkml/20240213184438.16675-7-james.morse@xxxxxxx/