Re: [PATCH v2 00/17] x86/resctrl : Support AMD Assignable Bandwidth Monitoring Counters (ABMC)

From: Moger, Babu
Date: Mon Mar 04 2024 - 14:34:43 EST


Hi Reinette,

On 3/1/2024 5:20 PM, Reinette Chatre wrote:
Hi Babu,

On 3/1/2024 12:36 PM, Moger, Babu wrote:
On 2/29/24 15:50, Reinette Chatre wrote:
On 2/29/2024 12:37 PM, Moger, Babu wrote:
...

To make more clear, let me list all the groups here based this.

When none of the counters assigned:

$cat /sys/fs/resctrl/info/L3_MON/mbm_assign_control
resctrl/00=none,none;01=none,none (#default control_mon group)
resctrl/mon_a/00=none,none;01=none,none (#mon group)
resctrl/ctrl_a/00=none,none;01=none,none (#control_mon group)
resctrl/ctrl_a/mon_ab/00=none,none;01=none,none (#mon group)
I am concerned that inconsistent use of "/" will make parsing hard.
Do you mean, you don't want to see multiple "/"?
No. I think that having a consistent number of "/" will be easier to
parse. In the above example, there are instances of 1, 2, as well as
three "/" among the lines. That seems complicated to parse.

I was thinking that it will make interpreting and parsing easier if there
consistently are just always two "/".

(You may find things to be different once you work on the parsing code
though.)

In summary:
* for monitoring of default CTRL_MON group: "//<flags>"
* for MON_GROUP inside default CTRL_MON group: "/<MON group>/<flags>"
* for monitoring of non-default CTRL_MON group: "<CTRL_MON group>//flags"
* for MON_GROUP within CTRL_MON group: "<CTRL_MON group>/<MON group>/<flags>"

What do you think?
Looks like you tried to keep two "/" in all the options. Looks good most part. Will keep the options open for changes when we start implementing.

resctrl/ctrl_a/mon_ab/

Change to

mon_ab/
rather:
ctrl_a/mon_ab/<flags>
Sure.

I find "resctrl" and all the "none" redundant. It is not clear what
this improves.
Why have:
resctrl/00=none,none;01=none,none
when this could do:
//00=_;01=_
ok.

"//" meaning root of resctrl filesystem?
More specifically, monitoring of default control group. It is not intended to
specify a pathname.

When some counters are assigned:

$echo "resctrl/00=total,local" >
/sys/fs/resctrl/info/L3_MON/mbm_assign_control (#assigning counter to
default group)

$echo "resctrl/mon_a/00=total;01=total" >
/sys/fs/resctrl/info/L3_MON/mbm_assign_control (#assigning counter to mon
group)

$echo "resctrl/ctrl_a/00=local;01=local" >
/sys/fs/resctrl/info/L3_MON/mbm_assign_control

$echo "resctrl/ctrl_a/mon_ab/00=total,local;01=total,local" >
/sys/fs/resctrl/nfo/L3_MON/mbm_assign_control

We could learn some more lessons from dynamic debug (see Documentation/admin-guide/dynamic-debug-howto.rst). For example, "=" can be used to make an assignment while "+"
can be used to add a counter and "-" can be used to remove a counter.
"=_" can be used to remove counters from all events in that domain.
Yes. Looked at dynamic debug. I am still learning this interface. Some examples below based on my understanding.

To assign a counters to default group on domain 0.
$echo "//00=+lt;01=+lt" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
It should not be necessary to use both "=" and "+" in the same assignment.
I think of "=" as "assign" and "+" as append ("-" as remove).
Here are our options.

a. assign one event (+)

b. unassign one event (-)

c. assign both (++ may be?)

d. unassign both (_)

I think append ( "=") is not required while assigning.  It is confusing.

Assign or Add both involve same action.

How about this? This might be easy to parse. May be space (" ") after the domain id.

<group>/<domain id> <action><event>; <domain id> <action><event>


An example of this, just focusing on default group.

#hypothetical start state of no counters assigned
$ cat mbm_assign_control
#control_group/monitor_group/flags
//00=_;01=_
Looks good.

#assign counter to total MBM of both domains
$ echo "//00=t;01=t" > mbm_assign_control
There is no difference in assign or add. Just add total MBM event.

$ echo "//00 +t;01 +t" > mbm_assign_control

$ cat mbm_assign_control
#control_group/monitor_group/flags
//00=t;01=t
good.

#add counter to local MBM of both domains without impacting total MBM counters
$echo "//00+l;01+l" > mbm_assign_control
It is not required to know about whether total MBM event  is already assigned or not.  Just assign the event of your interest. If it is already assigned then kernel just ignores it. Kernel has information all the assignment status.

$echo "//00 +l;01 +l" > mbm_assign_control

We will know the full status of the assignment when we list again.

$ cat mbm_assign_control
#control_group/monitor_group/flags
//00=tl;01=tl
Good.

#remove local MBM counters without impacting total MBM counters
$echo "//00-l;01-l" > mbm_assign_control
Remove local MBM counters. We don't need to know about total MBM counter.

$echo "//00 -l;01 -l" > mbm_assign_control

$ cat mbm_assign_control
#control_group/monitor_group/flags
//00=t;01=t
Good.

#assign local MBM counters, removing total MBM counters while doing so
$echo "//00=l;01=l" > mbm_assign_control
Again confusing here.  Just remove total event and add local event in two commands.

$echo "//00 -t;01 -t" > mbm_assign_control
$echo "//00 +l;01 +l" > mbm_assign_control

$ cat mbm_assign_control
#control_group/monitor_group/flags
//00=l;01=l
good

#remove all counters
$echo "//00=_;01=_" > mbm_assign_control
$ cat mbm_assign_control
#control_group/monitor_group/flags
//00=_;01=_


To assign a counters to mon group inside the default group.
$echo "mon_a/00=+t;01=+t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
I think it will simplify parsing if number of "/" is consistent:
$echo "/mon_a/00=t;01=t" > ...

To assign a counters to control mon group inside the default group.
It is not clear to me what you mean with this.

$echo "ctrl_a/00=+l;01=+l"  > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
this looks similar to previous example, so I think it will be hard for parser
to know whether it is dealing with control group or monitor group.
I am not sure I understand your example, but this may perhaps be:

echo "ctrl_a//00=l;01=l > ...

To assign a counters to control mon group inside another control group.
I do not know what you mean with "another control group"

$echo "mon_ab/00=+lt;01=+lt" > /sys/fs/resctrl/nfo/L3_MON/mbm_assign_contro
How will parser know which control group? I was expecting:
$ echo "<CTRL_MON group>/<MON group>/<flags>"
Sure.

To unassign a counters to control mon group inside another control group.
$echo "mon_ab/00=-lt;01=-lt" > /sys/fs/resctrl/nfo/L3_MON/mbm_assign_control

To unassign all the counters on a specific group.
$echo "mon_ab/00=_" > /sys/fs/resctrl/nfo/L3_MON/mbm_assign_control

It does not matter control group or mon group. We just need to name of the group in this interface.
It matters because users can have monitor groups with the same name within
different control groups.

Agree.

I will list all the example again once we agree on specific format.

Thanks

Babu