Re: [RFC PATCH 2/2] resctrl2: Arch x86 modules for most of the legacy control/monitor functions

From: Peter Newman
Date: Tue Jul 04 2023 - 08:44:39 EST


Hi Tony,

On Tue, Jun 20, 2023 at 5:37 AM Tony Luck <tony.luck@xxxxxxxxx> wrote:
> +struct rmid {
> + struct list_head list;
> + struct list_head child_list;
> + bool is_parent;
> +static void __rdt_rmid_read(void *info)
> +{
> + struct rrmid_info *rr = info;
> + unsigned long flags;
> + struct rmid *cr, *r;
> + struct mydomain *m;
> + u64 chunks;
> +
> + m = get_mydomain(rr->domain);
> +
> + if (rr->event <= EV_LOC) {
> + spin_lock_irqsave(&m->msr_lock, flags);

Will there ultimately be any locking at the filesystem layer? I recall
from earlier feedback on my change adding a spinlock here[1] that the
filesystem-layer locking was supposed to take care of this.
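
For context, the kind of caller I had in mind has the filesystem layer
serializing readers with its own mutex before calling into the arch
module, something like the below (just a sketch; resctrl_mutex and
resctrl_read_event() are hypothetical names, not anything in this
series):

    /* Assumed fs-level lock, analogous to today's rdtgroup_mutex. */
    static DEFINE_MUTEX(resctrl_mutex);

    /*
     * Hypothetical fs-layer caller: if reads are serialized like this,
     * the per-domain msr_lock in __rdt_rmid_read() may be redundant.
     */
    static u64 resctrl_read_event(int domain_id, int rmid, int event)
    {
            u64 chunks;

            mutex_lock(&resctrl_mutex);
            chunks = rdt_rmid_read(domain_id, rmid, event);
            mutex_unlock(&resctrl_mutex);

            return chunks;
    }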

> + wrmsrl(MSR_IA32_QM_EVTSEL, (rr->rmid << 32) | rr->event);
> + rdmsrl(MSR_IA32_QM_CTR, chunks);
> + } else {
> + chunks = 0;
> + }
> +
> + rr->chunks = adjust(m, rr->rmid, rr->event, chunks);
> +
> + r = &rmid_array[rr->rmid];
> + if (r->is_parent && !list_empty(&r->child_list)) {
> + list_for_each_entry(cr, &r->child_list, child_list) {
> + u64 crmid = cr - rmid_array;
> +
> + if (rr->event <= EV_LOC) {
> + wrmsrl(MSR_IA32_QM_EVTSEL, (crmid << 32) | rr->event);
> + rdmsrl(MSR_IA32_QM_CTR, chunks);
> + } else {
> + chunks = 0;
> + }
> +
> + rr->chunks += adjust(m, crmid, rr->event, chunks);
> + }
> + }
> +
> + if (rr->event <= EV_LOC)
> + spin_unlock_irqrestore(&m->msr_lock, flags);
> +}
> +
> +u64 rdt_rmid_read(int domain_id, int rmid, int event)
> +{
> + struct resctrl_domain *d;
> + struct rrmid_info rr;
> + struct mydomain *m;
> +
> + list_for_each_entry(d, &monitor.domains, list)
> + if (d->id == domain_id)
> + goto found;
> + return ~0ull;
> +found:
> + m = get_mydomain(d);
> +
> + rr.domain = d;
> + rr.rmid = rmid;
> + rr.event = event;
> +
> + if (event <= EV_LOC)
> + smp_call_function_any(&d->cpu_mask, __rdt_rmid_read, &rr, 1);
> + else
> + __rdt_rmid_read(&rr);

I like that the driver is responsible for deciding where IPIs need to
be sent, but a consequence seems to be that RDT-level code has to add
in the child monitors' event counts while executing within the correct
domain. Breaking the current resctrl code's one-IPI-per-domain
assumption would probably be harder to deal with than having to figure
out which additional RMIDs to read, but I'd really need to know what
synchronization __rdt_rmid_read() can rely on when it inspects the
monitoring group hierarchy.

Would you continue to promise that the FS structure won't change
during a monitor read? To us, the biggest priority for
parallelization is reading all the domain-group combinations in the
system, because we have a lot of them and want the tightest possible
snapshot of bandwidth usage, broken down by group.
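
The pattern we'd like to be able to build is roughly the below: fire
one asynchronous IPI per domain and only then gather all the results,
rather than waiting on each domain in turn. (Sketch only; it assumes a
call_single_data embedded in rrmid_info, one rr[] entry per domain,
and a completion/wait step that isn't shown.)

    static void rdt_rmid_read_all_domains(int rmid, int event,
                                          struct rrmid_info *rr)
    {
            struct resctrl_domain *d;
            int i = 0;

            list_for_each_entry(d, &monitor.domains, list) {
                    rr[i].domain = d;
                    rr[i].rmid = rmid;
                    rr[i].event = event;
                    rr[i].csd.func = __rdt_rmid_read; /* hypothetical csd member */
                    rr[i].csd.info = &rr[i];
                    smp_call_function_single_async(cpumask_any(&d->cpu_mask),
                                                   &rr[i].csd);
                    i++;
            }
            /* ... wait for every csd to complete before using rr[].chunks */
    }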

Thanks!
-Peter

[1] https://lore.kernel.org/all/242db225-8ddc-968e-a754-6aaefd1b7da9@xxxxxxxxx/