Re: [PATCH v2] x86/resctrl: Clear the stale staged config after the configuration is completed

From: Shawn Wang
Date: Fri Oct 21 2022 - 04:23:26 EST


Hi Reinette,

On 10/21/2022 12:35 AM, Reinette Chatre wrote:

...

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 1dafbdc5ac31..2c719da5544f 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -338,6 +338,8 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
                  msr_param.high = max(msr_param.high, idx + 1);
              }
          }
+        /* Clear the stale staged config */
+        memset(d->staged_config, 0, sizeof(d->staged_config));
      }
        if (cpumask_empty(cpu_mask))

Please also ensure that the temporary storage is cleared if there is an
early exit because of failure. Please do not duplicate the memset() code
but instead move it to a common exit location.


There are two different resctrl_arch_update_domains() function call paths:

1. rdtgroup_mkdir() -> rdtgroup_mkdir_ctrl_mon() -> rdtgroup_init_alloc() -> resctrl_arch_update_domains()
2. rdtgroup_schemata_write() -> resctrl_arch_update_domains()

Perhaps there is no common exit location if we want to clear staged_config[] after every call of resctrl_arch_update_domains().

I was referring to a common exit out of resctrl_arch_update_domains().

Look at how resctrl_arch_update_domains() behaves with this change:

resctrl_arch_update_domains()
{
	...

	if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
		return -ENOMEM;

	...
	list_for_each_entry(d, &r->domains, list) {
		...
		memset(d->staged_config, 0, sizeof(d->staged_config));
	}

	...
done:
	free_cpumask_var(cpu_mask);

	return 0;
}


The goal of this fix is to ensure that staged_config[] is cleared on
return from resctrl_arch_update_domains() so that there is no stale
data in staged_config[] when resctrl_arch_update_domains() is called
again.

Considering this, I can see two scenarios in the above solution where
staged_config[] is not cleared on exit from resctrl_arch_update_domains():

It may not be enough to just clear staged_config[] when resctrl_arch_update_domains() exits. I think the fix needs to make sure staged_config[] can be cleared where it is set.

The modification of staged_config[] comes from two paths:

Path 1:
rdtgroup_schemata_write() {
	...
	rdtgroup_parse_resource() // set staged_config[]
	...
	resctrl_arch_update_domains() // clear staged_config[]
	...
}

Path 2:
rdtgroup_init_alloc() {
	...
	rdtgroup_init_mba()/rdtgroup_init_cat() // set staged_config[]
	...
	resctrl_arch_update_domains() // clear staged_config[]
	...
}

If we clear staged_config[] only inside resctrl_arch_update_domains(), any error-handling goto taken between setting staged_config[] and calling resctrl_arch_update_domains() skips the clear, so stale data can still remain in staged_config[].

I think it may be better to do the clearing where rdtgroup_schemata_write() and rdtgroup_init_alloc() exit.
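
To illustrate the idea, here is a minimal userspace sketch, not kernel code. The struct layouts, NUM_CONF_TYPES, and the helpers clear_staged_config(), stage_config(), apply_staged() and schemata_write_model() are all hypothetical stand-ins chosen for illustration; only the staged_config[] name and the memset() pattern come from the patch. The point it shows is that clearing at the caller's single exit label also covers a failure that happens after staging but before the update is applied.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NUM_CONF_TYPES 3	/* hypothetical size of staged_config[] */

/* Simplified, hypothetical model of a staged configuration entry. */
struct staged_config {
	unsigned int new_ctrl;
	bool have_new_ctrl;
};

/* Simplified, hypothetical model of a domain. */
struct domain {
	unsigned int ctrl_val;	/* value currently applied */
	struct staged_config staged_config[NUM_CONF_TYPES];
};

/* Hypothetical helper: clear the staged config of every domain. */
static void clear_staged_config(struct domain *doms, size_t n)
{
	for (size_t i = 0; i < n; i++)
		memset(doms[i].staged_config, 0, sizeof(doms[i].staged_config));
}

/*
 * Stand-in for the parsing/init step that sets staged_config[].
 * With fail == true it returns an error after staging, which models
 * an error path taken before the update function is ever called.
 */
static int stage_config(struct domain *d, unsigned int val, bool fail)
{
	d->staged_config[0].new_ctrl = val;
	d->staged_config[0].have_new_ctrl = true;
	return fail ? -EINVAL : 0;
}

/* Stand-in for the update step: apply whatever has been staged. */
static int apply_staged(struct domain *doms, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (doms[i].staged_config[0].have_new_ctrl)
			doms[i].ctrl_val = doms[i].staged_config[0].new_ctrl;
	return 0;
}

/* Model of a caller that clears at its common exit. */
static int schemata_write_model(struct domain *doms, size_t n,
				unsigned int val, bool fail_parse)
{
	int ret;

	ret = stage_config(&doms[0], val, fail_parse);
	if (ret)
		goto out;	/* the clear below still runs */

	ret = apply_staged(doms, n);
out:
	clear_staged_config(doms, n);
	return ret;
}
```

With the clear placed at the out label, both the success path and the early error path leave staged_config[] zeroed, so the next call starts from a clean state.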

(Sorry, I mistakenly wrote rdtgroup_mkdir_ctrl_mon() where I meant rdtgroup_init_alloc() in my last reply.)

Thank you,

Shawn