Re: [PATCH v5 06/24] x86/resctrl: Track the number of dirty RMID a CLOSID has

From: James Morse
Date: Thu Aug 24 2023 - 12:54:13 EST


Hi Reinette,

On 09/08/2023 23:33, Reinette Chatre wrote:
> On 7/28/2023 9:42 AM, James Morse wrote:
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index de91ca781d9f..44addc0126fc 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -43,6 +43,13 @@ struct rmid_entry {
>> */
>> static LIST_HEAD(rmid_free_lru);
>>
>> +/**
>> + * @closid_num_dirty_rmid The number of dirty RMID each CLOSID has.
>> + * Only allocated when CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is defined.
>> + * Indexed by CLOSID. Protected by rdtgroup_mutex.
>> + */
>> +static int *closid_num_dirty_rmid;
>> +
>
> Will the values ever be negative?

Nope, int is just fewer keystrokes. I'll change it to unsigned int.


>> /**
>> * @rmid_limbo_count count of currently unused but (potentially)
>> * dirty RMIDs.


>> @@ -782,13 +802,28 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
>> static int dom_data_init(struct rdt_resource *r)
>> {
>> u32 idx_limit = resctrl_arch_system_num_rmid_idx();
>> + u32 num_closid = resctrl_arch_get_num_closid(r);
>> struct rmid_entry *entry = NULL;
>> u32 idx;
>> int i;
>>
>> + if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
>> + int *tmp;
>> +
>> + tmp = kcalloc(num_closid, sizeof(int), GFP_KERNEL);
>> + if (!tmp)
>> + return -ENOMEM;
>> +
>> + mutex_lock(&rdtgroup_mutex);
>> + closid_num_dirty_rmid = tmp;
>> + mutex_unlock(&rdtgroup_mutex);
>> + }
>> +
>
> It does no harm but I cannot see why the mutex is needed here.

It's belt-and-braces to ensure that all accesses to that global variable are protected by
that lock. This avoids giving me a memory ordering headache.
rmid_ptrs and the call to __rmid_entry() that dereferences it should probably get the same
treatment.

I'll move the locking to the caller as the least-churny way of covering both.


>> rmid_ptrs = kcalloc(idx_limit, sizeof(struct rmid_entry), GFP_KERNEL);
>> - if (!rmid_ptrs)
>> + if (!rmid_ptrs) {
>> + kfree(closid_num_dirty_rmid);
>> return -ENOMEM;
>> + }
>>
>> for (i = 0; i < idx_limit; i++) {
>> entry = &rmid_ptrs[i];
>
> How will this new memory be freed? Actually I cannot find where
> rmid_ptrs is freed either .... is a "dom_data_free()" needed?

Oh that's not deliberate? :P

rmid_ptrs has been immortal since the beginning. The good news is resctrl_exit() goes in
the exitcall section, which is in the DISCARDS section of the linker script as resctrl
can't be built as a module. It isn't possible to tear resctrl down, so no-one will notice
this leak.

Something on my eternal-todo-list is to make the filesystem parts of resctrl a loadable
module (if Tony doesn't get there first!). That would flush this sort of thing out.
Last time I triggered resctrl_exit() manually, not all of the files got cleaned up; I
haven't investigated it further.


I agree it should probably have a kfree() call somewhere under rdtgroup_exit(). As it's
only the L3 that needs any of this, I'll add resctrl_exit_mon_l3_config() for
rdtgroup_exit() to call.

Another option is to rip out all the __exit text, as it's discarded anyway. But if loadable
modules are the direction of travel, it probably makes more sense to fix the leak.


Thanks,

James