Re: [PATCH v5 24/24] x86/resctrl: Separate arch and fs resctrl locks

From: James Morse
Date: Thu Aug 24 2023 - 12:59:35 EST


Hi Reinette,

On 09/08/2023 23:41, Reinette Chatre wrote:
> On 7/28/2023 9:42 AM, James Morse wrote:
>> resctrl has one mutex that is taken by the architecture specific code,
>> and the filesystem parts. The two interact via cpuhp, where the
>> architecture code updates the domain list. Filesystem handlers that
>> walk the domains list should not run concurrently with the cpuhp
>> callback modifying the list.
>>
>> Exposing a lock from the filesystem code means the interface is not
>> cleanly defined, and creates the possibility of cross-architecture
>> lock ordering headaches. The interaction only exists so that certain
>> filesystem paths are serialised against cpu hotplug. The cpu hotplug
>
> cpu hotplug -> CPU hotplug
>
>> code already has a mechanism to do this using cpus_read_lock().
>>
>> MPAM's monitors have an overflow interrupt, so it needs to be possible
>> to walk the domains list in irq context. RCU is ideal for this,
>> but some paths need to be able to sleep to allocate memory.
>>
>> Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part
>> of a cpuhp callback, cpus_read_lock() must always be taken first.
>> rdtgroup_schemata_write() already does this.
>>
>> Most of the filesystem code's domain list walkers are currently
>> protected by the rdtgroup_mutex taken in rdtgroup_kn_lock_live().
>> The exceptions are rdt_bit_usage_show() and the mon_config helpers
>> which take the lock directly.
>>
>> Make the domain list protected by RCU. An architecture-specific
>> lock prevents concurrent writers. rdt_bit_usage_show() can
>> walk the domain list under rcu_read_lock(). The mon_config helpers
>> send multiple IPIs, so take cpus_read_lock() in these cases.
>>
>> The other filesystem list walkers need to be able to sleep.
>> Add cpus_read_lock() to rdtgroup_kn_lock_live() so that the
>> cpuhp callbacks can't be invoked when file system operations are
>> occurring.
>>
>> Add lockdep_assert_cpus_held() in the cases where the
>> rdtgroup_kn_lock_live() call isn't obvious.
>>
>> Resctrl's domain online/offline calls now need to take the
>> rdtgroup_mutex themselves.

>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index a256a96df487..47dcf2cb76ca 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c

>> @@ -954,7 +958,8 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
>>
>>  	mutex_lock(&rdtgroup_mutex);
>>  	hw_shareable = r->cache.shareable_bits;
>> -	list_for_each_entry(dom, &r->domains, list) {
>> +	rcu_read_lock();
>> +	list_for_each_entry_rcu(dom, &r->domains, list) {
>>  		if (sep)
>>  			seq_putc(seq, ';');
>>  		sw_shareable = 0;
>
> Does rdt_bit_usage_show() really need RCU? It is another filesystem callback and I
> do not see a reason why it should access the domain list in a different way. It
> can follow the same pattern as all the other resctrl filesystem ops and use
> cpus_read_lock().

It doesn't need RCU today; it was just useful to have an example where RCU
was used. I'll make this call cpus_read_lock() instead.


Thanks,

James