Re: [PATCH v5 24/24] x86/resctrl: Separate arch and fs resctrl locks

From: Reinette Chatre
Date: Wed Aug 09 2023 - 18:41:36 EST


Hi James,

On 7/28/2023 9:42 AM, James Morse wrote:
> resctrl has one mutex that is taken by the architecture specific code,
> and the filesystem parts. The two interact via cpuhp, where the
> architecture code updates the domain list. Filesystem handlers that
> walk the domains list should not run concurrently with the cpuhp
> callback modifying the list.
>
> Exposing a lock from the filesystem code means the interface is not
> cleanly defined, and creates the possibility of cross-architecture
> lock ordering headaches. The interaction only exists so that certain
> filesystem paths are serialised against cpu hotplug. The cpu hotplug

cpu hotplug -> CPU hotplug

> code already has a mechanism to do this using cpus_read_lock().
>
> MPAM's monitors have an overflow interrupt, so it needs to be possible
> to walk the domains list in irq context. RCU is ideal for this,
> but some paths need to be able to sleep to allocate memory.
>
> Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part
> of a cpuhp callback, cpus_read_lock() must always be taken first.
> rdtgroup_schemata_write() already does this.
>
> Most of the filesystem code's domain list walkers are currently
> protected by the rdtgroup_mutex taken in rdtgroup_kn_lock_live().
> The exceptions are rdt_bit_usage_show() and the mon_config helpers
> which take the lock directly.
>
> Make the domain list protected by RCU. An architecture-specific
> lock prevents concurrent writers. rdt_bit_usage_show() can
> walk the domain list under rcu_read_lock(). The mon_config helpers
> send multiple IPIs, take the cpus_read_lock() in these cases.
>
> The other filesystem list walkers need to be able to sleep.
> Add cpus_read_lock() to rdtgroup_kn_lock_live() so that the
> cpuhp callbacks can't be invoked when file system operations are
> occurring.
>
> Add lockdep_assert_cpus_held() in the cases where the
> rdtgroup_kn_lock_live() call isn't obvious.
>
> Resctrl's domain online/offline calls now need to take the
> rdtgroup_mutex themselves.
>
> Tested-by: Shaopeng Tan <tan.shaopeng@xxxxxxxxxxx>
> Signed-off-by: James Morse <james.morse@xxxxxxx>

...

> @@ -464,6 +467,9 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo
> 	bool sep = false;
> 	u32 ctrl_val;
>
> +	/* Walking r->domains, ensure it can't race with cpuhp */
> +	lockdep_assert_cpus_held();
> +
> 	seq_printf(s, "%*s:", max_name_width, schema->name);
> 	list_for_each_entry(dom, &r->domains, list) {
> 		if (sep)
> @@ -534,8 +540,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
> {
> 	int cpu;
>
> -	/* When picking a CPU from cpu_mask, ensure it can't race with cpuhp */
> -	lockdep_assert_held(&rdtgroup_mutex);
> +	/* When picking a cpu from cpu_mask, ensure it can't race with cpuhp */

cpu -> CPU

> +	lockdep_assert_cpus_held();
>
> 	/*
> 	 * Setup the parameters to pass to mon_event_count() to read the data.

...

> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index a256a96df487..47dcf2cb76ca 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -35,6 +35,10 @@
> DEFINE_STATIC_KEY_FALSE(rdt_enable_key);
> DEFINE_STATIC_KEY_FALSE(rdt_mon_enable_key);
> DEFINE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
> +
> +/* Mutex to protect rdtgroup access. */
> +DEFINE_MUTEX(rdtgroup_mutex);
> +
> static struct kernfs_root *rdt_root;
> struct rdtgroup rdtgroup_default;
> LIST_HEAD(rdt_all_groups);
> @@ -954,7 +958,8 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
>
> 	mutex_lock(&rdtgroup_mutex);
> 	hw_shareable = r->cache.shareable_bits;
> -	list_for_each_entry(dom, &r->domains, list) {
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(dom, &r->domains, list) {
> 		if (sep)
> 			seq_putc(seq, ';');
> 		sw_shareable = 0;

Does rdt_bit_usage_show() really need RCU? It is another filesystem callback and I
do not see a reason why it should access the domain list in a different way. It
can follow the same pattern as all the other resctrl filesystem ops and use
cpus_read_lock().

> @@ -1010,8 +1015,10 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
> 		}
> 		sep = true;
> 	}
> +	rcu_read_unlock();
> 	seq_putc(seq, '\n');
> 	mutex_unlock(&rdtgroup_mutex);
> +

Unnecessary empty line.


> 	return 0;
> }


Reinette