[PATCH v3 0/2] x86/resctrl: fix task CLOSID update race

From: Peter Newman
Date: Tue Nov 15 2022 - 09:20:21 EST


Hi Reinette, Fenghua,

I've reorganized the patches for clarity, following James's guidance.

The patch series addresses the IPI race we discussed in the container
move RFD thread[1].

The first patch changes group-wide CLOSID/RMID updates to IPI all CPUs.
Now that the synchronization cost of correctly updating a single task is
higher than originally thought, we believe it's cheaper to IPI all CPUs
than to form a more precise CPU mask by synchronizing with every task in
an rdtgroup, especially when the group contains a large number of tasks.
This change could penalize users who frequently delete groups with few
tasks. If anyone is aware of a use case that frequently deletes groups,
we can consider mitigations.

The second patch uses the new task_call_func() interface to serialize
updating closid and rmid with any context switch of the task. AFAICT,
the implementation of this function serializes against context switch
much like a mutex would, but I'm not certain whether it is intended to
provide that guarantee. If this is not how task_call_func() is meant to
be used, I will instead move the code performing the update under sched/
where it can be done holding the task_rq_lock() explicitly, as Reinette
has suggested before[2].
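
To illustrate the property patch 2 relies on, here is a minimal
user-space sketch. It is an analogy, not kernel code: every model_* name
is hypothetical, and a single pthread mutex stands in for whatever
locking task_call_func() performs internally, which is exactly the
assumption I'd like confirmed. The point is only that the update
callback and the switch-in path can never interleave, so a switch-in
sees either both old IDs or both new ones.

```c
/* User-space analogy only; none of this is kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the closid/rmid fields of a task. */
struct model_task {
	pthread_mutex_t lock;	/* models the locking assumed of task_call_func() */
	uint32_t closid;
	uint32_t rmid;
};

/* Hypothetical stand-in for the per-CPU state loaded at context switch. */
struct model_cpu {
	uint32_t cur_closid;
	uint32_t cur_rmid;
};

/* New IDs passed to the update callback. */
struct model_ids {
	uint32_t closid;
	uint32_t rmid;
};

typedef int (*model_call_f)(struct model_task *t, void *arg);

/* Run func on t while holding the lock the switch-in path also takes. */
static int model_task_call_func(struct model_task *t, model_call_f func,
				void *arg)
{
	int ret;

	assert(func != NULL);
	pthread_mutex_lock(&t->lock);
	ret = func(t, arg);
	pthread_mutex_unlock(&t->lock);
	return ret;
}

/* Callback: store both IDs while the caller holds the lock. */
static int model_update_ids(struct model_task *t, void *arg)
{
	const struct model_ids *ids = arg;

	t->closid = ids->closid;
	t->rmid = ids->rmid;
	return 0;
}

/* Switch-in path: copy both IDs to the CPU under the same lock. */
static void model_switch_in(struct model_cpu *cpu, struct model_task *t)
{
	pthread_mutex_lock(&t->lock);
	cpu->cur_closid = t->closid;
	cpu->cur_rmid = t->rmid;
	pthread_mutex_unlock(&t->lock);
}
```

A caller would initialize the task's lock with PTHREAD_MUTEX_INITIALIZER
and invoke model_task_call_func(&t, model_update_ids, &ids); any later
model_switch_in() then observes both new IDs together.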

Updates in v3:
 - Split the handling of multi-task and single-task operations into
   separate patches, now that they're handled differently.
 - Clarify the justification in the commit message, including moving
   some of it out of an inline code comment.

Updates in v2:
 - Following Reinette's suggestion: use task_call_func() for a single
   task, IPI broadcast for group movements.
 - Rebased to v6.1-rc4

v1: https://lore.kernel.org/lkml/20221103141641.3055981-1-peternewman@xxxxxxxxxx/
v2: https://lore.kernel.org/lkml/20221110135346.2209839-1-peternewman@xxxxxxxxxx/

Thanks!
-Peter

[1] https://lore.kernel.org/all/CALPaoCg2-9ARbK+MEgdvdcjJtSy_2H6YeRkLrT97zgy8Aro3Vg@xxxxxxxxxxxxxx/
[2] https://lore.kernel.org/lkml/d3c06fa3-83a4-7ade-6b08-3a7259aa6c4b@xxxxxxxxx/

Peter Newman (2):
  x86/resctrl: IPI all CPUs for group updates
  x86/resctrl: update task closid/rmid with task_call_func()

 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 128 +++++++++++--------------
 1 file changed, 58 insertions(+), 70 deletions(-)

--
2.38.1.493.g58b659f92b-goog