Re: [PATCH v4 1/2] x86/resctrl: Update task closid/rmid with task_call_func()

From: Peter Newman
Date: Wed Dec 07 2022 - 05:59:02 EST


Hi Reinette,

On Tue, Dec 6, 2022 at 7:57 PM Reinette Chatre
<reinette.chatre@xxxxxxxxx> wrote:
> On 11/29/2022 3:10 AM, Peter Newman wrote:
> > When the user moves a running task to a new rdtgroup using the tasks
> > file interface, the resulting change in CLOSID/RMID must be immediately
> > propagated to the PQR_ASSOC MSR on the task's CPU.
> >
> > It is possible for a task to wake up or migrate while it is being moved
> > to a new group. If __rdtgroup_move_task() fails to observe that a task
> > has begun running or misses that it migrated to a new CPU, the task will
> > continue to use the old CLOSID or RMID until it switches in again.
> >
> > __rdtgroup_move_task() assumes that if the task migrates off of its CPU
> > before it can IPI the task, then the task has already observed the
> > updated CLOSID/RMID. Because this is done locklessly and an x86 CPU can
> > delay stores until after loads, the following incorrect scenarios are
> > possible:
> >
> > 1. __rdtgroup_move_task() stores the new closid and rmid in
> > the task structure after it loads task_curr() and task_cpu().
>
> Stating how this scenario encounters the problem would help
> so perhaps something like (please feel free to change):
> "If the task starts running between a reordered task_curr() check and
> the CLOSID/RMID update then it will start running with the old CLOSID/RMID
> until it is switched again because __rdtgroup_move_task() failed to determine
> that it needs to be interrupted to obtain the new CLOSID/RMID."

That is largely what I was trying to state in paragraph 2 above, though
at a higher level. I hoped the paragraph following it would do enough to
connect the high-level description with the low-level problem scenarios.

>
> > 2. resctrl_sched_in() loads t->{closid,rmid} before the calling context
> > switch stores new task_curr() and task_cpu() values.
>
> This scenario is not clear to me. Could you please provide more detail about it?
> I was trying to follow the context_switch() flow and resctrl_sched_in() is
> one of the last things done (context_switch()->switch_to()->resctrl_sched_in()).
> From what I can tell rq->curr, as used by task_curr() is set before
> even context_switch() is called ... and since the next task is picked from
> the CPU's runqueue (and set_task_cpu() sets the task's cpu when moved to
> a runqueue) it seems to me that the value used by task_cpu() would also
> be set early (before context_switch() is called). It is thus not clear to
> me how the above reordering could occur so an example would help a lot.

Perhaps in both scenarios I didn't make it clear that reordering in the
CPU can cause the incorrect behavior rather than the program order. In
this explanation, items 1. and 2. are supposed to be completing the
sentence ending with a ':' at the end of paragraph 3, so I thought that
would keep focus on the CPU.

I had assumed that the ordering requirements were well-understood, since
they're stated in existing code comments a few times, and that making a
case for how the expected ordering could be violated would be enough,
but I'm happy to draw up a side-by-side example.

>
> >
> > Use task_call_func() in __rdtgroup_move_task() to serialize updates to
> > the closid and rmid fields in the task_struct with context switch.
>
> Is there a reason why there is a switch between the all caps CLOSID/RMID
> at the beginning to the no caps here?

It's because I referred to the task_struct fields explicitly here.

-Peter