Re: [PATCH v4 1/2] x86/resctrl: Update task closid/rmid with task_call_func()

From: Peter Newman
Date: Wed Dec 14 2022 - 05:06:16 EST


Hi Reinette,

On Tue, Dec 13, 2022 at 7:34 PM Reinette Chatre
<reinette.chatre@xxxxxxxxx> wrote:
> On 12/12/2022 9:36 AM, Peter Newman wrote:
> > On Sat, Dec 10, 2022 at 12:54 AM Reinette Chatre <reinette.chatre@xxxxxxxxx> wrote:
> >> On 12/8/2022 2:30 PM, Peter Newman wrote:
> >>> Based on this, I'll just sketch out the first scenario below and drop
> >>> (2) from the changelog. This also implies that the group update cases
> >>
> >> ok, thank you for doing that analysis.
> >>
> >>> can use a single smp_mb() to provide all the necessary ordering, because
> >>> there's a full barrier on context switch for it to pair with, so I don't
> >>> need to broadcast IPI anymore. I don't know whether task_call_func() is
> >>
> >> This is not clear to me because rdt_move_group_tasks() seems to have the
> >> same code shown below as vulnerable to re-ordering. The only difference
> >> is that it uses the "//false" checks to set a bit in the cpumask for a
> >> later IPI instead of sending an immediate IPI.
> >
> > An smp_mb() between writing the new task_struct::{closid,rmid} and
> > calling task_curr() would prevent the reordering I described, but I
> > was worried about the cost of executing a full barrier for every
> > matching task.
>
> So for moving a single task the solution may just be to change
> the current barrier() to smp_mb()?

Yes, that's a simpler change, so I'll just do that instead.
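
Concretely, something along these lines in __rdtgroup_move_task() (just a
sketch of the idea; the surrounding error handling and group-type checks
stay as they are):

        WRITE_ONCE(tsk->closid, rdtgrp->closid);
        WRITE_ONCE(tsk->rmid, rdtgrp->mon.rmid);

        /*
         * Order the closid/rmid stores above before the task_curr()
         * check in update_task_closid_rmid(). This pairs with the full
         * barrier between the rq->curr update and resctrl_sched_in()
         * during context switch.
         */
        smp_mb();

        /* IPI the task's CPU only if the task is currently running. */
        update_task_closid_rmid(tsk);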

>
> >
> > I tried something like this:
> >
> > 	for_each_process_thread(p, t) {
> > 		if (!from || is_closid_match(t, from) ||
> > 		    is_rmid_match(t, from)) {
> > 			WRITE_ONCE(t->closid, to->closid);
> > 			WRITE_ONCE(t->rmid, to->mon.rmid);
> > 			/* group moves are serialized by rdt */
> > 			t->resctrl_dirty = true;
> > 		}
> > 	}
> > 	if (IS_ENABLED(CONFIG_SMP) && mask) {
> > 		/* Order t->{closid,rmid} stores before loads in task_curr() */
> > 		smp_mb();
> > 		for_each_process_thread(p, t) {
> > 			if (t->resctrl_dirty) {
> > 				if (task_curr(t))
> > 					cpumask_set_cpu(task_cpu(t), mask);
> > 				t->resctrl_dirty = false;
> > 			}
> > 		}
> > 	}
> >
>
> struct task_struct would not welcome a new member dedicated to resctrl's
> rare use for convenience. Another option may be to use a flag within
> the variables themselves, but that seems like significant complication
> (the flag needs to be dealt with during scheduling) for which the benefit
> is not clear to me. I would prefer that we start with the simplest
> solution first (I see that as IPI to all CPUs). The changelog needs a clear
> description of the problem needing to be solved and the solution chosen,
> noting the tradeoffs with other possible solutions. You can submit that,
> as an RFC if the "IPI to all CPUs" remains a concern, after which we can
> bring that submission to the attention of the experts who would then have
> the needed information to point us in the right direction.

To be thorough, I re-ran the benchmark with the simple addition of an
smp_mb() on every iteration that matches on CLOSID/RMID and found that
it didn't result in a substantial performance impact (1.57ms -> 1.65ms).
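
For reference, the variant I measured was roughly the following, in the
rdt_move_group_tasks() loop (a sketch only; the cpumask handling is
unchanged from the existing code):

	for_each_process_thread(p, t) {
		if (!from || is_closid_match(t, from) ||
		    is_rmid_match(t, from)) {
			WRITE_ONCE(t->closid, to->closid);
			WRITE_ONCE(t->rmid, to->mon.rmid);

			/*
			 * Order the stores above before the loads in
			 * task_curr(), pairing with the full barrier on
			 * context switch.
			 */
			smp_mb();

			if (IS_ENABLED(CONFIG_SMP) && mask && task_curr(t))
				cpumask_set_cpu(task_cpu(t), mask);
		}
	}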

This is nowhere near the 2x slowdown I saw when using task_call_func(),
so maybe task_call_func() is just really expensive. That's all the more
reason to simply upgrade the barrier in the single-task move case.

While I agree with your points on the IPI broadcast, given these
measurements it's a discussion I would prefer to avoid altogether.

-Peter