Re: [PATCH] psi: fix aggregation idle shut-off

From: Johannes Weiner
Date: Mon Jan 28 2019 - 17:38:49 EST


On Mon, Jan 28, 2019 at 02:03:36PM -0800, Andrew Morton wrote:
> On Wed, 16 Jan 2019 14:35:01 -0500 Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > +/**
> > + * wq_worker_last_func - retrieve worker's last work function
> > + *
> > + * Determine the last function a worker executed. This is called from
> > + * the scheduler to get a worker's last known identity.
> > + *
> > + * CONTEXT:
> > + * spin_lock_irq(rq->lock)
> > + *
> > + * Return:
> > + * The last work function %current executed as a worker, NULL if it
> > + * hasn't executed any work yet.
> > + */
> > +work_func_t wq_worker_last_func(struct task_struct *task)
> > +{
> > + struct worker *worker = kthread_data(task);
> > +
> > + return worker->last_func;
> > +}
>
> The semantics are troublesome. What guarantees that worker->last_func
> won't change under the caller's feet? The caller should hold some lock
> (presumably worker->pool->lock) in order to stabilize the
> wq_worker_last_func() return value?
>
> Also, the comment isn't really true - this is called from PSI, which is
> hardly "the scheduler"?

psi isn't scheduler core, but it only works from a scheduler context.

The psi task change hook already requires being called under the
rq->lock while the task in question cannot change its scheduling
state, to record it and start the clock on the new state.

That same context ensures wq_worker_last_func() is safe. We're using
it from where the worker is inside the scheduler and going to *sleep*
(not just preemption), so couldn't possibly be changing last_func.

So IMO it makes sense to keep this last_func thing a part of the
private API between scheduler context and the workqueue code.