RE: [RFC PATCH v2 2/7] RDMA/rxe: Convert the triple tasklets to workqueues

From: Daisuke Matsuda (Fujitsu)
Date: Fri Nov 18 2022 - 05:04:37 EST


On Fri, Nov 18, 2022 5:34 PM Hillf Danton wrote:
Hi Hillf,

Thank you for taking a look.

As I wrote in the cover letter, a large part of this patch is likely to be
temporary; Bob Pearson's workqueue implementation will probably be adopted
instead unless problems are found with it.
[PATCH for-next v3 00/13] Implement work queues for rdma_rxe
Cf. https://lore.kernel.org/linux-rdma/20221029031009.64467-1-rpearsonhpe@xxxxxxxxx/

I appreciate your insightful comments. If his workqueue implementation is
rejected in the end, I will address your comments in this patch for submission.
Otherwise, I am going to rebase my work onto his patches in the next version.

Thanks,
Daisuke

> On 11 Nov 2022 18:22:23 +0900 Daisuke Matsuda <matsuda-daisuke@xxxxxxxxxxx> wrote:
> > +/*
> > + * this locking is due to a potential race where
> > + * a second caller finds the work already running
> > + * but looks just after the last call to func
> > + */
> > +void rxe_do_work(struct work_struct *w)
> > +{
> > +	int cont;
> > +	int ret;
> > +
> > +	struct rxe_work *work = container_of(w, typeof(*work), work);
> > +	unsigned int iterations = RXE_MAX_ITERATIONS;
> > +
> > +	spin_lock_bh(&work->state_lock);
> > +	switch (work->state) {
> > +	case WQ_STATE_START:
> > +		work->state = WQ_STATE_BUSY;
> > +		spin_unlock_bh(&work->state_lock);
> > +		break;
> > +
> > +	case WQ_STATE_BUSY:
> > +		work->state = WQ_STATE_ARMED;
> > +		fallthrough;
> > +	case WQ_STATE_ARMED:
> > +		spin_unlock_bh(&work->state_lock);
> > +		return;
> > +
> > +	default:
> > +		spin_unlock_bh(&work->state_lock);
> > +		pr_warn("%s failed with bad state %d\n", __func__, work->state);
> > +		return;
> > +	}
> > +
> > +	do {
> > +		cont = 0;
> > +		ret = work->func(work->arg);
> > +
> > +		spin_lock_bh(&work->state_lock);
> > +		switch (work->state) {
> > +		case WQ_STATE_BUSY:
> > +			if (ret) {
> > +				work->state = WQ_STATE_START;
> > +			} else if (iterations--) {
> > +				cont = 1;
> > +			} else {
> > +				/* reschedule the work and exit
> > +				 * the loop to give up the cpu
> > +				 */
>
> Unlike a tasklet, workqueue work cannot hog the CPU when PREEMPT is
> enabled; otherwise, cond_resched() is enough.
>
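
Good point. If this patch needs a respin, the loop could probably yield with
cond_resched() instead of re-queueing itself, e.g. (untested sketch; note
this assumes rxe_do_work() only runs where sleeping is allowed, which would
need auditing for the direct-call path in rxe_run_work()):

	do {
		cont = 0;
		cond_resched();	/* outside the state_lock; yield if needed */
		ret = work->func(work->arg);
		...

and the iterations-exhausted branch below would then just set cont = 1
instead of calling queue_work().
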
> > +				queue_work(work->worker, &work->work);
>
> Nit: s/worker/workq/, for example, as worker, work, and workqueue are
> three different things in the workqueue domain.
>
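
Noted; the three are indeed different things. The rename would be mechanical,
e.g. (sketch; the other members of struct rxe_work are unchanged):

	struct rxe_work {
		struct workqueue_struct	*workq;	/* renamed from 'worker' */
		struct work_struct	work;
		...
	};

with queue_work(work->workq, &work->work) at the call sites.
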
> > +				work->state = WQ_STATE_START;
> > +			}
> > +			break;
> > +
> > +		/* someone tried to run the work since the last time we called
> > +		 * func, so we will call one more time regardless of the
> > +		 * return value
> > +		 */
> > +		case WQ_STATE_ARMED:
> > +			work->state = WQ_STATE_BUSY;
> > +			cont = 1;
> > +			break;
> > +
> > +		default:
> > +			pr_warn("%s failed with bad state %d\n", __func__,
> > +				work->state);
> > +		}
> > +		spin_unlock_bh(&work->state_lock);
> > +	} while (cont);
> > +
> > +	work->ret = ret;
> > +}
> > +
> [...]
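
(For the record, the tristate above is what keeps a concurrent caller from
being lost while func() is running; roughly:

	CPU0: rxe_do_work()             CPU1: rxe_run_work()
	  state: START -> BUSY
	  func() runs                     state: BUSY -> ARMED, returns
	  sees ARMED, sets BUSY, cont=1
	  func() runs one more time

so the second request is never dropped and func() is never run twice in
parallel.)
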
> > +void rxe_run_work(struct rxe_work *work, int sched)
> > +{
> > +	if (work->destroyed)
> > +		return;
> > +
> > +	/* busy-loop while qp reset is in progress */
> > +	while (atomic_read(&work->suspended))
> > +		continue;
>
> Feel free to add a one-line comment specifying the reason for busy-looping
> here instead of taking a nap, given that it may take two seconds to flush
> the workqueue.
>
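
Will do. If I respin, something along these lines (sketch; my understanding
is that rxe_run_work() can be reached from atomic context, hence spinning
rather than sleeping):

	/* Callers may not be allowed to sleep, so busy-wait for the
	 * qp reset to complete instead of napping.
	 */
	while (atomic_read(&work->suspended))
		cpu_relax();
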
> > +
> > +	if (sched)
> > +		queue_work(work->worker, &work->work);
> > +	else
> > +		rxe_do_work(&work->work);
> > +}
> > +
> > +void rxe_disable_work(struct rxe_work *work)
> > +{
> > +	atomic_inc(&work->suspended);
> > +	flush_workqueue(work->worker);
> > +}
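
For completeness, the disable above is intended to be paired with an enable
that drops the counter again, roughly like this (illustrative sketch; the
rxe_enable_work() helper, which would just do atomic_dec(&work->suspended),
and the qp->req.work field name are placeholders here):

	rxe_disable_work(&qp->req.work);	/* quiesce and flush the work */
	/* ... reset the qp state ... */
	rxe_enable_work(&qp->req.work);		/* let rxe_run_work() proceed */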