Re: [PATCH] sbitmap: Use single per-bitmap counting to wake up queued tags

From: Jan Kara
Date: Thu Nov 10 2022 - 10:35:42 EST


Hi!

On Thu 10-11-22 21:18:19, Yu Kuai wrote:
> 在 2022/11/10 19:16, Jan Kara 写道:
> > Hi!
> >
> > On Thu 10-11-22 17:42:49, Yu Kuai wrote:
> > > 在 2022/11/06 7:10, Gabriel Krisman Bertazi 写道:
> > > > +void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
> > > > {
> > > > - struct sbq_wait_state *ws;
> > > > - unsigned int wake_batch;
> > > > - int wait_cnt, cur, sub;
> > > > - bool ret;
> > > > + unsigned int wake_batch = READ_ONCE(sbq->wake_batch);
> > > > + struct sbq_wait_state *ws = NULL;
> > > > + unsigned int wakeups;
> > > > - if (*nr <= 0)
> > > > - return false;
> > > > + if (!atomic_read(&sbq->ws_active))
> > > > + return;
> > > > - ws = sbq_wake_ptr(sbq);
> > > > - if (!ws)
> > > > - return false;
> > > > + atomic_add(nr, &sbq->completion_cnt);
> > > > + wakeups = atomic_read(&sbq->wakeup_cnt);
> > > > - cur = atomic_read(&ws->wait_cnt);
> > > > do {
> > > > - /*
> > > > - * For concurrent callers of this, callers should call this
> > > > - * function again to wakeup a new batch on a different 'ws'.
> > > > - */
> > > > - if (cur == 0)
> > > > - return true;
> > > > - sub = min(*nr, cur);
> > > > - wait_cnt = cur - sub;
> > > > - } while (!atomic_try_cmpxchg(&ws->wait_cnt, &cur, wait_cnt));
> > > > -
> > > > - /*
> > > > - * If we decremented queue without waiters, retry to avoid lost
> > > > - * wakeups.
> > > > - */
> > > > - if (wait_cnt > 0)
> > > > - return !waitqueue_active(&ws->wait);
> > > > + if (atomic_read(&sbq->completion_cnt) - wakeups < wake_batch)
> > > > + return;
> > >
> > > Should it be considered that completion_cnt overflow and becomes
> > > negtive?
> >
> > Yes, the counters can (and will) certainly overflow but since we only care
> > about (completion_cnt - wakeups), we should be fine - this number is always
> > sane (and relatively small) and in the kernel we do compile with signed
> > overflows being well defined.
>
> I'm worried about this: for example, the extreme scenaro that there
> is only one tag, currently there are only one infight rq and one thread
> is waiting for tag. When the infight rq complete, if 'completion_cnt'
> overflow to negative, then 'atomic_read(&sbq->completion_cnt) - wakeups
> < wake_batch' will be passed unexpected, then will the thread never be
> woken up if there are no new io issued ?

Well but my point is that 'wakeups' is staying close to completion_cnt. So
if completion_cnt wraps to INT_MIN, then 'wakeups' is close to INT_MAX and
so completion_cnt - wakeups is going to wrap back and still result in a
small number. That is simply how wrapping arithmetics works...

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR