Re: [RESEND PATCH v3] locking/pvqspinlock: Add lock holder CPU argument to pv_wait()

From: Boqun Feng
Date: Thu Apr 28 2016 - 09:29:14 EST


Hi Waiman,

On Thu, Apr 21, 2016 at 11:02:05AM -0400, Waiman Long wrote:
> Pan Xinhui was asking for a lock holder cpu argument in pv_wait()
> to help the porting of pvqspinlock to PPC. The new argument can
> help the hypervisor expedite the execution of the critical section
> by the lock holder, if its vCPU isn't running, so that it can
> release the lock sooner.
>
> The pv_wait() function is now extended to have a third argument
> that holds the vCPU number of the lock holder, if this is
> known. Architecture-specific hypervisor code can then use that
> information to make a better decision about which vCPU should run
> next.
>
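
(Side note for readers without the full patch in front of them: the
extended hook is called as pv_wait(ptr, val, lockcpu) in the hunks
below, with -1 meaning "holder unknown". On x86 I'd expect it to just
grow a third PVOP argument, roughly like this sketch -- not part of the
quoted hunks:

	static __always_inline void pv_wait(u8 *ptr, u8 val, int lockcpu)
	{
		/* lockcpu: vCPU # of the probable lock holder, -1 if unknown */
		PVOP_VCALL3(pv_lock_ops.wait, ptr, val, lockcpu);
	}
)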
> This patch also adds a new field in the pv_node structure to hold
> the vCPU number of the previous queue entry. For the queue head,
> the prev_cpu entry is likely to be that of the lock holder,
> though lock stealing may render this information inaccurate.
>
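
(So pv_node would grow roughly as sketched below; the __res padding is
the current upstream layout, and the exact placement of the new field
is whatever the patch actually does:

	struct pv_node {
		struct mcs_spinlock	mcs;
		struct mcs_spinlock	__res[3];

		int			cpu;
		u8			state;
		int			prev_cpu;	/* vCPU # of previous queue entry */
	};
)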
> In pv_wait_head_or_lock(), pv_wait() will now be called with that
> vCPU number as it is likely to be the lock holder.
>
> In pv_wait_node(), the newly added pv_lookup_hash() function will
> be called to look up the queue head and, if found, pass in the
> lock holder vCPU number stored there.
>
> Suggested-by: Pan Xinhui <xinhui@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Waiman Long <Waiman.Long@xxxxxxx>
> ---
>
> v2->v3:
> - Rephrase the changelog and some of the comments to make the
> intention of this patch more clear.
> - Make pv_lookup_hash() architecture dependent to eliminate its cost
> if an architecture doesn't need it. Also limit the number of
> cachelines searched in pv_lookup_hash() to between 1 and 4.
> - [RESEND] Fix a typo.
>
> v1->v2:
> - In pv_wait_node(), lookup the hash table for the queue head and pass
> the lock holder cpu number to pv_wait().
>

[snip]

> @@ -282,7 +328,8 @@ static void pv_init_node(struct mcs_spinlock *node)
> * pv_kick_node() is used to set _Q_SLOW_VAL and fill in hash table on its
> * behalf.
> */
> -static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
> +static void pv_wait_node(struct qspinlock *lock, struct mcs_spinlock *node,
> + struct mcs_spinlock *prev)
> {
> struct pv_node *pn = (struct pv_node *)node;
> struct pv_node *pp = (struct pv_node *)prev;
> @@ -290,6 +337,8 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
> int loop;
> bool wait_early;
>
> + pn->prev_cpu = pp->cpu; /* Save vCPU # of previous queue entry */
> +
> /* waitcnt processing will be compiled out if !QUEUED_LOCK_STAT */
> for (;; waitcnt++) {
> for (wait_early = false, loop = SPIN_THRESHOLD; loop; loop--) {
> @@ -314,10 +363,23 @@ static void pv_wait_node(struct mcs_spinlock *node, struct mcs_spinlock *prev)
> smp_store_mb(pn->state, vcpu_halted);
>
> if (!READ_ONCE(node->locked)) {
> + struct pv_node *ph;
> +
> qstat_inc(qstat_pv_wait_node, true);
> qstat_inc(qstat_pv_wait_again, waitcnt);
> qstat_inc(qstat_pv_wait_early, wait_early);
> - pv_wait(&pn->state, vcpu_halted);
> +
> + /*
> + * If the current queue head is in the hash table,
> + * the prev_cpu field of its pv_node may contain the
> + * vCPU # of the lock holder. However, lock stealing
> + * may make that information inaccurate. Anyway, we
> + * look up the hash table to try to get the lock
> + * holder vCPU number.
> + */
> + ph = pv_lookup_hash(lock);

I am thinking: could we save the hash lookup here if wait_early == true?
In that case the previous node is likely to get the lock soon, so it
makes more sense to yield to that node than to the holder. Maybe we can
do something like:

	if (wait_early) {
		pv_wait(&pn->state, vcpu_halted, pn->prev_cpu);
	} else {
		ph = pv_lookup_hash(lock);
		pv_wait(&pn->state, vcpu_halted,
			ph ? ph->prev_cpu : -1);
	}

Thoughts?

Regards,
Boqun

> + pv_wait(&pn->state, vcpu_halted,
> + ph ? ph->prev_cpu : -1);
> }
>
> /*

[snip]
