Re: [PATCH v11 0/4] Introducing a queue read/write lock implementation

From: Davidlohr Bueso
Date: Fri Jan 31 2014 - 20:30:42 EST


On Fri, 2014-01-31 at 16:09 -0500, Waiman Long wrote:
> On 01/31/2014 03:14 PM, Peter Zijlstra wrote:
> > On Fri, Jan 31, 2014 at 01:59:02PM -0500, Waiman Long wrote:
> >> On 01/31/2014 04:26 AM, Peter Zijlstra wrote:
> >>> On Thu, Jan 30, 2014 at 04:17:15PM +0100, Peter Zijlstra wrote:
> >>>> The below is still small and actually works.
> >>> OK, so having actually worked through the thing; I realized we can
> >>> actually do a version without MCS lock and instead use a ticket lock for
> >>> the waitqueue.
> >>>
> >>> This is both smaller (back to 8 bytes for the rwlock_t), and should be
> >>> faster under moderate contention for not having to touch extra
> >>> cachelines.
> >>>
> >>> Completely untested and with a rather crude generic ticket lock
> >>> implementation to illustrate the concept:
> >>>
> >> Using a ticket lock instead will have the same scalability problem as the
> >> ticket spinlock as all the waiting threads will spin on the lock cacheline
> >> causing a lot of cache bouncing traffic.
> > A much more important point for me is that a fair rwlock has a _much_
> > better worst case behaviour than the current mess. That's the reason I
> > was interested in the qrwlock thing. Not because it can run contended on
> > a 128 CPU system and be faster at being contended.
> >
> > If you contend a lock with 128 CPUs you need to go fix that code that
> > causes this abysmal behaviour in the first place.
> >

But the kernel should also be prepared for such situations, whenever
possible.

> >
>
> I am not against the use of a ticket spinlock as the queuing mechanism on
> small systems. I do have concerns about the contended performance on
> large NUMA systems, which is my primary job responsibility. Depending on
> the workload, contention can happen anywhere. So it is easier said than
> done to fix whatever lock contention may happen.
>
> How about making the selection of MCS or ticket queuing either user
> configurable or depending on the setting of NR_CPUS, NUMA, etc?

Users have no business making these decisions or being exposed to these
kinds of internals. CONFIG_NUMA sounds reasonable to me.
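
To make the CONFIG_NUMA idea concrete, here is a minimal userspace sketch
of a compile-time choice between the two queuing schemes. It assumes C11
atomics rather than kernel primitives, and every name in it (waitq, qnode,
waitq_lock, waitq_unlock, waitq_kind) is hypothetical; none of it is taken
from the patch series. With CONFIG_NUMA defined, waiters queue MCS-style
and each spins on its own node; otherwise they take a ticket and all spin
on the shared lock word.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef CONFIG_NUMA
/* MCS-style queue: each waiter spins on its own node (own cacheline). */
static const char *const waitq_kind = "mcs";

struct qnode {
	_Atomic(struct qnode *) next;
	atomic_bool locked;
};

struct waitq {
	_Atomic(struct qnode *) tail;
};

static inline void waitq_init(struct waitq *q)
{
	atomic_init(&q->tail, NULL);
}

static inline void waitq_lock(struct waitq *q, struct qnode *me)
{
	struct qnode *prev;

	atomic_store_explicit(&me->next, NULL, memory_order_relaxed);
	atomic_store_explicit(&me->locked, true, memory_order_relaxed);
	/* Append ourselves to the queue; an empty queue means we own it. */
	prev = atomic_exchange_explicit(&q->tail, me, memory_order_acq_rel);
	if (!prev)
		return;
	atomic_store_explicit(&prev->next, me, memory_order_release);
	while (atomic_load_explicit(&me->locked, memory_order_acquire))
		; /* spin on our own node only */
}

static inline void waitq_unlock(struct waitq *q, struct qnode *me)
{
	struct qnode *next = atomic_load_explicit(&me->next, memory_order_acquire);

	if (!next) {
		struct qnode *expected = me;

		/* No visible successor: try to mark the queue empty. */
		if (atomic_compare_exchange_strong_explicit(&q->tail, &expected, NULL,
				memory_order_acq_rel, memory_order_acquire))
			return;
		/* A successor is mid-enqueue; wait for it to link itself. */
		while (!(next = atomic_load_explicit(&me->next, memory_order_acquire)))
			;
	}
	atomic_store_explicit(&next->locked, false, memory_order_release);
}
#else
/* Ticket queue: smaller, but every waiter spins on the same word. */
static const char *const waitq_kind = "ticket";

struct qnode {
	char unused; /* placeholder so both variants share one call signature */
};

struct waitq {
	atomic_uint next;  /* next ticket to hand out */
	atomic_uint owner; /* ticket currently being served */
};

static inline void waitq_init(struct waitq *q)
{
	atomic_init(&q->next, 0);
	atomic_init(&q->owner, 0);
}

static inline void waitq_lock(struct waitq *q, struct qnode *me)
{
	unsigned int ticket;

	(void)me;
	ticket = atomic_fetch_add_explicit(&q->next, 1, memory_order_relaxed);
	while (atomic_load_explicit(&q->owner, memory_order_acquire) != ticket)
		; /* all waiters poll the shared owner field */
}

static inline void waitq_unlock(struct waitq *q, struct qnode *me)
{
	(void)me;
	atomic_fetch_add_explicit(&q->owner, 1, memory_order_release);
}
#endif
```

Callers see the same waitq_lock()/waitq_unlock() pair either way, so the
choice stays a build-time detail and is never exposed to users.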

Thanks,
Davidlohr
