Re: [PATCH v2 0/5] Switch arm64 over to qrwlock

From: Waiman Long
Date: Mon Oct 09 2017 - 17:19:54 EST


On 10/06/2017 09:34 AM, Will Deacon wrote:
> Hi all,
>
> This is version two of the patches I posted yesterday:
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2017-October/534666.html
>
> I'd normally leave it longer before posting again, but Peter had a good
> suggestion to rework the layout of the lock word, so I wanted to post a
> version that follows that approach.
>
> I've updated my branch if you're after the full patch stack:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git qrwlock
>
> As before, all comments (particularly related to testing and performance)
> welcome!
>
> Cheers,
>
> Will
>
> --->8
>
> Will Deacon (5):
>   kernel/locking: Use struct qrwlock instead of struct __qrwlock
>   locking/atomic: Add atomic_cond_read_acquire
>   kernel/locking: Use atomic_cond_read_acquire when spinning in qrwlock
>   arm64: locking: Move rwlock implementation over to qrwlocks
>   kernel/locking: Prevent slowpath writers getting held up by fastpath
>
>  arch/arm64/Kconfig                      |  17 ++++
>  arch/arm64/include/asm/Kbuild           |   1 +
>  arch/arm64/include/asm/spinlock.h       | 164 +-------------------------------
>  arch/arm64/include/asm/spinlock_types.h |   6 +-
>  include/asm-generic/atomic-long.h       |   3 +
>  include/asm-generic/qrwlock.h           |  20 +---
>  include/asm-generic/qrwlock_types.h     |  15 ++-
>  include/linux/atomic.h                  |   4 +
>  kernel/locking/qrwlock.c                |  83 +++-------------
>  9 files changed, 58 insertions(+), 255 deletions(-)
>
I ran some performance tests of your patches on a 1-socket Cavium
CN8880 system with 32 cores. I used my locking stress test, which
produced the following results with 16 locking threads at various mixes
of reader and writer threads on 4.14-rc4 based kernels. The numbers are
the minimum/average/maximum locking operations done per locking thread
in a 10-second period. A minimum of 1 means that at least one thread
could not acquire the lock during the test period.
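
For readers who want to reproduce this kind of summary, here is a minimal
user-space sketch of how per-thread operation counts could be reduced to a
min/avg/max triple. It is not the stress test itself: the names
`lock_stats`, `summarize_counts` and `has_starved_thread` are hypothetical,
and treating a minimum of 1 or less as "starved" is an assumption taken
from the convention described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of per-thread locking-operation counts. */
struct lock_stats {
	unsigned long min;
	unsigned long avg;	/* truncated integer average */
	unsigned long max;
};

/* Reduce n per-thread counts (n > 0) to their min/avg/max. */
struct lock_stats summarize_counts(const unsigned long *counts, size_t n)
{
	struct lock_stats s;
	unsigned long long sum = 0;
	size_t i;

	assert(n > 0);
	s.min = counts[0];
	s.max = counts[0];
	for (i = 0; i < n; i++) {
		if (counts[i] < s.min)
			s.min = counts[i];
		if (counts[i] > s.max)
			s.max = counts[i];
		sum += counts[i];
	}
	s.avg = (unsigned long)(sum / n);
	return s;
}

/*
 * Assumption: a minimum of 1 (or less) marks a thread that never got
 * the lock during the run, following the convention in the text.
 */
int has_starved_thread(const struct lock_stats *s)
{
	return s->min <= 1;
}
```

Each thread would increment its own counter per lock/unlock pair, and the
counters would be summarized once after the threads are stopped.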

                  w/o qrwlock patch               with qrwlock patch
                  -----------------               ------------------
16 readers    793,024/1,169,763/1,684,751    1,060,127/1,198,583/1,331,003

12 readers  1,162,760/1,641,714/2,162,939    1,685,334/2,099,088/2,338,461
 4 writers          1/        1/        1       25,540/  195,975/  392,232

 8 readers  2,135,670/2,391,612/2,737,564    2,985,686/3,359,048/3,870,423
 8 writers          1/   19,867/   88,173      119,078/  559,604/1,112,769

 4 readers  1,194,917/1,250,876/1,299,304    3,611,059/4,653,775/6,268,370
12 writers    176,156/1,088,513/2,594,534        7,664/  795,393/1,841,961

16 writers     35,007/1,094,608/1,954,457    1,618,915/1,633,077/1,645,637

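It can be seen that qrwlock performed much better than the original rwlock
implementation: writers are no longer starved in the reader-heavy mixes, and
per-thread throughput is far more even in the 16-writer case. The one
exception is the 4 readers/12 writers mix, where writer throughput is lower
with the patch.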
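
One rough way to put a number on the fairness difference is the ratio of
the maximum to the minimum per-thread count in a row of the table. This is
an illustrative metric only, not something the stress test reports, and
`spread_ratio` is a hypothetical helper name.

```c
#include <assert.h>

/*
 * Ratio of the busiest thread's operation count to the least busy
 * thread's count. 1.0 means perfectly even; larger means less fair.
 * Illustrative metric, not part of the original test.
 */
double spread_ratio(unsigned long min_ops, unsigned long max_ops)
{
	assert(min_ops > 0);
	assert(max_ops >= min_ops);
	return (double)max_ops / (double)min_ops;
}
```

For the 16-writer row, spread_ratio(35007, 1954457) is about 55.8 without
the patch, while spread_ratio(1618915, 1645637) is about 1.02 with it.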

Tested-by: Waiman Long <longman@xxxxxxxxxx>

Cheers,
Longman