Re: [PATCH-tip 00/22] locking/rwsem: Rework rwsem-xadd & enable new rwsem features

From: Waiman Long
Date: Thu Feb 14 2019 - 10:22:24 EST


On 02/14/2019 08:23 AM, Davidlohr Bueso wrote:
> On Fri, 08 Feb 2019, Waiman Long wrote:
> >> I am planning to run more performance tests and post the data sometime
> >> next week. Davidlohr is also going to run some of his rwsem performance
> >> tests on this patchset.
>
> So I ran this series on a 40-core, 2-socket IB machine with various
> workloads in mmtests. Below are some of the interesting ones; full numbers
> and curves at https://linux-scalability.org/rwsem-reader-spinner/
>
> All workloads are with increasing number of threads.
>
> -- pagefault timings: pft is an artificial page-fault benchmark (thus
> reader stress). Metrics are faults/cpu and faults/sec.
>                                        v5.0-rc6                 v5.0-rc6
>                                                                    dirty
> Hmean     faults/cpu-1    624224.9815 (   0.00%)   618847.5201 *  -0.86%*
> Hmean     faults/cpu-4    539550.3509 (   0.00%)   547407.5738 *   1.46%*
> Hmean     faults/cpu-7    401470.3461 (   0.00%)   381157.9830 *  -5.06%*
> Hmean     faults/cpu-12   267617.0353 (   0.00%)   271098.5441 *   1.30%*
> Hmean     faults/cpu-21   176194.4641 (   0.00%)   175151.3256 *  -0.59%*
> Hmean     faults/cpu-30   119927.3862 (   0.00%)   120610.1348 *   0.57%*
> Hmean     faults/cpu-40    91203.6820 (   0.00%)    91832.7489 *   0.69%*
> Hmean     faults/sec-1    623292.3467 (   0.00%)   617992.0795 *  -0.85%*
> Hmean     faults/sec-4   2113364.6898 (   0.00%)  2140254.8238 *   1.27%*
> Hmean     faults/sec-7   2557378.4385 (   0.00%)  2450945.7060 *  -4.16%*
> Hmean     faults/sec-12  2696509.8975 (   0.00%)  2747968.9819 *   1.91%*
> Hmean     faults/sec-21  2902892.5639 (   0.00%)  2905923.3881 *   0.10%*
> Hmean     faults/sec-30  2956696.5793 (   0.00%)  2990583.5147 *   1.15%*
> Hmean     faults/sec-40  3422806.4806 (   0.00%)  3352970.3082 *  -2.04%*
> Stddev    faults/cpu-1      2949.5159 (   0.00%)     2802.2712 (   4.99%)
> Stddev    faults/cpu-4     24165.9454 (   0.00%)    15841.1232 (  34.45%)
> Stddev    faults/cpu-7     20914.8351 (   0.00%)    22744.3294 (  -8.75%)
> Stddev    faults/cpu-12    11274.3490 (   0.00%)    14733.3152 ( -30.68%)
> Stddev    faults/cpu-21     2500.1950 (   0.00%)     2200.9518 (  11.97%)
> Stddev    faults/cpu-30     1599.5346 (   0.00%)     1414.0339 (  11.60%)
> Stddev    faults/cpu-40     1473.0181 (   0.00%)     3004.1209 (-103.94%)
> Stddev    faults/sec-1      2655.2581 (   0.00%)     2405.1625 (   9.42%)
> Stddev    faults/sec-4     84042.7234 (   0.00%)    57996.7158 (  30.99%)
> Stddev    faults/sec-7    123656.7901 (   0.00%)   135591.1087 (  -9.65%)
> Stddev    faults/sec-12    97135.6091 (   0.00%)   127054.4926 ( -30.80%)
> Stddev    faults/sec-21    69564.6264 (   0.00%)    65922.6381 (   5.24%)
> Stddev    faults/sec-30    51524.4027 (   0.00%)    56109.4159 (  -8.90%)
> Stddev    faults/sec-40   101927.5280 (   0.00%)   160117.0093 ( -57.09%)
>
> With the exception of the hiccup at 7 threads, things are pretty much in
> the noise region for both metrics.
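
For anyone re-deriving the percentage columns: the sketch below reproduces two of the pft figures from the raw values. The sign convention is inferred from the numbers themselves rather than from mmtests documentation, so treat it as an assumption. Hmean rows appear to count a higher value as better, Stddev rows a lower one. The helper name pct_change is made up for illustration.

```python
def pct_change(baseline: float, patched: float, higher_is_better: bool = True) -> float:
    """Relative change in percent, signed so that a positive result means 'better'.

    Assumption: this sign convention is inferred from the quoted tables.
    """
    delta = (patched - baseline) if higher_is_better else (baseline - patched)
    return 100.0 * delta / baseline


# Hmean faults/cpu-7: throughput-style metric, higher is better.
hmean_cpu7 = pct_change(401470.3461, 381157.9830)

# Stddev faults/cpu-40: spread metric, lower is better.
stddev_cpu40 = pct_change(1473.0181, 3004.1209, higher_is_better=False)

print(f"Hmean  faults/cpu-7 : {hmean_cpu7:8.2f}%")
print(f"Stddev faults/cpu-40: {stddev_cpu40:8.2f}%")
```

Both results match the quoted columns (-5.06% and -103.94%), which is consistent with the inferred convention.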
>
> -- git checkout
>
> First metric is total runtime for runs with incremental threads.
>
>            v5.0-rc6    v5.0-rc6
>                           dirty
> User         218.95      219.07
> System       149.29      146.82
> Elapsed     1574.10     1427.08
>
> In this case there's a non-trivial improvement (roughly 9%) in overall
> elapsed time.
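
The size of the elapsed-time gain depends on which ratio is taken. As a quick check using only the two Elapsed values quoted above (the variable names are illustrative):

```python
baseline_elapsed = 1574.10  # v5.0-rc6
patched_elapsed = 1427.08   # v5.0-rc6 dirty

# Fraction of the baseline runtime that was saved.
time_saved_pct = 100.0 * (baseline_elapsed - patched_elapsed) / baseline_elapsed

# The same change expressed as a speedup relative to the patched runtime.
speedup_pct = 100.0 * (baseline_elapsed / patched_elapsed - 1.0)

print(f"runtime reduction: {time_saved_pct:.2f}%")
print(f"speedup:           {speedup_pct:.2f}%")
```

That works out to about a 9.3% shorter runtime, or about a 10.3% speedup.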
>
> -- reaim (which is always susceptible to rwsem changes for both
> mmap_sem and i_mmap)
>                                      v5.0-rc6               v5.0-rc6
>                                                                dirty
> Hmean     compute-1         6674.01 (   0.00%)     6544.28 *  -1.94%*
> Hmean     compute-21       85294.91 (   0.00%)    85524.20 *   0.27%*
> Hmean     compute-41      149674.70 (   0.00%)   149494.58 *  -0.12%*
> Hmean     compute-61      177721.15 (   0.00%)   170507.38 *  -4.06%*
> Hmean     compute-81      181531.07 (   0.00%)   180463.24 *  -0.59%*
> Hmean     compute-101     189024.09 (   0.00%)   187288.86 *  -0.92%*
> Hmean     compute-121     200673.24 (   0.00%)   195327.65 *  -2.66%*
> Hmean     compute-141     213082.29 (   0.00%)   211290.80 *  -0.84%*
> Hmean     compute-161     207764.06 (   0.00%)   204626.68 *  -1.51%*
>
> The 'compute' workload overall takes a small hit.
>
> Hmean     new_dbase-1         60.48 (   0.00%)       60.63 *   0.25%*
> Hmean     new_dbase-21      6590.49 (   0.00%)     6671.81 *   1.23%*
> Hmean     new_dbase-41     14202.91 (   0.00%)    14470.59 *   1.88%*
> Hmean     new_dbase-61     21207.24 (   0.00%)    21067.40 *  -0.66%*
> Hmean     new_dbase-81     25542.40 (   0.00%)    25542.40 *   0.00%*
> Hmean     new_dbase-101    30165.28 (   0.00%)    30046.21 *  -0.39%*
> Hmean     new_dbase-121    33638.33 (   0.00%)    33219.90 *  -1.24%*
> Hmean     new_dbase-141    36723.70 (   0.00%)    37504.52 *   2.13%*
> Hmean     new_dbase-161    42242.51 (   0.00%)    42117.34 *  -0.30%*
> Hmean     shared-1            76.54 (   0.00%)       76.09 *  -0.59%*
> Hmean     shared-21         7535.51 (   0.00%)     5518.75 * -26.76%*
> Hmean     shared-41        17207.81 (   0.00%)    14651.94 * -14.85%*
> Hmean     shared-61        20716.98 (   0.00%)    18667.52 *  -9.89%*
> Hmean     shared-81        27603.83 (   0.00%)    23466.45 * -14.99%*
> Hmean     shared-101       26008.59 (   0.00%)    29536.96 *  13.57%*
> Hmean     shared-121       28354.76 (   0.00%)    43139.39 *  52.14%*
> Hmean     shared-141       38509.25 (   0.00%)    41619.35 *   8.08%*
> Hmean     shared-161       40496.07 (   0.00%)    44303.46 *   9.40%*
>
> Overall there is a small hit (in the noise level but consistent throughout
> many workloads), except git-checkout which does quite well.
>
> Thanks,
> Davidlohr

Thanks for running the patch through your performance tests.

Cheers,
Longman