Re: [smp] a32a4d8a81: netperf.Throughput_tps -2.1% regression

From: Nadav Amit
Date: Wed May 19 2021 - 15:31:24 EST

> On May 19, 2021, at 11:38 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, May 19, 2021 at 06:17:35PM +0000, Nadav Amit wrote:
>>> 1287 ± 42% +75.3% 2256 ± 14% interrupts.CPU111.CAL:Function_call_interrupts
>>> 1326 ± 43% +71.0% 2267 ± 13% interrupts.CPU119.CAL:Function_call_interrupts
>>> 1300 ± 45% +75.9% 2287 ± 37% interrupts.CPU120.CAL:Function_call_interrupts
>>> 1299 ± 45% +60.1% 2081 ± 28% interrupts.CPU128.CAL:Function_call_interrupts
>>> 1305 ± 45% +61.7% 2110 ± 29% interrupts.CPU131.CAL:Function_call_interrupts
>>> 1299 ± 45% +61.8% 2102 ± 28% interrupts.CPU139.CAL:Function_call_interrupts
>>> 66.67 ±133% -97.2% 1.83 ±155% interrupts.CPU14.TLB:TLB_shootdowns
>>> 1299 ± 45% +107.8% 2700 ± 33% interrupts.CPU142.CAL:Function_call_interrupts
>>> 301.83 ±128% -95.6% 13.17 ±140% interrupts.CPU149.RES:Rescheduling_interrupts
>>> 389.17 ± 89% -73.5% 103.17 ± 35% interrupts.CPU164.NMI:Non-maskable_interrupts
>>> 389.17 ± 89% -73.5% 103.17 ± 35% interrupts.CPU164.PMI:Performance_monitoring_interrupts
>>> 1299 ± 45% +60.2% 2081 ± 28% interrupts.CPU35.CAL:Function_call_interrupts
>>> 1244 ± 50% +66.8% 2076 ± 27% interrupts.CPU45.CAL:Function_call_interrupts
>>> 1300 ± 44% +59.5% 2075 ± 28% interrupts.CPU46.CAL:Function_call_interrupts
>>> 1.50 ± 63% +1422.2% 22.83 ±167% interrupts.CPU47.RES:Rescheduling_interrupts
>>> 467.33 ± 85% -64.6% 165.67 ± 74% interrupts.CPU58.NMI:Non-maskable_interrupts
>>> 467.33 ± 85% -64.6% 165.67 ± 74% interrupts.CPU58.PMI:Performance_monitoring_interrupts
>>> 306.67 ± 75% -59.9% 122.83 ± 16% interrupts.CPU68.NMI:Non-maskable_interrupts
>>> 306.67 ± 75% -59.9% 122.83 ± 16% interrupts.CPU68.PMI:Performance_monitoring_interrupts
>>> 1131 ± 27% +61.2% 1822 ± 35% interrupts.CPU85.CAL:Function_call_interrupts
>>> 1180 ± 31% +79.6% 2119 ± 24% interrupts.CPU86.CAL:Function_call_interrupts
>>>
>
> It looks to be sending *waay* more call IPIs, did we mess up the mask or
> lose an optimization somewhere?
>
> I'll go read the commit again…

As you know, I did mess up by calling arch_send_call_function_single_ipi()
instead of send_call_function_single_ipi(), which could explain the extra
IPIs. But that was resolved by your subsequent patch.
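
To illustrate why going straight to the arch hook can show up as extra CAL
interrupts, here is a toy userspace model. It is not kernel code: every
toy_* name, the struct layout and NR_CPUS are made up for the example. The
assumption it encodes is that a generic send path may skip the interrupt when
the target CPU is polling in idle and only flag it, while the arch hook
always raises one.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4	/* toy value, not taken from the report */

/* Hypothetical per-CPU state for the model. */
struct toy_cpu {
	bool polling_idle;	/* CPU is spinning in a polling idle loop */
	bool need_resched;	/* flag the poller checks to leave idle */
	int ipis_received;	/* simulated CAL interrupt count */
};

static struct toy_cpu cpus[NR_CPUS];

/* Stand-in for the arch hook: unconditionally raises an interrupt. */
static void toy_arch_send_ipi(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpus[cpu].ipis_received++;
}

/*
 * Stand-in for the generic helper: if the target is polling in idle,
 * set the flag it is already watching and skip the interrupt.
 */
static void toy_send_ipi(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (cpus[cpu].polling_idle) {
		cpus[cpu].need_resched = true;
		return;
	}
	toy_arch_send_ipi(cpu);
}
```

In this model, a workload that leaves many CPUs polling in idle sees an
interrupt per request through toy_arch_send_ipi() and none through
toy_send_ipi().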

For me, what stands out is the time spent in C1 after the patch.

I will try to reproduce the issue to figure it out, since so far I could
not find an error in the code.
