Re: [PATCH] nvme: add cond_resched() to nvme_complete_batch()

From: Jiwei Sun
Date: Tue May 16 2023 - 10:18:58 EST


Hi Keith,

On 2023/5/16 04:40, Keith Busch wrote:
On Tue, May 02, 2023 at 08:54:12PM +0800, jiweisun126@xxxxxxx wrote:
From: Jiwei Sun <sunjw10@xxxxxxxxxx>

A soft lockup is triggered when running an fio test on a 448-core
server, with a warning such as the following:
...

The two logs above show that nvme_irq() takes too much time, about
4.8 seconds in this case, and that the main bottleneck is contention
on the spin lock pool->lock.
The most recent 6.4-rc has included a significant changeset to the pool
allocator that may show a considerable difference in pool->lock timing.
It would be interesting to hear if it changes your observation with your
448-core setup. Would you be able to re-run your experiments that
produced the soft lockup with this kernel on that machine?
We have run some tests with the latest kernel, and the issue cannot be reproduced.
We also analyzed the ftrace log of nvme_irq and did NOT find any contention on
the spin lock pool->lock; every dma_pool_free() call completed within 2us.

 287)               |        dma_pool_free() {
 287)   0.150 us    |          _raw_spin_lock_irqsave();
 287)   0.421 us    |          _raw_spin_unlock_irqrestore();
 287)   1.472 us    |        }
+-- 63 lines: 287)               |        mempool_free() {-----------
 435)               |        dma_pool_free() {
 435)   0.170 us    |          _raw_spin_lock_irqsave();
 435)   0.210 us    |          _raw_spin_unlock_irqrestore();
 435)   1.172 us    |        }
+--145 lines: 435)               |        mempool_free() {---------
 317)               |        dma_pool_free() {
 317)   0.160 us    |          _raw_spin_lock_irqsave();
 317)   0.401 us    |          _raw_spin_unlock_irqrestore();
 317)   1.252 us    |        }
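
A check like the one above can be scripted. The following is a minimal Python
sketch, not the method used to produce the numbers in this mail: it assumes
function_graph output in the same column layout as the excerpt, and the helper
name function_durations() is made up for illustration. It pairs each
"dma_pool_free() {" line with the closing "}" on the same CPU and collects the
reported durations.

```python
import re

# "<cpu>)   |   func() {"  : entry of a non-leaf function
OPEN_RE = re.compile(r"^\s*(\d+)\)\s+\|\s+(\w+)\(\)\s*\{")
# "<cpu>)  <duration> us  |   }" : exit of a non-leaf function.
# An optional single-character marker before the duration is tolerated.
CLOSE_RE = re.compile(r"^\s*(\d+)\)\s+(?:[+!#*@$]\s*)?([\d.]+)\s+us\s+\|\s+\}")


def function_durations(trace_text, func="dma_pool_free"):
    """Return the durations (in us) of every completed call to `func`."""
    stacks = {}      # cpu id -> stack of currently open function names
    durations = []
    for line in trace_text.splitlines():
        m = OPEN_RE.match(line)
        if m:
            stacks.setdefault(m.group(1), []).append(m.group(2))
            continue
        m = CLOSE_RE.match(line)
        if m:
            stack = stacks.get(m.group(1))
            # Only record the close if it matches an open call to `func`.
            if stack and stack.pop() == func:
                durations.append(float(m.group(2)))
        # Leaf calls ("foo();") and editor fold lines ("+-- 63 lines: ...")
        # match neither pattern and are skipped.
    return durations


SAMPLE = """\
 287)               |        dma_pool_free() {
 287)   0.150 us    |          _raw_spin_lock_irqsave();
 287)   0.421 us    |          _raw_spin_unlock_irqrestore();
 287)   1.472 us    |        }
+-- 63 lines: 287)               |        mempool_free() {-----------
 435)               |        dma_pool_free() {
 435)   0.170 us    |          _raw_spin_lock_irqsave();
 435)   0.210 us    |          _raw_spin_unlock_irqrestore();
 435)   1.172 us    |        }
"""

if __name__ == "__main__":
    durations = function_durations(SAMPLE)
    slow = [d for d in durations if d >= 2.0]
    print(f"{len(durations)} calls, max {max(durations)} us, {len(slow)} at or above 2 us")
```

Keeping a per-CPU stack matters because the trace interleaves output from
several CPUs, so a closing brace only belongs to the most recent open call on
the same CPU.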

Based on the test results and our analysis of the code, your patch has fixed this performance issue.

By the way, a separate hung-task issue was triggered during the test. We are analyzing it, but that is another story,
and we can discuss it in a separate thread.

Thanks,
Regards,
Jiwei