Re: [PATCH rdma-next] RDMA/mlx5: Avoid taking MRs from larger MR cache pools when a pool is empty

From: Jason Gunthorpe
Date: Mon Oct 04 2021 - 19:00:10 EST


On Sun, Sep 26, 2021 at 11:31:43AM +0300, Leon Romanovsky wrote:
> From: Aharon Landau <aharonl@xxxxxxxxxx>
>
> Currently, if a cache entry is empty, the driver will try to take MRs
> from larger cache entries. This behavior consumes a lot of memory.
> In addition, when searching for an mkey in an entry, the entry is locked.
> When using a multithreaded application with the old behavior, the threads
> will block each other more often, which can hurt performance as can be
> seen in the table below.
>
> Therefore, avoid it by creating a new mkey when the requested cache entry
> is empty.
>
> The test was performed on a machine with
> Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz 44 cores.
>
> Here are the time measures for allocating MRs of 2^6 pages. The search in
> the cache started from entry 6.
>
> +------------+---------------------+---------------------+
> |            |    Old behavior     |    New behavior     |
> |            +----------+----------+----------+----------+
> |            | 1 thread | 5 thread | 1 thread | 5 thread |
> +============+==========+==========+==========+==========+
> | 1,000 MRs | 14 ms | 30 ms | 14 ms | 80 ms |
> +------------+----------+----------+----------+----------+
> | 10,000 MRs | 135 ms | 6 sec | 173 ms | 880 ms |
> +------------+----------+----------+----------+----------+
> |100,000 MRs | 11.2 sec | 57 sec | 1.74 sec | 8.8 sec |
> +------------+----------+----------+----------+----------+
>
> Signed-off-by: Aharon Landau <aharonl@xxxxxxxxxx>
> Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxx>
> ---
> drivers/infiniband/hw/mlx5/mr.c | 26 +++++++++-----------------
> 1 file changed, 9 insertions(+), 17 deletions(-)
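
For readers following along, the policy change described in the commit
message can be sketched roughly as below. This is a toy userspace model
only, not the code in mr.c: struct toy_cache, toy_alloc_old(),
toy_alloc_new(), toy_pages() and NUM_ENTRIES are all hypothetical names,
and the sketch assumes entry i holds MRs of 2^i pages and ignores the
per-entry locking entirely.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ENTRIES 8 /* toy value, not the driver's real entry count */

/* Hypothetical stand-in for the MR cache: available[i] counts cached
 * MRs of 2^i pages sitting in entry i. */
struct toy_cache {
    size_t available[NUM_ENTRIES];
};

/*
 * Old policy (as described in the commit message): if the requested
 * entry is empty, take an MR from the next larger non-empty entry.
 * Returns the order actually handed out, or -1 if every candidate
 * entry is empty.
 */
static int toy_alloc_old(struct toy_cache *c, int order)
{
    assert(order >= 0 && order < NUM_ENTRIES);
    for (int i = order; i < NUM_ENTRIES; i++) {
        if (c->available[i] > 0) {
            c->available[i]--;
            return i;
        }
    }
    return -1;
}

/*
 * New policy: look only at the requested entry. On a miss, nothing is
 * taken from larger entries; the caller is assumed to create a fresh
 * mkey of exactly the requested order.
 */
static int toy_alloc_new(struct toy_cache *c, int order)
{
    assert(order >= 0 && order < NUM_ENTRIES);
    if (c->available[order] > 0)
        c->available[order]--;
    return order;
}

/* Number of pages backing an MR of the given order. */
static size_t toy_pages(int order)
{
    return (size_t)1 << order;
}
```

With entry 6 empty and entry 7 holding one MR, toy_alloc_old(&c, 6)
hands out the 128-page MR for a 64-page request, while
toy_alloc_new(&c, 6) leaves entry 7 alone and returns order 6.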

I'm surprised the cost is so high. I assume this has a lot to do with
repeated calls to queue_adjust_cache_locked()? Maybe this should be
further investigated?

Anyhow, applied to for-next, thanks

Jason