Re: [PATCH for-next v5 6/7] RDMA/rxe: Add support for Send/Recv/Write/Read with ODP

From: Jason Gunthorpe
Date: Mon Jun 12 2023 - 12:22:52 EST


On Thu, May 18, 2023 at 05:21:51PM +0900, Daisuke Matsuda wrote:

> +/* umem mutex must be locked before entering this function. */
> +static int rxe_odp_map_range(struct rxe_mr *mr, u64 iova, int length, u32 flags)
> +{
> + struct ib_umem_odp *umem_odp = to_ib_umem_odp(mr->umem);
> + const int max_tries = 3;
> + int cnt = 0;
> +
> + int err;
> + u64 perm;
> + bool need_fault;
> +
> + if (unlikely(length < 1)) {
> + mutex_unlock(&umem_odp->umem_mutex);
> + return -EINVAL;
> + }
> +
> + perm = ODP_READ_ALLOWED_BIT;
> + if (!(flags & RXE_PAGEFAULT_RDONLY))
> + perm |= ODP_WRITE_ALLOWED_BIT;
> +
> + /*
> + * A successful return from rxe_odp_do_pagefault() does not guarantee
> + * that all pages in the range became present. Recheck the DMA address
> + * array, allowing max 3 tries for pagefault.
> + */
> + while ((need_fault = rxe_is_pagefault_neccesary(umem_odp,
> + iova, length, perm))) {
> + if (cnt >= max_tries)
> + break;

I don't think this makes sense..

You need to make this work more like mlx5 does, you take the spinlock
on the xarray, you search it for your index and whatever is there
tells what to do. Hold the spinlock while doing the copy. This is
enough locking for the fast path.

If there is no index present, or it is not writable and you need
write, then you unlock the spinlock and prefetch the missing entry and
try again, this time also holding the mutex so there isn't a race.

You shouldn't probe into parts of the umem to discover information
already stored in the xarray then do the same lookup into the xarray.

IIRC this also needs to keep track in the xarray on a per page basis
if the page is writable.

Jason