Re: [PATCH for-next v7 0/7] On-Demand Paging on SoftRoCE

From: Zhu Yanjun
Date: Thu Dec 14 2023 - 21:46:26 EST



On 2023/12/14 13:55, Daisuke Matsuda (Fujitsu) wrote:
On Wed, Dec 13, 2023 3:08 AM Zhu Yanjun wrote:
On 2023/12/7 14:37, Daisuke Matsuda (Fujitsu) wrote:
On Tue, Dec 5, 2023 10:51 AM Zhu Yanjun wrote:
On 2023/12/5 8:11, Jason Gunthorpe wrote:
On Thu, Nov 09, 2023 at 02:44:45PM +0900, Daisuke Matsuda wrote:
Daisuke Matsuda (7):
RDMA/rxe: Always defer tasks on responder and completer to workqueue
RDMA/rxe: Make MR functions accessible from other rxe source code
RDMA/rxe: Move resp_states definition to rxe_verbs.h
RDMA/rxe: Add page invalidation support
RDMA/rxe: Allow registering MRs for On-Demand Paging
RDMA/rxe: Add support for Send/Recv/Write/Read with ODP
RDMA/rxe: Add support for the traditional Atomic operations with ODP
What is the current situation with rxe? I don't recall seeing the bugs
that were reported get fixed?
Well, I suppose Jason is referring to the "blktests srp/002 hang".
cf. https://lore.kernel.org/linux-rdma/dsg6rd66tyiei32zaxs6ddv5ebefr5vtxjwz6d2ewqrcwisogl@ge7jzan7dg5u/T/

It is likely to be a timing issue. Bob reported that "siw hangs with the debug kernel",
so the hang does not look specific to rxe.
cf. https://lore.kernel.org/all/53ede78a-f73d-44cd-a555-f8ff36bd9c55@xxxxxxx/T/
I think we need to decide whether to continue to block patches to rxe since nobody has successfully fixed the issue.


There is another issue that causes a kernel panic.
[bug report][bisected] rdma_rxe: blktests srp lead kernel panic with 64k page size
cf. https://lore.kernel.org/all/CAHj4cs9XRqE25jyVw9rj9YugffLn5+f=1znaBEnu1usLOciD+g@xxxxxxxxxxxxxx/T/

https://patchwork.kernel.org/project/linux-rdma/list/?series=798592&state=*
Zhijian has submitted patches to fix this, and he got some comments.
It looks like he is heavily involved in the CXL driver these days.
I guess he is still working on it.

Exactly. A problem was reported at this link:
https://www.spinics.net/lists/linux-rdma/msg120947.html

It seems that a variable 'entry' is set but not used
[-Wunused-but-set-variable].
Yeah, I can revise the patch anytime.

And ODP is an important feature. Should we suggest adding a test case
to rdma-core to verify this ODP feature?
Rxe can share the same tests with mlx5.
I added test cases for Write, Read and Atomic operations with ODP,
and we can add more tests if there are any suggestions.
Cf. https://github.com/linux-rdma/rdma-core/blob/master/tests/test_odp.py
Thanks a lot.
Did you run blktests with your patches applied on top of the
latest kernel?
I have not done that yet, but I agree I should do it.
I will try to make time for the tests before submitting v8.

Thanks. I hope blktests works well with your commits.

Zhu Yanjun


Thanks,
Daisuke Matsuda


Zhu Yanjun

Thanks,
Daisuke Matsuda

Zhu Yanjun

I'm reluctant to dig a deeper hole until it is done.

Thanks,
Jason