Re: [PATCH for-next 4/6] RDMA/hns: Support flexible pagesize

From: Jason Gunthorpe
Date: Fri Jan 05 2024 - 09:19:03 EST


On Fri, Jan 05, 2024 at 01:55:11PM +0800, Junxian Huang wrote:
>
>
> On 2024/1/5 4:29, Jason Gunthorpe wrote:
> > On Tue, Dec 26, 2023 at 05:16:33PM +0800, Junxian Huang wrote:
> >>
> >>
> >> On 2023/12/26 16:52, Leon Romanovsky wrote:
> >>> On Mon, Dec 25, 2023 at 03:53:28PM +0800, Junxian Huang wrote:
> >>>> From: Chengchang Tang <tangchengchang@xxxxxxxxxx>
> >>>>
> >>>> In the current implementation, a fixed page size is used to
> >>>> configure the PBL, which is not flexible enough and is not
> >>>> conducive to the performance of the HW.
> >>>>
> >>>> Signed-off-by: Chengchang Tang <tangchengchang@xxxxxxxxxx>
> >>>> Signed-off-by: Junxian Huang <huangjunxian6@xxxxxxxxxxxxx>
> >>>> ---
> >>>> drivers/infiniband/hw/hns/hns_roce_alloc.c | 6 -
> >>>> drivers/infiniband/hw/hns/hns_roce_device.h | 9 ++
> >>>> drivers/infiniband/hw/hns/hns_roce_mr.c | 168 +++++++++++++++-----
> >>>> 3 files changed, 139 insertions(+), 44 deletions(-)
> >>>
> >>> I wonder if the ib_umem_find_best_pgsz() API should be used instead.
> >>> What is missing there?
> >>>
> >>> Thanks
> >>
> >> Actually this API is used for umem.
> >> For kmem, we add hns_roce_find_buf_best_pgsz() to do a similar job.
> >
> > But why do you need to do something like this for kmem? It looked to
> > me like kmem knows its allocation size when it was allocated, how come
> > you need to iterate over all of it again?
> >
> > Jason
> >
>
> kmem was split into multiple small pages for allocation to prevent allocation
> failure due to memory fragmentation.
>
> And now we add this function to check whether these small pages have contiguous
> addresses. If so, they can be combined into one huge page for use, which is more
> likely when iommu/smmu is enabled.

That seems unnecessary. The chance of getting contiguous pages from
a lot of random allocations is really slim.

If you care about this optimization then you should have the allocator
explicitly request contiguous pages with high order allocations in a
manner that quickly fails and falls back to PAGE_SIZE.

Then you just use the size that the allocator was able to get, not try
to figure it out after the fact.
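
As a rough illustration of that approach (a userspace sketch, not hns
code), the allocator below makes one high-order attempt per chunk. If
any attempt fails, it frees everything and redoes the buffer at the base
page size. All sketch_* names, the 4 KiB base page and the order-4 cap
are assumptions made for the example. In the kernel the quick-failing
attempt would usually be a page allocation with flags such as
__GFP_NORETRY | __GFP_NOWARN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumed 4 KiB base page */
#define SKETCH_HIGH_ORDER 4u	/* hypothetical preferred order: 64 KiB chunks */

/* Hypothetical knob: orders above this value "fail", to exercise the fallback. */
unsigned int sketch_max_avail_order = SKETCH_HIGH_ORDER;

struct sketch_buf {
	void **chunks;
	size_t nchunks;
	unsigned int page_shift;	/* chunk size the allocator actually got */
};

/* Stand-in for a high-order allocation that gives up quickly. */
static void *sketch_try_alloc(unsigned int order)
{
	size_t bytes = (size_t)1 << (SKETCH_PAGE_SHIFT + order);

	if (order > sketch_max_avail_order)
		return NULL;
	return aligned_alloc(bytes, bytes);
}

/* Allocate every chunk at one order; undo everything on any failure. */
static int sketch_alloc_at_order(struct sketch_buf *buf, size_t size,
				 unsigned int order)
{
	size_t bytes = (size_t)1 << (SKETCH_PAGE_SHIFT + order);
	size_t n = (size + bytes - 1) / bytes;
	void **chunks = calloc(n, sizeof(*chunks));
	size_t i;

	if (!chunks)
		return -1;
	for (i = 0; i < n; i++) {
		chunks[i] = sketch_try_alloc(order);
		if (!chunks[i])
			break;
	}
	if (i < n) {
		while (i--)
			free(chunks[i]);
		free(chunks);
		return -1;
	}
	buf->chunks = chunks;
	buf->nchunks = n;
	buf->page_shift = SKETCH_PAGE_SHIFT + order;
	return 0;
}

/* Try the high order first, then fall back to the base page size. */
int sketch_buf_alloc(struct sketch_buf *buf, size_t size)
{
	assert(buf != NULL);
	if (size == 0)
		return -1;
	if (sketch_alloc_at_order(buf, size, SKETCH_HIGH_ORDER) == 0)
		return 0;
	return sketch_alloc_at_order(buf, size, 0);
}

void sketch_buf_free(struct sketch_buf *buf)
{
	size_t i;

	for (i = 0; i < buf->nchunks; i++)
		free(buf->chunks[i]);
	free(buf->chunks);
	buf->chunks = NULL;
	buf->nchunks = 0;
}
```

The caller then programs the page size straight from buf->page_shift,
with no scan for contiguity afterwards.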

Jason