Re: [PATCH 03/13] scatterlist: Add sg_set_folio()

From: Zhu Yanjun
Date: Fri Aug 18 2023 - 03:06:24 EST



在 2023/7/31 5:42, Matthew Wilcox 写道:
On Sun, Jul 30, 2023 at 09:57:06PM +0800, Zhu Yanjun wrote:
在 2023/7/30 19:18, Matthew Wilcox 写道:
On Sun, Jul 30, 2023 at 07:01:26PM +0800, Zhu Yanjun wrote:
Does the following function have folio version?

"
int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
struct page **pages, unsigned int n_pages, unsigned int offset,
unsigned long size, unsigned int max_segment,
unsigned int left_pages, gfp_t gfp_mask)
"
No -- I haven't needed to convert anything that uses
sg_alloc_append_table_from_pages() yet. It doesn't look like it should
be _too_ hard to add a folio version.
In many places, this function is used. So this function needs the folio
version.
It's not used in very many places. But the first one that I see it used
(drivers/infiniband/core/umem.c), you can't do a straightforward folio
conversion:

pinned = pin_user_pages_fast(cur_base,
min_t(unsigned long, npages,
PAGE_SIZE /
sizeof(struct page *)),
gup_flags, page_list);
...
ret = sg_alloc_append_table_from_pages(
&umem->sgt_append, page_list, pinned, 0,
pinned << PAGE_SHIFT, ib_dma_max_seg_size(device),
npages, GFP_KERNEL);

That can't be converted to folios. The GUP might start in the middle of
the folio, and we have no way to communicate that.

This particular usage really needs the phyr work that Jason is doing so
we can efficiently communicate physically contiguous ranges from GUP
to sg.

Hi, Matthew

Thanks. To the following function, it seems that no folio function replace vmalloc_to_page.

vmalloc_to_page calls virt_to_page to get page. Finally the followings will be called.

"
(mem_map + ((pfn) - ARCH_PFN_OFFSET))

"

And I do not find the related folio functions with vmalloc_to_page.

And no folio function replaces dma_map_page.

dma_map_page will call dma_map_page_attrs.

Or these 2 function should not be replaced with folio functions?

int irdma_map_vm_page_list(struct irdma_hw *hw, void *va, dma_addr_t *pg_dma,

                           u32 pg_cnt)
{
        struct page *vm_page;
        int i;
        u8 *addr;

        addr = (u8 *)(uintptr_t)va;
        for (i = 0; i < pg_cnt; i++) {
                vm_page = vmalloc_to_page(addr);
                if (!vm_page)
                        goto err;

                pg_dma[i] = dma_map_page(hw->device, vm_page, 0, PAGE_SIZE,
                                         DMA_BIDIRECTIONAL);
                if (dma_mapping_error(hw->device, pg_dma[i]))
                        goto err;

                addr += PAGE_SIZE;
        }

        return 0;

err:
        irdma_unmap_vm_page_list(hw, pg_dma, i);
        return -ENOMEM;

}

Thanks,

Zhu Yanjun


Another problem, after folio is used, I want to know the performance after
folio is implemented.

How to make tests to get the performance?
You know what you're working on ... I wouldn't know how best to test
your code.