Re: [PATCH v2 6/6] shmem: add large folios support to the write path

From: Daniel Gomez
Date: Tue Sep 19 2023 - 12:28:21 EST


On Tue, Sep 19, 2023 at 04:01:19PM +0100, Matthew Wilcox wrote:
> On Tue, Sep 19, 2023 at 01:55:54PM +0000, Daniel Gomez wrote:
> > Add large folio support for shmem write path matching the same high
> > order preference mechanism used for iomap buffered IO path as used in
> > __filemap_get_folio() with a difference on the max order permitted
> > (being PMD_ORDER-1) to respect the huge mount option when large folio
> > is supported.
>
> I'm strongly opposed to "respecting the huge mount option". We're
> determining the best order to use for the folios. Artificially limiting
> the size because the sysadmin read an article from 2005 that said to
> use this option is STUPID.

Then I still have the conflict of what to do when the order is the
same as huge. I guess huge does not make sense in this new scenario,
unless we add large folio controls as proposed in the linux-MM meeting
notes [1]? But I'm missing a bit of context, so it's not clear to me
what to do next.

[1] https://lore.kernel.org/all/4966f496-9f71-460c-b2ab-8661384ce626@xxxxxxx/T/#u
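
For context, my understanding of the order selection under discussion
can be sketched in user-space C as below. This is only an illustration,
not the patch code: the sketch_* names are hypothetical, and a page
shift of 12 (4 KiB pages) is an assumption. The max_order argument
stands in for whatever cap ends up being used, e.g. PMD_ORDER - 1 in
this version of the patch (8 if PMD_ORDER is 9, also an assumption).

```c
#include <stddef.h>

/* Assumption: 4 KiB pages. Not taken from the patch. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical helper: floor(log2(v)) for v > 0. */
static unsigned int sketch_ilog2(size_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/*
 * Hypothetical helper: choose a folio order for a write of len bytes
 * starting at page index 'index', capped at max_order.
 */
static unsigned int sketch_write_order(unsigned long index, size_t len,
				       unsigned int max_order)
{
	unsigned int order = 0;

	/* Prefer the highest order that the write length suggests. */
	if (len > 0 && sketch_ilog2(len) > SKETCH_PAGE_SHIFT)
		order = sketch_ilog2(len) - SKETCH_PAGE_SHIFT;

	/* Apply the cap (the point being debated in this thread). */
	if (order > max_order)
		order = max_order;

	/* A folio must be naturally aligned: shrink until index fits. */
	while (order > 0 && (index & ((1UL << order) - 1)))
		order--;

	return order;
}
```

Under these assumptions, a 64 KiB write at index 0 gets order 4, while
the same write at index 4 drops to order 2 because of alignment.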

With that in mind, I wanted to get a big picture of what this new
strategy implies in terms of folio order when adding to the page cache,
so I added tracing for it (same as in readahead). With bpftrace I
can see the following (notes added to explain each field) after running
fsx up to 119M:

@c: 363049108 /* total number of folio orders traced */
@order[8]: 2 /* order 8 being used 2 times (add_to_page_cache) */
@order[5]: 3249587 /* order 5 being used 3249587 times (add_to_page_cache) */
@order[4]: 5972205
@order[3]: 8890418
@order[2]: 10380055
@order[0]: 334556841
@order_2: /* linear histogram of folio order */
[0, 1) 334556841 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1, 2)         0 |                                                    |
[2, 3)  10380055 |@                                                   |
[3, 4)   8890418 |@                                                   |
[4, 5)   5972205 |                                                    |
[5, 6)   3249587 |                                                    |
[6, 7)         0 |                                                    |
[7, 8)         0 |                                                    |
[8, 9)         2 |                                                    |

I guess that's not the best workload to see this, but would it also be
interesting to add this tracing to the series?
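
As a quick cross-check of the numbers above, the per-order counters can
be summed and compared against @c. A minimal sketch in plain C, with the
counts copied from the bpftrace output; the struct and function names
are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* One bpftrace @order[N] entry. */
struct order_count {
	unsigned int order;
	uint64_t count;
};

/* Counts copied from the fsx run above. */
static const struct order_count fsx_counts[] = {
	{ 0, 334556841ULL },
	{ 2, 10380055ULL },
	{ 3, 8890418ULL },
	{ 4, 5972205ULL },
	{ 5, 3249587ULL },
	{ 8, 2ULL },
};

#define FSX_NR (sizeof(fsx_counts) / sizeof(fsx_counts[0]))

/* Sum of all counters; should equal @c. */
static uint64_t total_folios(const struct order_count *c, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += c[i].count;
	return sum;
}

/* Number of folios with order > 0, i.e. large folios. */
static uint64_t large_folios(const struct order_count *c, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		if (c[i].order > 0)
			sum += c[i].count;
	return sum;
}
```

The sum matches @c (363049108), and 28492267 of those folios, about
7.8%, are order > 0.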
>
> > else
> > - folio = shmem_alloc_folio(gfp, info, index, *order);
> > + folio = shmem_alloc_folio(gfp, info, index, order);
>
> Why did you introduce it as *order, only to change it back to order
> in this patch? It feels like you just fixed up patch 6 rather than
> percolating the changes all the way back to where they should have
> been done. This makes the reviewer's life hard.
>

Sorry about that. I missed it in my changes.