Re: [PATCH] fuse: enable larger read buffers for readdir.

From: Jaco Kroon
Date: Wed Jul 26 2023 - 14:40:25 EST


Hi,

On 2023/07/26 19:23, Antonio SJ Musumeci wrote:
On 7/26/23 10:45, Bernd Schubert wrote:
On 7/26/23 17:26, Jaco Kroon wrote:
Hi,

On 2023/07/26 15:53, Bernd Schubert wrote:
On 7/26/23 12:59, Jaco Kroon wrote:
Signed-off-by: Jaco Kroon <jaco@xxxxxxxxx>
---
  fs/fuse/Kconfig   | 16 ++++++++++++++++
  fs/fuse/readdir.c | 42 ++++++++++++++++++++++++------------------
  2 files changed, 40 insertions(+), 18 deletions(-)

diff --git a/fs/fuse/Kconfig b/fs/fuse/Kconfig
index 038ed0b9aaa5..0783f9ee5cd3 100644
--- a/fs/fuse/Kconfig
+++ b/fs/fuse/Kconfig
@@ -18,6 +18,22 @@ config FUSE_FS
        If you want to develop a userspace FS, or if you want to use
        a filesystem based on FUSE, answer Y or M.
  +config FUSE_READDIR_ORDER
+    int
+    range 0 5
+    default 5
+    help
+        readdir performance varies greatly depending on the size of
the read.
+        Larger buffers results in larger reads, thus fewer reads and
higher
+        performance in return.
+
+        You may want to reduce this value on seriously constrained
memory
+        systems where 128KiB (assuming 4KiB pages) cache pages is
not ideal.
+
+        This value reprents the order of the number of pages to
allocate (ie,
+        the shift value).  A value of 0 is thus 1 page (4KiB) where
5 is 32
+        pages (128KiB).
+
I like the idea of a larger readdir size, but shouldn't that be a
server/daemon/library decision which size to use, instead of kernel
compile time? So should be part of FUSE_INIT negotiation?
Yes sure, but there still needs to be a default.  And one page at a time
doesn't cut it.
With FUSE_INIT userspace would make that decision, based on what kernel
fuse suggests? process_init_reply() already handles other limits - I
don't see why readdir max has to be compile time option. Maybe a module
option to set the limit?

Thanks,
Bernd
I had similar question / comment. This seems to me to be more
appropriately handed by the server via FUSE_INIT.

And wouldn't "max" more easily be FUSE_MAX_MAX_PAGES? Is there a reason
not to allow upwards of 256 pages sized readdir buffer?

Will look into FUSE_INIT.  The FUSE_INIT as I understand from what I've read has some expansion constraints or the structure is somehow negotiated.  Older clients in other words that's not aware of the option will follow some default.  For backwards compatibility that default should probably be 1 page.  For performance reasons it makes sense that this limit be larger.

glibc uses a 128KiB buffer for getdents64, so I'm not sure >128KiB here makes sense.  Or if these two buffers are even directly related.

Default to fc->max_pages (which defaults to 32 or 128KiB) if the user-space side doesn't understand the max_readdir_pages limit aspect of things?  Assuming these limits should be set separately.  I'm thinking piggy backing on fc->max_pages is just fine to be honest.

For the sake of avoiding division and modulo operations in the cache, I'm thinking round-down max_pages to the closest power of two for the sake of sticking to binary operators rather than divisions and mods?

Current patch introduces a definite memory leak either way.  Tore through about 12GB of RAM in a matter of 20 minutes or so.  Just going to patch it that way first, and then based on responses above will look into an alternative patch that does not depend on a compile-time option.  Guessing __free_page should be a multi-page variant now.

Thanks for all the feedback so far.

Kind regards,
Jaco