[PATCH RFC 4/4] UNFINISHED mm, fs: use kmem_cache_charge() in path_openat()

From: Vlastimil Babka
Date: Fri Mar 01 2024 - 12:07:53 EST


This is just an example of using the kmem_cache_charge() API. I think
it's placed in a place that's applicable for Linus's example [1]
although he mentions do_dentry_open() - I have followed from strace()
showing openat(2) to path_openat() doing the alloc_empty_file().

The idea is that filp_cachep stops being SLAB_ACCOUNT. Allocations that
want to be accounted immediately can use GFP_KERNEL_ACCOUNT. I did that
in alloc_empty_file_noaccount() (despite the contradictory name but the
noaccount refers to something else, right?) as IIUC it's about
kernel-internal opens.

alloc_empty_file() is now not doing the accounting, so I added
kmem_account_file() that calls the new kmem_cache_charge() API.

Why is this unfinished:

- there are other callers of alloc_empty_file() which I didn't adjust so
they simply became memcg-unaccounted. I haven't investigated for which
ones it would make also sense to separate the allocation and accounting.
Maybe alloc_empty_file() would need to get a parameter to control
this.

- I don't know how to properly unwind the accounting failure case. It
seems like a new case because when we succeed the open, there's no
further error path at least in path_openat().

Basically it boils down I'm unfamiliar with VFS so this depends if this
approach is deemed useful enough to finish it.

[1] https://lore.kernel.org/all/CAHk-=whYOOdM7jWy5jdrAm8LxcgCMFyk2bt8fYYvZzM4U-zAQA@xxxxxxxxxxxxxx/

Not-signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
---
fs/file_table.c | 9 +++++++--
fs/internal.h | 1 +
fs/namei.c | 4 +++-
3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index b991f90571b4..6401b6f175ae 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -223,6 +223,11 @@ struct file *alloc_empty_file(int flags, const struct cred *cred)
return ERR_PTR(-ENFILE);
}

+int kmem_account_file(struct file *f)
+{
+ return kmem_cache_charge(filp_cachep, GFP_KERNEL_ACCOUNT, f);
+}
+
/*
* Variant of alloc_empty_file() that doesn't check and modify nr_files.
*
@@ -234,7 +239,7 @@ struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred)
struct file *f;
int error;

- f = kmem_cache_zalloc(filp_cachep, GFP_KERNEL);
+ f = kmem_cache_zalloc(filp_cachep, GFP_KERNEL_ACCOUNT);
if (unlikely(!f))
return ERR_PTR(-ENOMEM);

@@ -468,7 +473,7 @@ void __init files_init(void)
{
filp_cachep = kmem_cache_create("filp", sizeof(struct file), 0,
SLAB_TYPESAFE_BY_RCU | SLAB_HWCACHE_ALIGN |
- SLAB_PANIC | SLAB_ACCOUNT, NULL);
+ SLAB_PANIC, NULL);
percpu_counter_init(&nr_files, 0, GFP_KERNEL);
}

diff --git a/fs/internal.h b/fs/internal.h
index b67406435fc0..06ada11b71d0 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -96,6 +96,7 @@ extern void chroot_fs_refs(const struct path *, const struct path *);
struct file *alloc_empty_file(int flags, const struct cred *cred);
struct file *alloc_empty_file_noaccount(int flags, const struct cred *cred);
struct file *alloc_empty_backing_file(int flags, const struct cred *cred);
+int kmem_account_file(struct file *file);

static inline void file_put_write_access(struct file *file)
{
diff --git a/fs/namei.c b/fs/namei.c
index 4e0de939fea1..fcf3f3fcd059 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3799,8 +3799,10 @@ static struct file *path_openat(struct nameidata *nd,
terminate_walk(nd);
}
if (likely(!error)) {
- if (likely(file->f_mode & FMODE_OPENED))
+ if (likely(file->f_mode & FMODE_OPENED)) {
+ kmem_account_file(file);
return file;
+ }
WARN_ON(1);
error = -EINVAL;
}

--
2.44.0