Re: [PATCH v2] inotify: Increase default inotify.max_user_watches limit to 1048576

From: Amir Goldstein
Date: Thu Oct 29 2020 - 13:27:22 EST


On Thu, Oct 29, 2020 at 5:46 PM Waiman Long <longman@xxxxxxxxxx> wrote:
>
> The default value of inotify.max_user_watches sysctl parameter was set
> to 8192 since the introduction of the inotify feature in 2005 by
> commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too
> small for many modern usage. As a result, users have to explicitly set
> it to a larger value to make it work.
>
> After some searching around the web, these are the
> inotify.max_user_watches values used by some projects:
> - vscode: 524288
> - dropbox support: 100000
> - users on stackexchange: 12228
> - lsyncd user: 2000000
> - code42 support: 1048576
> - monodevelop: 16384
> - tectonic: 524288
> - openshift origin: 65536
>
> Each watch point adds an inotify_inode_mark structure to an inode to
> be watched. It also pins the watched inode as well as an inotify fdinfo
> procfs file.
>
> Modeled after the epoll.max_user_watches behavior to adjust the default
> value according to the amount of addressable memory available, make
> inotify.max_user_watches behave in a similar way to make it use no more
> than 1% of addressable memory within the range [8192, 1048576].
>
> For 64-bit archs, inotify_inode_mark plus 2 inode have a size close
> to 2 kbytes. That means a system with 196GB or more memory should have
> the maximum value of 1048576 for inotify.max_user_watches. This default
> should be big enough for most use cases.
>
> With my x86-64 config, the size of xfs_inode, proc_inode and
> inotify_inode_mark is 1680 bytes. The estimated INOTIFY_WATCH_COST is
> 1760 bytes.
>
> [v2: increase inotify watch cost as suggested by Amir and Honza]
>
> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
> ---
> fs/notify/inotify/inotify_user.c | 24 +++++++++++++++++++++++-
> 1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
> index 186722ba3894..37d9f09c226f 100644
> --- a/fs/notify/inotify/inotify_user.c
> +++ b/fs/notify/inotify/inotify_user.c
> @@ -37,6 +37,16 @@
>
> #include <asm/ioctls.h>
>
> +/*
> + * An inotify watch requires allocating an inotify_inode_mark structure as
> + * well as pinning the watched inode and adding inotify fdinfo procfs file.

Maybe you misunderstood me.
There is no procfs file per watch.
There is a procfs file per inotify_init() fd.
The fdinfo of that procfile lists all the watches of that inotify instance.

> + * The increase in size of a filesystem inode versus a VFS inode varies
> + * depending on the filesystem. An extra 512 bytes is added as rough
> + * estimate of the additional filesystem inode cost.
> + */
> +#define INOTIFY_WATCH_COST (sizeof(struct inotify_inode_mark) + \
> + 2 * sizeof(struct inode) + 512)
> +

I would consider going with double the sizeof inode as rough approximation for
filesystem inode size.

It is a bit less arbitrary than 512 and it has some rationale behind it -
Some kernel config options will grow struct inode (debug, smp)
The same config options may also grow the filesystem part of the inode.

And this approximation can be pretty accurate at times.
For example, on Ubuntu 18.04 kernel 5.4.0:
inode_cache 608
nfs_inode_cache 1088
btrfs_inode 1168
xfs_inode 1024
ext4_inode_cache 1096


Thanks,
Amir.