Re: [PATCH v2] inotify: Increase default inotify.max_user_watches limit to 1048576

From: Amir Goldstein
Date: Thu Oct 29 2020 - 14:47:09 EST


On Thu, Oct 29, 2020 at 8:05 PM Waiman Long <longman@xxxxxxxxxx> wrote:
>
> On 10/29/20 1:27 PM, Amir Goldstein wrote:
> > On Thu, Oct 29, 2020 at 5:46 PM Waiman Long <longman@xxxxxxxxxx> wrote:
> >> The default value of inotify.max_user_watches sysctl parameter was set
> >> to 8192 since the introduction of the inotify feature in 2005 by
> >> commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too
> >> small for many modern use cases. As a result, users have to explicitly set
> >> it to a larger value to make it work.
> >>
> >> After some searching around the web, these are the
> >> inotify.max_user_watches values used by some projects:
> >> - vscode: 524288
> >> - dropbox support: 100000
> >> - users on stackexchange: 12228
> >> - lsyncd user: 2000000
> >> - code42 support: 1048576
> >> - monodevelop: 16384
> >> - tectonic: 524288
> >> - openshift origin: 65536
> >>
> >> Each watch point adds an inotify_inode_mark structure to an inode to
> >> be watched. It also pins the watched inode as well as an inotify fdinfo
> >> procfs file.
> >>
> >> Modeled after the epoll.max_user_watches behavior to adjust the default
> >> value according to the amount of addressable memory available, make
> >> inotify.max_user_watches behave in a similar way to make it use no more
> >> than 1% of addressable memory within the range [8192, 1048576].
> >>
> >> For 64-bit archs, inotify_inode_mark plus 2 inodes have a size close
> >> to 2 kbytes. That means a system with 196GB or more memory should have
> >> the maximum value of 1048576 for inotify.max_user_watches. This default
> >> should be big enough for most use cases.
> >>
> >> With my x86-64 config, the size of xfs_inode, proc_inode and
> >> inotify_inode_mark is 1680 bytes. The estimated INOTIFY_WATCH_COST is
> >> 1760 bytes.
> >>
> >> [v2: increase inotify watch cost as suggested by Amir and Honza]
> >>
> >> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
> >> ---
> >> fs/notify/inotify/inotify_user.c | 24 +++++++++++++++++++++++-
> >> 1 file changed, 23 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
> >> index 186722ba3894..37d9f09c226f 100644
> >> --- a/fs/notify/inotify/inotify_user.c
> >> +++ b/fs/notify/inotify/inotify_user.c
> >> @@ -37,6 +37,16 @@
> >>
> >> #include <asm/ioctls.h>
> >>
> >> +/*
> >> + * An inotify watch requires allocating an inotify_inode_mark structure as
> >> + * well as pinning the watched inode and adding inotify fdinfo procfs file.
> > Maybe you misunderstood me.
> > There is no procfs file per watch.
> > There is a procfs file per inotify_init() fd.
> > The fdinfo of that procfile lists all the watches of that inotify instance.
> Thanks for the clarification. Yes, I probably had misunderstood you
> because of the 2 * sizeof(inode) figure you provided.
> >> + * The increase in size of a filesystem inode versus a VFS inode varies
> >> + * depending on the filesystem. An extra 512 bytes is added as rough
> >> + * estimate of the additional filesystem inode cost.
> >> + */
> >> +#define INOTIFY_WATCH_COST (sizeof(struct inotify_inode_mark) + \
> >> + 2 * sizeof(struct inode) + 512)
> >> +
> > I would consider going with double the sizeof inode as a rough approximation
> > of the filesystem inode size.
> >
> > It is a bit less arbitrary than 512 and it has some rationale behind it -
> > Some kernel config options will grow struct inode (debug, smp)
> > The same config options may also grow the filesystem part of the inode.
> >
> > And this approximation can be pretty accurate at times.
> > For example, on Ubuntu 18.04 kernel 5.4.0:
> > inode_cache 608
> > nfs_inode_cache 1088
> > btrfs_inode 1168
> > xfs_inode 1024
> > ext4_inode_cache 1096
>
> Just to clarify, does your original 2 * sizeof(struct inode) figure
> include the filesystem inode overhead, or is there an additional inode
> somewhere, so that I need to go to 4 * sizeof(struct inode)?

No additional inode.

#define INOTIFY_WATCH_COST      (sizeof(struct inotify_inode_mark) + \
                                 2 * sizeof(struct inode))

Not sure if the inotify_inode_mark part matters, but it doesn't hurt.
Do note that Jan had a different proposal for fs inode size estimation (1K).
I have no objection to this estimation if Jan insists.
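
For illustration only, here is a rough userspace sketch of how the default
would fall out of a given per-watch cost under the "no more than 1% of
memory, within [8192, 1048576]" rule from the commit message. The function
name default_max_watches() and the WATCHES_MIN/WATCHES_MAX macros are made up
for this example, and mem_bytes simply stands in for whatever memory figure
the kernel code ends up using. This is not the code from the patch.

```c
#include <stdint.h>

/* Bounds taken from the commit message: [8192, 1048576]. */
#define WATCHES_MIN 8192UL
#define WATCHES_MAX 1048576UL

/*
 * Hypothetical helper (not a kernel function): estimate the default
 * max_user_watches so that all watches together cost at most 1% of
 * mem_bytes, then clamp the result to [WATCHES_MIN, WATCHES_MAX].
 *
 * watch_cost is the assumed per-watch cost in bytes, e.g. the 1760
 * bytes estimated in v2, or 2 * 608 for "2 * sizeof(struct inode)"
 * using the inode_cache object size quoted above.
 */
static unsigned long default_max_watches(uint64_t mem_bytes,
                                         unsigned long watch_cost)
{
    uint64_t budget, watches;

    if (watch_cost == 0)            /* avoid dividing by zero */
        return WATCHES_MIN;

    budget = mem_bytes / 100;       /* 1% of memory */
    watches = budget / watch_cost;  /* watches that fit in the budget */

    if (watches < WATCHES_MIN)
        watches = WATCHES_MIN;
    if (watches > WATCHES_MAX)
        watches = WATCHES_MAX;

    return (unsigned long)watches;
}
```

With the numbers quoted in this thread, 4 GiB of memory gives 24403 watches
at 1760 bytes per watch and 35320 at 2 * 608 bytes per watch, so the choice
of estimate only matters between the two clamps.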

Thanks,
Amir.