Re: [PATCH v3] inotify: Increase default inotify.max_user_watches limit to 1048576

From: Waiman Long
Date: Sun Nov 08 2020 - 21:17:45 EST


On 10/30/20 6:57 AM, Jan Kara wrote:
On Thu 29-10-20 15:42:56, Waiman Long wrote:
The default value of inotify.max_user_watches sysctl parameter was set
to 8192 since the introduction of the inotify feature in 2005 by
commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too
small for many modern use cases. As a result, users have to explicitly set
it to a larger value to make it work.

After some searching around the web, these are the
inotify.max_user_watches values used by some projects:
- vscode: 524288
- dropbox support: 100000
- users on stackexchange: 12228
- lsyncd user: 2000000
- code42 support: 1048576
- monodevelop: 16384
- tectonic: 524288
- openshift origin: 65536

Each watch point adds an inotify_inode_mark structure to an inode to
be watched. It also pins the watched inode.

Following the epoll.max_user_watches approach of scaling the default
value with the amount of addressable memory available, make
inotify.max_user_watches use no more than 1% of addressable memory,
limited to the range [8192, 1048576].

For 64-bit archs, inotify_inode_mark plus two VFS inodes have a combined
size of a bit over 1 KB (1284 bytes with my x86-64 config). That means
a system with 128GB or more memory will likely have the maximum value
of 1048576 for inotify.max_user_watches. This default should be big
enough for most use cases.
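
The sizing rule described above can be checked with a small user-space
calculation. This is only a sketch: the 1284-byte cost is the figure quoted
for one x86-64 config, the helper works in bytes rather than pages, and
estimate_max_watches() is a hypothetical name, not a kernel function.

```c
#include <stdint.h>

/* Per-watch cost quoted in the commit message for one x86-64 config. */
#define WATCH_COST_BYTES 1284ULL
#define WATCHES_MIN      8192ULL
#define WATCHES_MAX      1048576ULL

/*
 * Hypothetical helper: budget 1% of memory for watches, divide by the
 * per-watch cost, and keep the result within [8192, 1048576].
 */
static uint64_t estimate_max_watches(uint64_t mem_bytes)
{
    uint64_t watches = (mem_bytes / 100) / WATCH_COST_BYTES;

    if (watches < WATCHES_MIN)
        return WATCHES_MIN;
    if (watches > WATCHES_MAX)
        return WATCHES_MAX;
    return watches;
}
```

Under these assumptions the upper bound is reached at 1048576 * 1284 * 100
bytes, roughly 125 GiB, which is consistent with the "128GB or more"
estimate.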

[v3: increase inotify watch cost as suggested by Amir and Honza]

Signed-off-by: Waiman Long <longman@xxxxxxxxxx>

Overall this looks fine. Some remaining nits below.

diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index 186722ba3894..f8065eda3a02 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -37,6 +37,15 @@
#include <asm/ioctls.h>
+/*
+ * An inotify watch requires allocating an inotify_inode_mark structure as
+ * well as pinning the watched inode. Doubling the size of a VFS inode
+ * should be more than enough to cover the additional filesystem inode
+ * size increase.
+ */
+#define INOTIFY_WATCH_COST (sizeof(struct inotify_inode_mark) + \
+ 2 * sizeof(struct inode))
+
/* configurable via /proc/sys/fs/inotify/ */
static int inotify_max_queued_events __read_mostly;
@@ -801,6 +810,18 @@ SYSCALL_DEFINE2(inotify_rm_watch, int, fd, __s32, wd)
*/
static int __init inotify_user_setup(void)
{
+ unsigned int watches_max;
+ struct sysinfo si;
+
+ si_meminfo(&si);
+ /*
+ * Allow up to 1% of addressible memory to be allocated for inotify
^^^^ addressable

+ * watches (per user) limited to the range [8192, 1048576].
+ */
+ watches_max = (((si.totalram - si.totalhigh) / 100) << PAGE_SHIFT) /
+ INOTIFY_WATCH_COST;
^^^ So for machines with > 1TB of memory
watches_max would overflow. So you probably need to use ulong for that.
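
To illustrate the concern about the narrow type, here is a user-space sketch
(not kernel code). It assumes the quotient is computed in 64 bits and then
stored in a 32-bit variable before the range limit is applied. The function
names and the sample value are hypothetical.

```c
#include <stdint.h>

/* Narrow to 32 bits first, then apply the range: high bits are lost. */
static uint32_t limit_narrow_first(uint64_t raw)
{
    uint32_t v = (uint32_t)raw;    /* keeps only the low 32 bits */

    if (v < 8192U)
        v = 8192U;
    if (v > 1048576U)
        v = 1048576U;
    return v;
}

/* Apply the range in the wide type, then narrow: a huge value maps to
 * the upper bound as intended. */
static uint32_t limit_wide_first(uint64_t raw)
{
    if (raw < 8192ULL)
        raw = 8192ULL;
    if (raw > 1048576ULL)
        raw = 1048576ULL;
    return (uint32_t)raw;
}
```

For a raw value of 2^32 + 5, the first function returns 8192 while the second
returns 1048576. Both results are inside the range, so the truncation would
go unnoticed unless the value is kept in a wide type until after limiting.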


+ watches_max = min(1048576U, max(watches_max, 8192U));
^^^ use clamp() here?

Yes, using clamp() here will be easier to read. I will send out v4 with those changes.

Thanks,
Longman
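
For readers comparing the two forms discussed above, the following user-space
sketch shows that a clamp-style expression gives the same result as the nested
min()/max() in the v3 hunk. CLAMP() here is a simplified stand-in for the
kernel's clamp(val, lo, hi) macro, and both function names are hypothetical.
How v4 actually reads is not shown in this message.

```c
/*
 * Simplified stand-in for the kernel's clamp(val, lo, hi). Unlike the
 * kernel macro, it does no type checking and may evaluate its arguments
 * more than once.
 */
#define CLAMP(val, lo, hi) \
    ((val) < (lo) ? (lo) : ((val) > (hi) ? (hi) : (val)))

/* Clamp form, using unsigned long as suggested in the review. */
static unsigned long clamp_watches(unsigned long watches)
{
    return CLAMP(watches, 8192UL, 1048576UL);
}

/* Same logic written as min(hi, max(val, lo)), as in the v3 hunk. */
static unsigned long minmax_watches(unsigned long watches)
{
    unsigned long at_least_min = watches > 8192UL ? watches : 8192UL;

    return at_least_min < 1048576UL ? at_least_min : 1048576UL;
}
```

The clamp form states the value and both bounds in one call, which is the
readability gain mentioned in the reply.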