Re: [PATCH v2] proc: report open files as size in stat() for /proc/pid/fd

From: Ivan Babrou
Date: Mon Oct 17 2022 - 15:01:01 EST


On Thu, Sep 22, 2022 at 3:40 PM Ivan Babrou <ivan@xxxxxxxxxxxxxx> wrote:
>
> Many monitoring tools include open file count as a metric. Currently
> the only way to get this number is to enumerate the files in /proc/pid/fd.
>
> The problem with the current approach is that it does many things people
> generally don't care about when they need one number for a metric.
> In our tests for cadvisor, which reports open file counts per cgroup,
> we observed that reading the number of open files is slow. Out of 35.23%
> of CPU time spent in `proc_readfd_common`, we see 29.43% spent in
> `proc_fill_cache`, which is responsible for filling dentry info.
> Some of this extra time is spinlock contention, but it's a contention
> for the lock we don't want to take to begin with.
>
> We considered putting the number of open files in /proc/pid/status.
> Unfortunately, counting the number of fds involves iterating the open_files
> bitmap, which has a linear complexity in proportion with the number
> of open files (bitmap slots really, but it's close). We don't want
> to make /proc/pid/status any slower, so instead we put this info
> in /proc/pid/fd as a size member of the stat syscall result.
> Previously the reported number was zero, so there's very little
> risk of breaking anything, while still providing a somewhat logical
> way to count the open files with a fallback if it's zero.
>
> RFC for this patch included iterating open fds under RCU. Thanks
> to Frank Hofmann for the suggestion to use the bitmap instead.
>
> Previously:
>
> ```
> $ sudo stat /proc/1/fd | head -n2
> File: /proc/1/fd
> Size: 0 Blocks: 0 IO Block: 1024 directory
> ```
>
> With this patch:
>
> ```
> $ sudo stat /proc/1/fd | head -n2
> File: /proc/1/fd
> Size: 65 Blocks: 0 IO Block: 1024 directory
> ```
>
> Correctness check:
>
> ```
> $ sudo ls /proc/1/fd | wc -l
> 65
> ```
>
> I added the docs for /proc/<pid>/fd while I'm at it.
>
> Signed-off-by: Ivan Babrou <ivan@xxxxxxxxxxxxxx>
>
> ---
> v2: Added missing rcu_read_lock() / rcu_read_unlock(),
> task_lock() / task_unlock() and put_task_struct().
> ---
> Documentation/filesystems/proc.rst | 17 ++++++++++++
> fs/proc/fd.c | 44 ++++++++++++++++++++++++++++++
> 2 files changed, 61 insertions(+)

Now that Linux 6.1-rc1 is out, should this patch be looked at for
inclusion? I see that the net-next tree has opened, not sure if the
same rules apply here.

We've been running the v2 version of this patch in production
successfully for some time now.