Re: Re: [External] : nfsd: memory leak when client does many file operations

From: Jan Schunk
Date: Tue Mar 26 2024 - 15:07:26 EST


Thanks. Yes, this was a packaged kernel; I will try it with my own build later.

During an earlier test run I occasionally saved slabinfo to a file. On kernel 6.6.x I can see that nfsd_file <active_objs> and <num_objs> both grow from 72 to 324 within about 14 hours. I cannot compare this with older kernels, since their slabinfo has no nfsd_file entry.

top - 00:49:49 up 3 min, 1 user, load average: 0,21, 0,19, 0,09
Tasks: 111 total, 1 running, 110 sleeping, 0 stopped, 0 zombie
%CPU(s): 0,2 us, 0,3 sy, 0,0 ni, 99,5 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
MiB Spch: 467,0 total, 302,3 free, 89,3 used, 88,1 buff/cache
MiB Swap: 975,0 total, 975,0 free, 0,0 used. 377,7 avail Spch

slabinfo
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nfsd_file 72 72 112 36 1 : tunables 0 0 0 : slabdata 2 2 0

top - 15:05:39 up 14:19, 1 user, load average: 1,87, 1,72, 1,65
Tasks: 104 total, 1 running, 103 sleeping, 0 stopped, 0 zombie
%CPU(s): 0,2 us, 4,9 sy, 0,0 ni, 53,3 id, 39,0 wa, 0,0 hi, 2,6 si, 0,0 st
MiB Spch: 467,0 total, 21,2 free, 147,1 used, 310,9 buff/cache
MiB Swap: 975,0 total, 952,9 free, 22,1 used. 319,9 avail Spch

slabinfo
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nfsd_file 324 324 112 36 1 : tunables 0 0 0 : slabdata 9 9 0
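
To put numbers on the two snapshots above, here is a minimal sketch that parses the nfsd_file lines and computes the growth. The helper names (SlabLine, parse_slab_line, growth) are only illustrative, and the elapsed time of 14 h 16 min is taken from the two uptime values shown by top.

```python
from dataclasses import dataclass


@dataclass
class SlabLine:
    """First six columns of one /proc/slabinfo line."""
    name: str
    active_objs: int
    num_objs: int
    objsize: int       # bytes per object
    objperslab: int
    pagesperslab: int


def parse_slab_line(line: str) -> SlabLine:
    # Only the columns before the first ':' are needed here.
    fields = line.split()
    return SlabLine(fields[0], *(int(f) for f in fields[1:6]))


def growth(before: SlabLine, after: SlabLine, hours: float) -> dict:
    # Difference in allocated objects, the memory those objects occupy,
    # and the average growth rate over the observed interval.
    delta_objs = after.num_objs - before.num_objs
    return {
        "delta_objs": delta_objs,
        "delta_bytes": delta_objs * after.objsize,
        "objs_per_hour": delta_objs / hours,
    }


before = parse_slab_line("nfsd_file 72 72 112 36 1 : tunables 0 0 0 : slabdata 2 2 0")
after = parse_slab_line("nfsd_file 324 324 112 36 1 : tunables 0 0 0 : slabdata 9 9 0")

# Uptime went from 3 min to 14:19, i.e. 14 h 16 min between the snapshots.
result = growth(before, after, hours=14 + 16 / 60)
print(result)
```

For these two snapshots this gives 252 additional objects, which at 112 bytes each is 28224 bytes (about 27.6 KiB) of object memory in the nfsd_file cache itself.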


> Sent: Tuesday, 26.03.2024 at 18:15
> From: "Benjamin Coddington" <bcodding@xxxxxxxxxx>
> To: "Jan Schunk" <scpcom@xxxxxx>
> Cc: "Chuck Lever III" <chuck.lever@xxxxxxxxxx>, "Jeff Layton" <jlayton@xxxxxxxxxx>, "Neil Brown" <neilb@xxxxxxx>, "Olga Kornievskaia" <kolga@xxxxxxxxxx>, "Dai Ngo" <dai.ngo@xxxxxxxxxx>, "Tom Talpey" <tom@xxxxxxxxxx>, "Linux NFS Mailing List" <linux-nfs@xxxxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [External] : nfsd: memory leak when client does many file operations
>
> On 26 Mar 2024, at 13:13, Benjamin Coddington wrote:
>
> > On 26 Mar 2024, at 13:04, Jan Schunk wrote:
> >
> >> Before I start doing this on my own build I tried it with unmodified linux-image-6.6.13+bpo-amd64 from Debian 12.
> >> I installed systemtap, linux-headers-6.6.13+bpo-amd64 and linux-image-6.6.13+bpo-amd64-dbg and tried to run stap:
> >>
> >> user@deb:~$ sudo stap -v --all-modules kmem_alloc.stp nfsd_file
> >> WARNING: Kernel function symbol table missing [man warning::symbols]
> >> Pass 1: parsed user script and 484 library scripts using 110120virt/96896res/7168shr/89800data kb, in 1360usr/1080sys/4963real ms.
> >> WARNING: cannot find module kernel debuginfo: No DWARF information found [man warning::debuginfo]
> >> semantic error: resolution failed in DWARF builder
> >>
> >> semantic error: while resolving probe point: identifier 'kernel' at kmem_alloc.stp:5:7
> >> source: probe kernel.function("kmem_cache_alloc") {
> >> ^
> >>
> >> semantic error: no match
> >>
> >> Pass 2: analyzed script: 1 probe, 5 functions, 1 embed, 3 globals using 112132virt/100352res/8704shr/91792data kb, in 30usr/30sys/167real ms.
> >> Pass 2: analysis failed. [man error::pass2]
> >> Tip: /usr/share/doc/systemtap/README.Debian should help you get started.
> >> user@deb:~$
> >>
> >> user@deb:~$ grep -E 'CONFIG_DEBUG_INFO|CONFIG_KPROBES|CONFIG_DEBUG_FS|CONFIG_RELAY' /boot/config-6.6.13+bpo-amd64
> >> CONFIG_RELAY=y
> >> CONFIG_KPROBES=y
> >> CONFIG_KPROBES_ON_FTRACE=y
> >> CONFIG_DEBUG_INFO=y
> >> # CONFIG_DEBUG_INFO_NONE is not set
> >> CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
> >> # CONFIG_DEBUG_INFO_DWARF4 is not set
> >> # CONFIG_DEBUG_INFO_DWARF5 is not set
> >> # CONFIG_DEBUG_INFO_REDUCED is not set
> >> CONFIG_DEBUG_INFO_COMPRESSED_NONE=y
> >> # CONFIG_DEBUG_INFO_COMPRESSED_ZLIB is not set
> >> # CONFIG_DEBUG_INFO_SPLIT is not set
> >> CONFIG_DEBUG_INFO_BTF=y
> >> CONFIG_DEBUG_INFO_BTF_MODULES=y
> >> CONFIG_DEBUG_FS=y
> >> CONFIG_DEBUG_FS_ALLOW_ALL=y
> >> # CONFIG_DEBUG_FS_DISALLOW_MOUNT is not set
> >> # CONFIG_DEBUG_FS_ALLOW_NONE is not set
> >> user@deb:~$
> >>
> >> Do I need to enable other options?
> >
> > You should just need DEBUG_INFO. Maybe stap can't find it? You can try adding: -r /path/to/the/kernel/build
>
> Oh, never mind - you're using a packaged kernel. I'm not familiar with the packaging requirements for systemtap on Debian.
>
> Ben
>
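
Since stap reported "No DWARF information found", one thing that can be checked before rebuilding is whether an unstripped vmlinux for the running kernel is present at all. The following is only a sketch: the candidate paths are assumptions about where a debug package or a local build tree might place the file, not a list taken from the systemtap or Debian documentation, and find_vmlinux is a hypothetical helper.

```python
from pathlib import Path

# Assumed candidate locations, relative to the filesystem root.
# These are guesses for illustration; adjust them for the actual system.
CANDIDATES = (
    "usr/lib/debug/boot/vmlinux-{release}",
    "usr/lib/debug/lib/modules/{release}/vmlinux",
    "lib/modules/{release}/build/vmlinux",
)


def find_vmlinux(release: str, root: str = "/") -> list[Path]:
    """Return every candidate vmlinux path that exists for this kernel release."""
    hits = []
    for pattern in CANDIDATES:
        path = Path(root) / pattern.format(release=release)
        if path.is_file():
            hits.append(path)
    return hits


if __name__ == "__main__":
    import platform

    release = platform.release()  # same string as `uname -r`
    found = find_vmlinux(release)
    if found:
        for path in found:
            print(f"found: {path}")
    else:
        print(f"no vmlinux with debug info found for {release}")
```

If nothing is found, the debug package is probably missing or does not match the running kernel version. If a file is found but stap still fails, the problem more likely lies in how stap searches for it.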