Re: BUG: kworker + systemd-udevd memory leaks found in 6.1.0-rc4

From: Mirsad Goran Todorovac
Date: Tue Nov 29 2022 - 04:59:40 EST


On 29.11.2022. 9:36, Greg KH wrote:
On Tue, Nov 29, 2022 at 04:35:10AM +0100, Mirsad Goran Todorovac wrote:
On 10. 11. 2022. 10:20, Greg KH wrote:
On Thu, Nov 10, 2022 at 05:57:57AM +0100, Mirsad Goran Todorovac wrote:
On 04. 11. 2022. 11:40, Mirsad Goran Todorovac wrote:

Dear Sirs,

When building a RPM 6.1.0-rc3 for AlmaLinux 8.6, I have enabled
CONFIG_DEBUG_KMEMLEAK=y
and the result showed an unreferenced object in kworker process:

# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffffa01dabff6100 (size 16):
  comm "kworker/u12:4", pid 400, jiffies 4294894771 (age 5284.956s)
  hex dump (first 16 bytes):
    6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00 memstick0.......
  backtrace:
    [<000000009ff951f6>] __kmem_cache_alloc_node+0x380/0x4e0
    [<00000000451f4268>] __kmalloc_node_track_caller+0x55/0x150
    [<0000000005472512>] kstrdup+0x36/0x70
    [<000000002f797ac4>] kstrdup_const+0x28/0x30
    [<00000000e3f86581>] kvasprintf_const+0x78/0xa0
    [<00000000e15920f7>] kobject_set_name_vargs+0x23/0xa0
    [<000000004158a6c0>] dev_set_name+0x53/0x70
    [<000000001a120541>] memstick_check+0xff/0x384 [memstick]
    [<00000000122bb894>] process_one_work+0x214/0x3f0
    [<00000000fcf282cc>] worker_thread+0x34/0x3d0
    [<0000000002409855>] kthread+0xed/0x120
    [<000000007b02b4a3>] ret_from_fork+0x1f/0x30
unreferenced object 0xffffa01dabff6ec0 (size 16):
  comm "kworker/u12:4", pid 400, jiffies 4294894774 (age 5284.944s)
  hex dump (first 16 bytes):
    6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00 memstick0.......
  backtrace:
    [<000000009ff951f6>] __kmem_cache_alloc_node+0x380/0x4e0
    [<00000000451f4268>] __kmalloc_node_track_caller+0x55/0x150
    [<0000000005472512>] kstrdup+0x36/0x70
    [<000000002f797ac4>] kstrdup_const+0x28/0x30
    [<00000000e3f86581>] kvasprintf_const+0x78/0xa0
    [<00000000e15920f7>] kobject_set_name_vargs+0x23/0xa0
    [<000000004158a6c0>] dev_set_name+0x53/0x70
    [<000000001a120541>] memstick_check+0xff/0x384 [memstick]
    [<00000000122bb894>] process_one_work+0x214/0x3f0
    [<00000000fcf282cc>] worker_thread+0x34/0x3d0
    [<0000000002409855>] kthread+0xed/0x120
    [<000000007b02b4a3>] ret_from_fork+0x1f/0x30
#
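
When a report like the one above grows long, a per-task count makes it easier to see which process the unreferenced objects are attributed to. The following is only a sketch: the function name `kmemleak_by_comm` is hypothetical, and it assumes the report keeps the `comm "<name>"` line format shown above.

```shell
# Hypothetical helper: read a kmemleak report on stdin and print
# "<count> <task name>" for every task that owns unreferenced objects.
kmemleak_by_comm() {
    # Split on double quotes so that $2 is the task name from: comm "name", pid ...
    awk -F'"' '/^[ \t]*comm "/ { count[$2]++ }
               END { for (c in count) printf "%d %s\n", count[c], c }' | sort -k2
}
```

Typical use, as root with debugfs mounted, would be `echo scan > /sys/kernel/debug/kmemleak` to trigger an immediate scan, followed by `kmemleak_by_comm < /sys/kernel/debug/kmemleak`.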

Please find the build config and lshw output attached.

dmesg is useless, as it is filled with events like:

[ 6068.996120] evbug: Event. Dev: input4, Type: 1, Code: 31, Value: 0
[ 6068.996121] evbug: Event. Dev: input4, Type: 0, Code: 0, Value: 0
[ 6069.124145] evbug: Event. Dev: input4, Type: 4, Code: 4, Value: 458762
[ 6069.124149] evbug: Event. Dev: input4, Type: 1, Code: 34, Value: 1
[ 6069.124150] evbug: Event. Dev: input4, Type: 0, Code: 0, Value: 0
[ 6069.196003] evbug: Event. Dev: input4, Type: 4, Code: 4, Value: 458762
[ 6069.196007] evbug: Event. Dev: input4, Type: 1, Code: 34, Value: 0
[ 6069.196009] evbug: Event. Dev: input4, Type: 0, Code: 0, Value: 0
[ 6069.788129] evbug: Event. Dev: input4, Type: 4, Code: 4, Value: 458792
[ 6069.788133] evbug: Event. Dev: input4, Type: 1, Code: 28, Value: 1
[ 6069.788135] evbug: Event. Dev: input4, Type: 0, Code: 0, Value: 0
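
The evbug lines can be filtered out so that the rest of the log stays readable. This is a minimal sketch; the function name `dmesg_no_evbug` is hypothetical, and the pattern assumes the `evbug: Event. Dev:` prefix seen in the lines above.

```shell
# Hypothetical helper: drop evbug input-event messages from a kernel log on stdin.
dmesg_no_evbug() {
    grep -v 'evbug: Event\. Dev:'
}
```

It would be used as `dmesg | dmesg_no_evbug`. These messages come from the evbug input debugging driver (CONFIG_INPUT_EVBUG); assuming it was built as a module, `rmmod evbug` should stop them at the source.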

This bug is confirmed in 6.1-rc4, along with the "thermald" and "systemd-udevd"
kernel memory leaks, which potentially expose race conditions or other, more
serious bugs.

How is a memory leak a race condition?

The bug is now also confirmed in the Ubuntu 22.04 LTS (jammy) 6.1-rc4
build.

Here is the kmemleak output:

unreferenced object 0xffff9242b13b3980 (size 64):
  comm "kworker/5:3", pid 43106, jiffies 4305052439 (age 71828.792s)
  hex dump (first 32 bytes):
    80 8b a0 f0 42 92 ff ff 00 00 00 00 00 00 00 00 ....B...........
    20 86 a0 f0 42 92 ff ff 00 00 00 00 00 00 00 00 ...B...........
  backtrace:
    [<00000000c5dea4db>] __kmem_cache_alloc_node+0x380/0x4e0
    [<000000002b17af47>] kmalloc_node_trace+0x27/0xa0
    [<000000004c09eee5>] xhci_alloc_command+0x6e/0x180

This is a totally different backtrace from above, how are they related?

This looks like a potential xhci issue. Can you use 'git bisect' to
track down the offending change that caused this?
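
A bisect of this kind can be sketched as follows. The function name `bisect_leak` and its three arguments (a known-good revision, a known-bad revision, and a check script) are assumptions made for illustration; only `git bisect start` and `git bisect run` are standard git commands.

```shell
# Hypothetical helper: bisect between a known-good and a known-bad revision.
#   $1 = good revision, $2 = bad revision, $3 = path to a check script.
# The check script must exit 0 for a good revision and 1..124 for a bad one;
# exit code 125 tells git to skip a revision that cannot be tested.
bisect_leak() {
    git bisect start "$2" "$1" && git bisect run "$3"
}
```

For a kernel leak that only shows up after booting each candidate build, an automated check script is usually impractical. In that case each step is built, booted, checked by hand, and marked with `git bisect good` or `git bisect bad`, and `git bisect reset` returns to the original checkout at the end.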

thanks,

greg k-h

Hello, Greg, Thorsten!

After multiple attempts, my box's UEFI refuses to run pre-4.17 kernels.
The bisect shows the problem appeared before 4.17, so unless I find what is
causing the black screen when booting pre-4.17 kernels, it's a no-go ... :(

Ok, so I guess this has always been an issue, and is not a regression,
which is good. Can you work with the memstick developers to find a
solution?

Hi, Greg,

Of course, I will gladly cooperate with the memstick team.
I will CC: everyone with commits to the memstick driver; I hope that's not
too awkward.

So far, the Code of Conduct says to inform the maintainers about the bug.

BTW, the bug is confirmed as unfixed in 6.1-rc7:

# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffff93e548ab1e90 (size 16):
  comm "kworker/u12:5", pid 405, jiffies 4294894087 (age 65919.068s)
  hex dump (first 16 bytes):
    6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00 memstick0.......
  backtrace:
    [<00000000942f1553>] __kmem_cache_alloc_node+0x380/0x4e0
    [<00000000555b3e8a>] __kmalloc_node_track_caller+0x55/0x140
    [<000000000b60a98a>] kstrdup+0x36/0x70
    [<00000000f9a4a52a>] kstrdup_const+0x28/0x30
    [<000000005c5ca378>] kvasprintf_const+0x78/0xa0
    [<00000000b8f94e41>] kobject_set_name_vargs+0x23/0xa0
    [<00000000b7a2c8ea>] dev_set_name+0x53/0x70
    [<00000000291af717>] memstick_check+0xff/0x384 [memstick]
    [<000000007b776e48>] process_one_work+0x214/0x3f0
    [<000000005791f9b2>] worker_thread+0x34/0x3d0
    [<00000000df696ef8>] kthread+0xed/0x120
    [<0000000016f05dd5>] ret_from_fork+0x1f/0x30
#
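
To check whether two reports describe the same leak, it helps to compare backtraces by symbol name only, since the hashed addresses and code offsets differ between builds. The sketch below uses a hypothetical function name, `kmemleak_symbols`, and assumes the `[<address>] symbol+offset/size` frame format shown above.

```shell
# Hypothetical helper: print only the symbol name of each backtrace frame
# from a kmemleak report on stdin, dropping addresses, offsets and module tags.
kmemleak_symbols() {
    sed -n 's/^[[:space:]]*\[<[0-9a-f]*>\] \([^+]*\)+.*/\1/p'
}
```

Saving the output for two kernels to files and running `diff` on them shows whether the call chains match. Reduced this way, the 6.1-rc7 trace above gives the same symbol sequence as the 6.1.0-rc3 one, from `__kmem_cache_alloc_node` through `memstick_check` to `ret_from_fork`.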

Thanks,
Mirsad

--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union