Re: [PATCH] mm: shrinkers: fix deadlock in shrinker debugfs

From: Roman Gushchin
Date: Fri Feb 03 2023 - 12:47:07 EST


On Thu, Feb 02, 2023 at 06:56:12PM +0800, Qi Zheng wrote:
> debugfs_remove_recursive() is invoked by unregister_shrinker()
> while the latter holds shrinker_rwsem for writing. It waits for
> the debugfs file handler to complete. The handler, in turn, needs
> to take shrinker_rwsem for reading to do its work. So it may
> cause the following deadlock:
>
>  CPU0                            CPU1
>
> debugfs_file_get()
> shrinker_debugfs_count_show()/shrinker_debugfs_scan_write()
>
>                                 unregister_shrinker()
>                                 --> down_write(&shrinker_rwsem);
>                                     debugfs_remove_recursive()
>                                         // wait for (A)
>                                     --> wait_for_completion();
>
>     // wait for (B)
> --> down_read_killable(&shrinker_rwsem)
> debugfs_file_put() -- (A)
>
>                                     up_write() -- (B)
>
> Because down_read_killable() can be killed, the above deadlock is
> recoverable. But recovery still requires an extra kill action;
> until then, all subsequent shrinker-related operations are blocked,
> so it's better to fix it.

Oh, indeed, great catch!
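
For anyone following along, the cycle in the diagram above can be
modeled in user space. This is only a rough sketch, not kernel code:
the enum, has_wait_cycle() and deadlock_possible() are made-up names
for illustration, and the remove_under_lock == false case assumes one
possible reordering (calling debugfs_remove_recursive() only after
up_write()), which is not necessarily what the patch does.

```c
#include <stdbool.h>

/* Hypothetical model of the two tasks from the diagram. */
enum task { HANDLER, UNREGISTER, NTASKS };

/*
 * waits_for[t] is the task that t is blocked on, or -1 if t can run.
 * Returns true if following the "blocked on" edges from any task
 * leads back to that same task, i.e. the tasks form a wait cycle.
 */
bool has_wait_cycle(const int waits_for[NTASKS])
{
    for (int start = 0; start < NTASKS; start++) {
        int cur = start;

        for (int hops = 0; hops < NTASKS && cur != -1; hops++) {
            cur = waits_for[cur];
            if (cur == start)
                return true;
        }
    }
    return false;
}

/*
 * remove_under_lock == true models the reported case: the writer
 * waits for the debugfs handler to finish while still holding
 * shrinker_rwsem. false models the assumed reordering in which the
 * wait happens only after the write lock has been dropped.
 */
bool deadlock_possible(bool remove_under_lock)
{
    int waits_for[NTASKS];

    /* The handler holds a debugfs reference and wants the read lock. */
    waits_for[HANDLER] = UNREGISTER;

    /* The writer blocks on the handler only if it waits under the lock. */
    waits_for[UNREGISTER] = remove_under_lock ? HANDLER : -1;

    return has_wait_cycle(waits_for);
}
```

deadlock_possible(true) reports the cycle from the diagram, while
deadlock_possible(false) reports none, because the writer no longer
waits on the handler while it holds the lock.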

With Andrew's fixup:
Reviewed-by: Roman Gushchin <roman.gushchin@xxxxxxxxx>

Thank you!