Deadlock in md

From: Daniel Santos
Date: Wed Aug 29 2018 - 02:32:06 EST


I have not rebooted my system since recovering my data off of an old
raid5 array with an external journal, which broke after a crash while
in write-back mode (https://www.spinics.net/lists/raid/msg61331.html),
and I noticed this in my kernel log. I had the array assembled in
read-only mode without a journal. When I was done retrieving my data, I
stopped the array and zeroed the first 1GiB of each raid member. After
that I modified the partition tables of the disks, but fdisk reported
my /dev/sdb to be in use by the kernel, as did partprobe, presumably
due to a reference leaked by the deadlock.
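
In case it helps anybody else chasing the same symptom: the "in use"
condition that fdisk and partprobe hit can be checked for directly with
the BLKRRPART ioctl (asking the kernel to re-read the partition table,
which is roughly what fdisk does after writing one). A minimal,
untested sketch; the device path is just my setup:

/*
 * Ask the kernel to re-read /dev/sdb's partition table; EBUSY here
 * suggests something in the kernel still holds a reference to the
 * disk. Untested sketch, device path hard-coded for my setup.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>		/* BLKRRPART */

int main(void)
{
	int fd = open("/dev/sdb", O_RDONLY);

	if (fd < 0) {
		perror("open /dev/sdb");
		return 1;
	}
	if (ioctl(fd, BLKRRPART, NULL) < 0 && errno == EBUSY)
		fprintf(stderr, "/dev/sdb is still in use by the kernel\n");
	close(fd);
	return 0;
}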

This is v4.14.67 with an assortment of debugging features enabled, but
I didn't have kmemleak on because I was copying a lot of files (it
triggered too many "soft lockups" when it ran).

[203283.385588] md0: detected capacity change from 8796094070784 to 0
[203283.385610] md: md0 stopped.

[203283.430869] ======================================================
[203283.430870] WARNING: possible circular locking dependency detected
[203283.430872] 4.14.67-gentoo-debug #3 Not tainted
[203283.430873] ------------------------------------------------------
[203283.430875] kworker/13:1/480 is trying to acquire lock:
[203283.430876]  (kn->count#82){++++}, at: [<ffffffffb75c8a3a>] kernfs_remove+0x1a/0x30
[203283.430883]
                but task is already holding lock:
[203283.430885]  ((&mddev->del_work)){+.+.}, at: [<ffffffffb72b7125>] process_one_work+0x255/0x740
[203283.430889]
                which lock already depends on the new lock.

[203283.430891]
                the existing dependency chain (in reverse order) is:
[203283.430892]
                -> #5 ((&mddev->del_work)){+.+.}:
[203283.430896]        lock_acquire+0xa6/0x160
[203283.430898]        process_one_work+0x2af/0x740
[203283.430900]        worker_thread+0x43/0x690
[203283.430902]        kthread+0x18f/0x260
[203283.430905]        ret_from_fork+0x27/0x50
[203283.430906]
                -> #4 ("md_misc"){+.+.}:
[203283.430910]        lock_acquire+0xa6/0x160
[203283.430911]        flush_workqueue+0xa6/0xa00
[203283.430914]        __md_stop_writes+0x19/0x130
[203283.430916]        do_md_stop+0x18d/0x700
[203283.430917]        md_ioctl+0xcfb/0x2290
[203283.430920]        blkdev_ioctl+0x6a8/0x1020
[203283.430922]        block_ioctl+0x52/0xa0
[203283.430924]        do_vfs_ioctl+0xc4/0x990
[203283.430926]        SyS_ioctl+0x91/0xa0
[203283.430928]        do_syscall_64+0x83/0x260
[203283.430930]        entry_SYSCALL_64_after_hwframe+0x42/0xb7
[203283.430931]
                -> #3 (&mddev->open_mutex){+.+.}:
[203283.430934]        lock_acquire+0xa6/0x160
[203283.430936]        __mutex_lock+0x74/0xc60
[203283.430938]        md_open+0x6c/0x140
[203283.430939]        __blkdev_get+0xce/0x560
[203283.430941]        blkdev_get+0x27f/0x500
[203283.430943]        do_dentry_open+0x293/0x4c0
[203283.430945]        path_openat+0x5a3/0xf30
[203283.430947]        do_filp_open+0xb7/0x150
[203283.430948]        do_sys_open+0x1d1/0x2c0
[203283.430950]        do_syscall_64+0x83/0x260
[203283.430952]        entry_SYSCALL_64_after_hwframe+0x42/0xb7
[203283.430953]
                -> #2 (&bdev->bd_mutex){+.+.}:
[203283.430956]        lock_acquire+0xa6/0x160
[203283.430957]        __mutex_lock+0x74/0xc60
[203283.430959]        __blkdev_get+0x54/0x560
[203283.430960]        blkdev_get+0x27f/0x500
[203283.430962]        blkdev_get_by_dev+0x38/0x60
[203283.430963]        lock_rdev+0x43/0xb0
[203283.430965]        md_import_device+0xf7/0x290
[203283.430966]        add_new_disk+0xe1/0x782
[203283.430968]        md_ioctl+0x1664/0x2290
[203283.430969]        blkdev_ioctl+0x6a8/0x1020
[203283.430971]        block_ioctl+0x52/0xa0
[203283.430972]        do_vfs_ioctl+0xc4/0x990
[203283.430974]        SyS_ioctl+0x91/0xa0
[203283.430976]        do_syscall_64+0x83/0x260
[203283.430978]        entry_SYSCALL_64_after_hwframe+0x42/0xb7
[203283.430978]
                -> #1 (&mddev->reconfig_mutex){+.+.}:
[203283.430982]        lock_acquire+0xa6/0x160
[203283.430983]        __mutex_lock+0x74/0xc60
[203283.430985]        array_state_store+0xcf/0x4e0
[203283.430986]        md_attr_store+0xa5/0x130
[203283.430988]        kernfs_fop_write+0x1ad/0x300
[203283.430990]        __vfs_write+0x64/0x2c0
[203283.430991]        vfs_write+0xf1/0x2e0
[203283.430993]        SyS_write+0x4e/0xc0
[203283.430994]        do_syscall_64+0x83/0x260
[203283.430996]        entry_SYSCALL_64_after_hwframe+0x42/0xb7
[203283.430997]
                -> #0 (kn->count#82){++++}:
[203283.431000]        __lock_acquire+0x14c7/0x1c50
[203283.431002]        lock_acquire+0xa6/0x160
[203283.431004]        __kernfs_remove+0x26d/0x390
[203283.431005]        kernfs_remove+0x1a/0x30
[203283.431008]        kobject_del.part.0+0x21/0x70
[203283.431010]        mddev_delayed_delete+0x1f/0x30
[203283.431012]        process_one_work+0x2bc/0x740
[203283.431013]        worker_thread+0x43/0x690
[203283.431015]        kthread+0x18f/0x260
[203283.431017]        ret_from_fork+0x27/0x50
[203283.431018]
                other info that might help us debug this:

[203283.431020] Chain exists of:
                  kn->count#82 --> "md_misc" --> (&mddev->del_work)

[203283.431023]  Possible unsafe locking scenario:

[203283.431025]        CPU0                    CPU1
[203283.431026]        ----                    ----
[203283.431027]   lock((&mddev->del_work));
[203283.431028]                               lock("md_misc");
[203283.431030]                               lock((&mddev->del_work));
[203283.431031]   lock(kn->count#82);
[203283.431033]
                 *** DEADLOCK ***

[203283.431035] 2 locks held by kworker/13:1/480:
[203283.431036]  #0:  ("md_misc"){+.+.}, at: [<ffffffffb72b7125>] process_one_work+0x255/0x740
[203283.431039]  #1:  ((&mddev->del_work)){+.+.}, at: [<ffffffffb72b7125>] process_one_work+0x255/0x740
[203283.431042]
                stack backtrace:
[203283.431045] CPU: 13 PID: 480 Comm: kworker/13:1 Not tainted 4.14.67-gentoo-debug #3
[203283.431046] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X470 Taichi Ultimate, BIOS P1.20 04/17/2018
[203283.431048] Workqueue: md_misc mddev_delayed_delete
[203283.431050] Call Trace:
[203283.431053]  dump_stack+0x86/0xcf
[203283.431056]  print_circular_bug.isra.16+0x24c/0x3e0
[203283.431058]  check_prev_add.constprop.24+0x541/0x7f0
[203283.431061]  __lock_acquire+0x14c7/0x1c50
[203283.431064]  lock_acquire+0xa6/0x160
[203283.431066]  ? kernfs_remove+0x1a/0x30
[203283.431068]  __kernfs_remove+0x26d/0x390
[203283.431070]  ? kernfs_remove+0x1a/0x30
[203283.431072]  ? sysfs_remove_dir+0x39/0x90
[203283.431074]  kernfs_remove+0x1a/0x30
[203283.431076]  kobject_del.part.0+0x21/0x70
[203283.431078]  mddev_delayed_delete+0x1f/0x30
[203283.431080]  process_one_work+0x2bc/0x740
[203283.431082]  ? process_one_work+0x255/0x740
[203283.431084]  ? worker_thread+0xef/0x690
[203283.431086]  worker_thread+0x43/0x690
[203283.431088]  ? process_one_work+0x740/0x740
[203283.431090]  kthread+0x18f/0x260
[203283.431092]  ? kthread_create_on_node+0x40/0x40
[203283.431095]  ret_from_fork+0x27/0x50
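
I haven't tried to reproduce this deliberately, but if I'm reading the
chain right, the cycle is: a sysfs write to array_state pins the kernfs
node and then wants reconfig_mutex (#1), while stopping the array takes
open_mutex and then flushes md_misc (#4/#3), whose work item
(mddev_delayed_delete) wants to remove that same kernfs node (#0).
Something like the sketch below ought to hammer both paths at once; the
"readonly" write and the assumption that a bare STOP_ARRAY can win the
race are guesses on my part, and I haven't run this:

/*
 * Untested sketch: exercise both sides of the splat at once.
 * Thread: write array_state (array_state_store: kernfs active ref,
 * then reconfig_mutex). Main: STOP_ARRAY ioctl, i.e. what
 * "mdadm --stop" does (do_md_stop: open_mutex, then
 * flush_workqueue(md_misc), whose work item wants to remove the very
 * kernfs node the thread is holding active). Build with -lpthread.
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/raid/md_u.h>	/* STOP_ARRAY */

static void *poke_array_state(void *arg)
{
	(void)arg;
	for (;;) {
		int fd = open("/sys/block/md0/md/array_state", O_WRONLY);

		if (fd < 0)
			break;	/* node removed: teardown finished */
		/* Even a rejected store takes the locks we care about. */
		ssize_t n = write(fd, "readonly", 8);
		(void)n;
		close(fd);
	}
	return NULL;
}

int main(void)
{
	pthread_t t;
	int fd = open("/dev/md0", O_RDONLY);

	if (fd < 0) {
		perror("open /dev/md0");
		return 1;
	}
	pthread_create(&t, NULL, poke_array_state, NULL);
	if (ioctl(fd, STOP_ARRAY, NULL) < 0)	/* do_md_stop() path */
		perror("ioctl STOP_ARRAY");
	close(fd);
	pthread_join(t, NULL);
	return 0;
}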

Regards,
Daniel