[PATCH -next v3 00/25] md: synchronize io with array reconfiguration

From: Yu Kuai
Date: Thu Sep 28 2023 - 02:22:06 EST


From: Yu Kuai <yukuai3@xxxxxxxxxx>

Changes in v3:
- rebase with latest md-next;
- remove patch 2 from v2, and replace it with a new patch;
- fix a null-ptr-derefrence in rdev_attr_store() that mddev is used
before checking;
- merge patch 20-22 from v1 into one patch;
- mddev_lock() used to be called first and can be interruptted, allow new
api, which is called before mddev_lock() now, to be interruptted as well;
- improve some comments and coding;

Changes in v2:
- rebase with latest md-next;
- remove some follow up cleanup patches, these patches will be sent
later after this patchset.

After previous four patchset of preparatory work, this patchset impelement
a new version of mddev_suspend(), the new apis:
- reconfig_mutex is not required;
- the weird logical that suspend array hold 'reconfig_mutex' for
mddev_check_recovery() to update superblock is not needed;
- the special handling, 'pers->prepare_suspend', for raid456 is not
needed;
- It's safe to be called at any time once mddev is allocated, and it's
designed to be used from slow path where array configuration is changed;

And use the new api to replace:

mddev_lock
mddev_suspend or not
// array reconfiguration
mddev_resume or not
mddev_unlock

With:

mddev_suspend
mddev_lock
// array reconfiguration
mddev_unlock
mddev_resume

However, the above change is not possible for raid5 and raid-cluster in
some corner cases, and mddev_suspend/resume() is replaced with quiesce()
callback, which will suspend the array as well.

This patchset is tested in my VM with mdadm testsuite with loop device
except for 10ddf tests(they always fail before this patchset).

A lot of cleanups will be started after this patchset.

Yu Kuai (25):
md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'
md: replace is_md_suspended() with 'mddev->suspended' in
md_check_recovery()
md: add new helpers to suspend/resume array
md: add new helpers to suspend/resume and lock/unlock array
md: use new apis to suspend array for suspend_lo/hi_store()
md: use new apis to suspend array for level_store()
md: use new apis to suspend array for serialize_policy_store()
md/dm-raid: use new apis to suspend array
md/md-bitmap: use new apis to suspend array for location_store()
md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'
md/raid5-cache: use new apis to suspend array for
r5c_disable_writeback_async()
md/raid5-cache: use new apis to suspend array for
r5c_journal_mode_store()
md/raid5: use new apis to suspend array for raid5_store_stripe_size()
md/raid5: use new apis to suspend array for raid5_store_skip_copy()
md/raid5: use new apis to suspend array for
raid5_store_group_thread_cnt()
md/raid5: use new apis to suspend array for
raid5_change_consistency_policy()
md/raid5: replace suspend with quiesce() callback
md: use new apis to suspend array for ioctls involed array
reconfiguration
md: use new apis to suspend array for adding/removing rdev from
state_store()
md: use new apis to suspend array before
mddev_create/destroy_serial_pool
md: cleanup mddev_create/destroy_serial_pool()
md/md-linear: cleanup linear_add()
md: suspend array in md_start_sync() if array need reconfiguration
md: remove old apis to suspend the array
md: rename __mddev_suspend/resume() back to mddev_suspend/resume()

drivers/md/dm-raid.c | 10 +-
drivers/md/md-autodetect.c | 4 +-
drivers/md/md-bitmap.c | 18 ++-
drivers/md/md-linear.c | 2 -
drivers/md/md.c | 233 ++++++++++++++++++++-----------------
drivers/md/md.h | 43 +++++--
drivers/md/raid5-cache.c | 64 +++++-----
drivers/md/raid5.c | 56 ++++-----
8 files changed, 226 insertions(+), 204 deletions(-)

--
2.39.2