Re: [PATCH -next v5 6/6] md: protect md_thread with rcu

From: Yu Kuai
Date: Mon Apr 10 2023 - 21:08:27 EST


Hi,

On 2023/04/10 23:42, Logan Gunthorpe wrote:


On 2023-04-10 05:35, Yu Kuai wrote:
From: Yu Kuai <yukuai3@xxxxxxxxxx>

Our test reports a uaf for 'mddev->sync_thread':

T1                              T2
md_start_sync
 md_register_thread
 // mddev->sync_thread is set
                                raid1d
                                 md_check_recovery
                                  md_reap_sync_thread
                                   md_unregister_thread
                                    kfree

 md_wakeup_thread
  wake_up
  ->sync_thread was freed

The root cause is a small window between registering the thread and
waking it up, during which the thread can be freed concurrently.

Currently, a global spinlock 'pers_lock' is borrowed to protect
'mddev->thread'. This problem could be fixed the same way; however, there
might be similar problems elsewhere, and using a global lock for all of
these cases is not good.

This patch protects md_thread with RCU.
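
For readers who want to see the idea outside the kernel, here is a minimal user-space sketch of the pattern described above. It is not the patch itself. The toy_* names are hypothetical, and the reader counter is a simple stand-in for kernel RCU, assumed only for illustration. The point it shows: the waker only touches the thread inside a read-side section, and the unregister path unpublishes the pointer and waits for readers before freeing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins for rcu_read_lock()/rcu_read_unlock()/synchronize_rcu().
 * Assumption for illustration only; this is NOT how kernel RCU works. */
static atomic_int toy_readers;

void toy_read_lock(void)   { atomic_fetch_add(&toy_readers, 1); }
void toy_read_unlock(void) { atomic_fetch_sub(&toy_readers, 1); }

/* Spin until no reader is inside a read-side section. */
void toy_synchronize(void)
{
	while (atomic_load(&toy_readers) != 0)
		;
}

struct toy_thread { atomic_int wakeups; };
struct toy_mddev  { _Atomic(struct toy_thread *) sync_thread; };

/* Publish a new thread object (models md_register_thread). */
bool toy_register_thread(struct toy_mddev *mddev)
{
	struct toy_thread *t = malloc(sizeof(*t));

	if (!t)
		return false;
	atomic_init(&t->wakeups, 0);
	atomic_store(&mddev->sync_thread, t);
	return true;
}

/* Models md_wakeup_thread: the pointer is only used under the read lock. */
bool toy_wakeup_thread(struct toy_mddev *mddev)
{
	struct toy_thread *t;
	bool woken = false;

	toy_read_lock();
	t = atomic_load(&mddev->sync_thread);
	if (t) {
		atomic_fetch_add(&t->wakeups, 1);
		woken = true;
	}
	toy_read_unlock();
	return woken;
}

/* Models md_unregister_thread: unpublish, wait for readers, then free. */
void toy_unregister_thread(struct toy_mddev *mddev)
{
	struct toy_thread *t = atomic_exchange(&mddev->sync_thread, NULL);

	if (!t)
		return;
	toy_synchronize();
	free(t);
}
```

With this ordering, a wakeup that races with unregister either sees NULL and does nothing, or sees the old pointer while the free is still held back by toy_synchronize().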

Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
---
drivers/md/md-bitmap.c | 29 ++++++++++++-----
drivers/md/md.c | 68 +++++++++++++++++++---------------------
drivers/md/md.h | 10 +++---
drivers/md/raid1.c | 4 +--
drivers/md/raid1.h | 2 +-
drivers/md/raid10.c | 10 ++++--
drivers/md/raid10.h | 2 +-
drivers/md/raid5-cache.c | 15 +++++----
drivers/md/raid5.c | 4 +--
drivers/md/raid5.h | 2 +-
10 files changed, 81 insertions(+), 65 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 29fd41ef55a6..b9baeea5605e 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -1219,15 +1219,27 @@ static bitmap_counter_t *md_bitmap_get_counter(struct bitmap_counts *bitmap,
int create);
 static void mddev_set_timeout(struct mddev *mddev, unsigned long timeout,
-			      bool force)
+			      bool force, bool protected)
 {
-	struct md_thread *thread = mddev->thread;
+	struct md_thread *thread;
+
+	if (!protected) {
+		rcu_read_lock();
+		thread = rcu_dereference(mddev->thread);
+	} else {
+		thread = rcu_dereference_protected(mddev->thread,
+				lockdep_is_held(&mddev->reconfig_mutex));
+	}

Why not just always use rcu_read_lock()? Even if it's safe with
reconfig_mutex, it wouldn't harm much and would make the code a bit less
ugly.

Of course, I'll do that in the next version.
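
A rough user-space sketch of what the single-path variant could look like, with the 'protected' flag dropped and every caller taking the read lock. The toy_* names, TOY_MAX_TIMEOUT and the update rule are assumptions for illustration; the quoted hunk does not show the rest of the function body, and the reader counter is only a stand-in for kernel RCU.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel RCU API. */
static atomic_int toy_readers;

void toy_rcu_read_lock(void)   { atomic_fetch_add(&toy_readers, 1); }
void toy_rcu_read_unlock(void) { atomic_fetch_sub(&toy_readers, 1); }

/* Assumed sentinel meaning "no timeout configured". */
#define TOY_MAX_TIMEOUT ((unsigned long)LONG_MAX)

struct toy_thread { unsigned long timeout; };
struct toy_mddev  { _Atomic(struct toy_thread *) thread; };

/* One code path for all callers: always enter the read-side section,
 * whether or not the caller also holds a mutex. */
void toy_set_timeout(struct toy_mddev *mddev, unsigned long timeout,
		     bool force)
{
	struct toy_thread *thread;

	toy_rcu_read_lock();
	thread = atomic_load(&mddev->thread);
	/* Assumed update rule; not taken from the quoted hunk. */
	if (thread && (force || thread->timeout < TOY_MAX_TIMEOUT))
		thread->timeout = timeout;
	toy_rcu_read_unlock();
}
```

The cost is one extra read-side section for callers that already hold the mutex, in exchange for removing the branch and the extra parameter.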


@@ -458,8 +454,10 @@ static void md_submit_bio(struct bio *bio)
  */
 void mddev_suspend(struct mddev *mddev)
 {
-	WARN_ON_ONCE(mddev->thread && current == mddev->thread->tsk);
-	lockdep_assert_held(&mddev->reconfig_mutex);
+	struct md_thread *thread = rcu_dereference_protected(mddev->thread,
+			lockdep_is_held(&mddev->reconfig_mutex));

Do we know that reconfig_mutex is always held when we call
md_unregister_thread()? Seems plausible, but maybe it's worth adding a
lockdep_assert_held() to md_unregister_thread().

Unfortunately this is not true for now: md_unregister_thread() can be
called without this mutex from action_store(), which is problematic.
I'm trying to revert this change in the other thread:

md: fix that MD_RECOVERY_RUNNING can be cleared while sync_thread is
still running.

I think it's not good to add lockdep_assert_held() for now...

Thanks,
Kuai

Thanks,

Logan