Re: [dm-devel] [PATCH -next v2 4/6] md: refactor idle/frozen_sync_thread() to fix deadlock

From: Yu Kuai
Date: Wed Jun 14 2023 - 03:38:40 EST


Hi,

在 2023/06/14 15:12, Xiao Ni 写道:
On Wed, Jun 14, 2023 at 10:04 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

Hi,

在 2023/06/14 9:48, Yu Kuai 写道:



In the patch, sync_seq is added in md_reap_sync_thread. In
idle_sync_thread, if sync_seq isn't equal

mddev->sync_seq, it should mean there is someone that stops the sync
thread already, right? Why do

you say 'new started sync thread' here?

If someone stops the sync thread, and new sync thread is not started,
then this sync_seq won't make a difference, above wait_event() will not
wait because !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) will pass.
So 'sync_seq' is only used when the old sync thread stops and new sync
thread starts, add 'sync_seq' will bypass this case.

Hi

If a new sync thread starts, why can sync_seq be different? sync_seq
is only added in md_reap_sync_thread. And when a new sync request
starts, it can't stop the sync request again?

Af first, the sync_seq is 0

admin1
echo idle > sync_action
idle_sync_thread(sync_seq is 1)

Wait, I'm confused here, how can sync_seq to be 1 here? I suppose you
mean that there is a sync_thread just finished?

Then the problem is that idle_sync_thread() read sync_seq after the old
sync_thread is done, and new sync_thread start before wait_event() is
called, should we wait for this new sync_thread?

My answer here is that we should, but I'm also ok to not wait this new
sync_thread, I don't think this behaviour matters. The key point here
is that once wait_event() is called from idle_sync_thread(), this
wait_event() should not wait for new sync_thread...

echo resync > sync_action (new sync)

If this is behind "echo idle > sync_action", idle_sync_thread should not
see that MD_RECOVERY_RUNNING is set and wait_event() won't wait at all.

Thanks,
Kuai

Then admin2 echos idle > sync_action, sync_seq is still 1

Regards
Xiao


Thanks,
Kuai


.