Re: [PATCH -next v2 6/6] ext4: fix possible store wrong check interval value in disk when umount

From: yebin
Date: Thu Oct 07 2021 - 23:49:18 EST




On 2021/10/7 21:12, Jan Kara wrote:
On Sat 11-09-21 17:00:59, Ye Bin wrote:
Test follow steps:
1. mkfs.ext4 /dev/sda -O mmp
2. mount /dev/sda /mnt
3. wait for about 1 minute
4. umount mnt
5. debugfs /dev/sda
6. dump_mmp
7. fsck.ext4 /dev/sda

I found 'check_interval' is range in [5, 10]. And sometime run fsck
print "MMP interval is 10 seconds and total wait time is 42 seconds.
Please wait...".
kmmpd:
...
if (diff < mmp_update_interval * HZ)
schedule_timeout_interruptible(mmp_update_interval * HZ - diff);
diff = jiffies - last_update_time;
...
mmp_check_interval = max(min(EXT4_MMP_CHECK_MULT * diff / HZ,
EXT4_MMP_MAX_CHECK_INTERVAL),
EXT4_MMP_MIN_CHECK_INTERVAL);
mmp->mmp_check_interval = cpu_to_le16(mmp_check_interval);
...
We will call ext4_stop_mmpd to stop kmmpd kthread when umount, and
schedule_timeout_interruptible will be interrupted, so 'diff' maybe
little than mmp_update_interval. Then mmp_check_interval will range
in [EXT4_MMP_MAX_CHECK_INTERVAL, EXT4_MMP_CHECK_MULT * diff / HZ].
To solve this issue, if 'diff' little then mmp_update_interval * HZ
just break loop, don't update check interval.

Signed-off-by: Ye Bin <yebin10@xxxxxxxxxx>
---
fs/ext4/mmp.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c
index a0d47a906faa..f39e1fa0c6db 100644
--- a/fs/ext4/mmp.c
+++ b/fs/ext4/mmp.c
@@ -205,6 +205,14 @@ static int kmmpd(void *data)
schedule_timeout_interruptible(mmp_update_interval *
HZ - diff);
diff = jiffies - last_update_time;
+ /* If 'diff' little 'than mmp_update_interval * HZ', it
+ * means someone call ext4_stop_mmpd to stop kmmpd
+ * kthread. We don't need to update mmp_check_interval
+ * any more, as 'diff' is not exact value.
+ */
+ if (unlikely(diff < mmp_update_interval * HZ &&
+ kthread_should_stop()))
+ break;
}
So in this case, mmp_check_interval would be EXT4_MMP_MIN_CHECK_INTERVAL. I
don't quite understand what the practical problem is - the fsck message?
That will happen anytime mmp_check_interval is >= 10 AFAICT and I don't
quite see how that is connected to this condition... Can you explain a bit
more please?

Honza
I just think 'mmp_check_interval' is not reflect real check interval, and also sometime run fsck
print "MMP interval is 10 seconds and total wait time is 42 seconds. Please wait...", but
sometime not.