[PATCH] f2fs: quota: fix potential deadlock

From: Chao Yu
Date: Thu Sep 02 2021 - 23:03:06 EST


As Yi Zhuang reported in bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=214299

There is a potential deadlock during quota data flush, as shown below:

Thread A:                       Thread B:
f2fs_dquot_acquire
down_read(&sbi->quota_sem)
                                f2fs_write_checkpoint
                                block_operations
                                f2fs_lock_all
                                down_write(&sbi->cp_rwsem)
f2fs_quota_write
f2fs_write_begin
__do_map_lock
f2fs_lock_op
down_read(&sbi->cp_rwsem)
                                __need_flush_quota
                                down_write(&sbi->quota_sem)

This patch changes __need_flush_quota(), which is called from
block_operations(), to use down_write_trylock(). If the trylock fails,
there may be a concurrent quota data updater. In that case, flush quota
data first and then trylock again to check the dirty status of quota
data.

The side effect is that under heavy contention (e.g. multiple quota
data updaters racing with the quota data flusher), checkpoint() may be
less likely to synchronize quota data successfully, because the number
of quota flush retries is limited.

Reported-by: Yi Zhuang <zhuangyi1@xxxxxxxxxx>
Signed-off-by: Chao Yu <chao@xxxxxxxxxx>
---
fs/f2fs/checkpoint.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 7d8803a4cbc2..6f6a7d812d60 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1159,7 +1159,8 @@ static bool __need_flush_quota(struct f2fs_sb_info *sbi)
 	if (!is_journalled_quota(sbi))
 		return false;

-	down_write(&sbi->quota_sem);
+	if (!down_write_trylock(&sbi->quota_sem))
+		return true;
 	if (is_sbi_flag_set(sbi, SBI_QUOTA_SKIP_FLUSH)) {
 		ret = false;
 	} else if (is_sbi_flag_set(sbi, SBI_QUOTA_NEED_REPAIR)) {
--
2.32.0