Re: BUG: KCSAN: data-race in btrfs_calculate_inode_block_rsv_size [btrfs] / btrfs_use_block_rsv [btrfs] [EXPERIMENTAL PATCH]

From: Mirsad Todorovac
Date: Thu Sep 21 2023 - 15:55:03 EST


On 9/20/23 17:29, David Sterba wrote:
On Wed, Sep 20, 2023 at 08:18:35AM +0200, Mirsad Todorovac wrote:
Hi,

This is your friendly bug reporter again.

Please don't throw stuff at me, as I found another KCSAN data-race problem.

I feel like a boy who cried wolf ...

I hope this will get some attention, as it looks this time like a real btrfs problem that could cause
the kernel module to make a wrong turn when managing storage in different threads simultaneously and
lead to the corruption of data. However, I do not have an example of this corruption, it is by now only
theoretical in this otherwise great filesystem.

In fact, I can announce quite a number of KCSAN bugs already in dmesg log:

# of
occuren
ces problematic function
-------------------------------------------
182 __bitmap_and+0xa3/0x110
2 __bitmap_weight+0x62/0xa0
138 __call_rcu_common.constprop.0
3 __cgroup_account_cputime
1 __dentry_kill
3 __mod_lruvec_page_state
15 __percpu_counter_compare
1 __percpu_counter_sum+0x8f/0x120
1 acpi_ut_acquire_mutex
2 amdgpu_fence_emit
1 btrfs_calculate_inode_block_rsv_size
1 btrfs_page_set_uptodate
28 copy_from_read_buf
3 d_add
3 d_splice_alias
1 delayacct_add_tsk+0x10d/0x630
7 do_epoll_ctl
1 do_vmi_align_munmap
86 drm_sched_entity_is_ready
4 drm_sched_entity_pop_job
3 enqueue_timer
1 finish_fault+0xde/0x360
2 generic_fillattr
2 getrusage
9 getrusage+0x3ba/0xaa0
1 getrusage+0x3df/0xaa0
6 inode_needs_update_time
1 inode_set_ctime_current
1 inode_update_timestamps
3 kernfs_refresh_inode
22 ktime_get_mono_fast_ns+0x87/0x120
13 ktime_get_mono_fast_ns+0xb0/0x120
24 ktime_get_mono_fast_ns+0xc0/0x120
79 mas_topiary_replace
12 mas_wr_modify
61 mas_wr_node_store
1 memchr_inv+0x71/0x160
1 memchr_inv+0xcf/0x160
19 n_tty_check_unthrottle
5 n_tty_kick_worker
35 n_tty_poll
32 n_tty_read
1 n_tty_read+0x5f8/0xaf0
3 osq_lock
27 process_one_work
4 process_one_work+0x169/0x700
2 rcu_implicit_dynticks_qs
1 show_stat+0x45b/0xb70
3 task_mem
344 tick_nohz_idle_stop_tick
32 tick_nohz_next_event
1 tick_nohz_next_event+0xe7/0x1e0
90 tick_sched_do_timer
5 tick_sched_do_timer+0x2c/0x120
1 wbt_done
1 wbt_issue
2 wq_worker_tick
37 xas_clear_mark

------------------------------------------------------

This report is from a vanilla torvalds tree 6.6-rc2 kernel on Ubuntu 22.04:

[13429.116126] ==================================================================
[13429.116794] BUG: KCSAN: data-race in btrfs_calculate_inode_block_rsv_size [btrfs] / btrfs_use_block_rsv [btrfs]

Thanks for the report. Some data races are known to happen in the
reservation code but all the critical changes are done under locks, so
an optimistic check may skip locking to check a status but then it's
done properly again under a lock. Generally speaking.

We had several reports from static checkers and at least in one case we
added an annotation so KCSAN does not complain:

https://git.kernel.org/linus/748f553c3c4c4f175c6c834358632aff802d72cf

The original report is at

https://lore.kernel.org/linux-btrfs/CAAwBoOJDjei5Hnem155N_cJwiEkVwJYvgN-tQrwWbZQGhFU=cA@xxxxxxxxxxxxxx/

I have briefly looked at your report, it seems to be different from the
one above but still matches the general approach to the reservations. If
it's a false flag then we can add another wrapper with the annotation,
unless it's a real bug.

Thank you for your bug report evaluation.

I cannot do more but pass on what KCSAN provides - my experience with btrfs is far from
required (on the level of fresh user).

However, without attempting to argue, it seems to be possible that there is a data-race,
because the read side is in the function is not protected by a lock, and theoretically
the block_rsv->failfast can change by the write-side thread while the read-side thread
is using various parts of the block_rsv structure w/o a read lock.

Best regards,
Mirsad Todorovac