Re: [PATCH] 9p/client: fix data race on req->status

From: Christian Schoenebeck
Date: Mon Dec 05 2022 - 10:20:04 EST


On Monday, December 5, 2022 1:47:56 PM CET Dominique Martinet wrote:
> KCSAN reported a race between writing req->status in p9_client_cb and
> accessing it in p9_client_rpc's wait_event.
>
> Accesses to req itself are protected by the data barrier (writing req
> fields, write barrier, writing status // reading status, read barrier,
> reading other req fields), but status accesses themselves apparently
> also must be annotated properly with WRITE_ONCE/READ_ONCE when we
> access it without locks.
>
> Follows:
> - error paths writing status in various threads all can notify
> p9_client_rpc, so these all also need WRITE_ONCE
> - there's a similar read loop in trans_virtio for zc case that also
> needs READ_ONCE
> - other reads in trans_fd should be protected by the trans_fd lock and
> lists state machine, as corresponding writers all are within trans_fd
> and should be under the same lock. If KCSAN complains on them we likely
> will have something else to fix as well, so it's better to leave them
> unmarked and look again if required.
>
> Reported-by: Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>
> Suggested-by: Marco Elver <elver@xxxxxxxxxx>
> Signed-off-by: Dominique Martinet <asmadeus@xxxxxxxxxxxxx>

I must have missed the prior discussion, but looking at the suggested
solution: if there is no lock, then adding READ_ONCE() and WRITE_ONCE() alone
would not fix cross-CPU ordering issues, as they do not by themselves imply a
memory barrier.

Shouldn't these accesses therefore be at least smp_load_acquire() and
smp_store_release() instead?
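
To make the comparison concrete, here is a rough userspace sketch of the two
forms, written with C11 atomics rather than the kernel primitives. It is only
an approximation: a relaxed atomic access stands in for READ_ONCE() /
WRITE_ONCE(), explicit fences stand in for the existing write/read barriers
described in the patch, and release/acquire accesses stand in for
smp_store_release() / smp_load_acquire(). struct demo_req, the DEMO_REQ_*
values and all function names are made up for this illustration and are not
the 9p client code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical status values; not the kernel's p9 request states. */
enum { DEMO_REQ_SENT = 0, DEMO_REQ_RCVD = 1 };

/* Hypothetical, simplified request: one payload field plus a status flag. */
struct demo_req {
	int payload;       /* plain field, published through status */
	atomic_int status; /* accessed without any lock */
};

/* Variant A: explicit fence plus relaxed store (approximate analog of
 * "write fields, write barrier, WRITE_ONCE(status)"). */
static void complete_fenced(struct demo_req *req, int value)
{
	req->payload = value;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&req->status, DEMO_REQ_RCVD, memory_order_relaxed);
}

/* Variant A reader: relaxed load in a loop, then a fence before the payload
 * (approximate analog of "READ_ONCE(status), read barrier, read fields"). */
static int wait_fenced(struct demo_req *req)
{
	while (atomic_load_explicit(&req->status, memory_order_relaxed) != DEMO_REQ_RCVD)
		; /* busy-wait; the real code sleeps in wait_event */
	atomic_thread_fence(memory_order_acquire);
	return req->payload;
}

/* Variant B: ordering folded into the status accesses themselves
 * (approximate analog of smp_store_release() / smp_load_acquire()). */
static void complete_release(struct demo_req *req, int value)
{
	req->payload = value;
	atomic_store_explicit(&req->status, DEMO_REQ_RCVD, memory_order_release);
}

static int wait_acquire(struct demo_req *req)
{
	while (atomic_load_explicit(&req->status, memory_order_acquire) != DEMO_REQ_RCVD)
		; /* busy-wait */
	return req->payload;
}

struct demo_ctx {
	struct demo_req req;
	int use_fences; /* nonzero selects variant A, zero selects variant B */
};

/* Completer thread: publishes the payload 42 using the selected variant. */
static void *completer(void *arg)
{
	struct demo_ctx *ctx = arg;

	if (ctx->use_fences)
		complete_fenced(&ctx->req, 42);
	else
		complete_release(&ctx->req, 42);
	return NULL;
}

/* Runs one completer/waiter pair; returns the payload the waiter observed,
 * or -1 if the thread could not be created. */
int run_demo(int use_fences)
{
	struct demo_ctx ctx = { .use_fences = use_fences };
	pthread_t thread;
	int seen;

	ctx.req.payload = 0;
	atomic_init(&ctx.req.status, DEMO_REQ_SENT);
	if (pthread_create(&thread, NULL, completer, &ctx) != 0)
		return -1;
	seen = use_fences ? wait_fenced(&ctx.req) : wait_acquire(&ctx.req);
	pthread_join(thread, NULL);
	return seen;
}
```

In this sketch both variants order the payload write before the status write
on the completing side, and the status read before the payload read on the
waiting side. They differ only in whether that ordering comes from separate
fences or from the status accesses themselves.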

Best regards,
Christian Schoenebeck