Re: [PATCH 2/3] wifi: ath9k: fix races between ath9k_wmi_cmd and ath9k_wmi_ctrl_rx

From: Fedor Pchelkin
Date: Mon Apr 24 2023 - 15:12:18 EST


This problem is really subtle, I suppose. The v2 commit message, which I'll
send in the next mail, describes the race condition that can lead to
invalid behaviour.

I couldn't reproduce that particular problem on real hardware, but if I
force timeouts on WMI command completions, a local KMSAN run catches some
uninit values.

The synchronization between ath9k_wmi_cmd and ath9k_wmi_ctrl_rx on
timeouts is good, especially after 8a2f35b98306 ("wifi: ath9k: Fix
potential stack-out-of-bounds write in ath9k_wmi_rsp_callback()").

And I think the only place where the fuzzer can provoke a failure is when
wmi->last_seq_id is checked in the callback before ath9k_wmi_cmd() sets it
to zero on its timeout exit path. This scenario is described more
thoroughly in patch v2.
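
To make the window a bit more concrete, here is a rough single-threaded
user-space model of that interleaving. It is only a sketch and not driver
code: struct wmi_model, waiter_alive, rx_seq_matches(), cmd_timeout_exit()
and replay_timeout_race() are hypothetical names made up for illustration,
and the "stale copy" outcome is my assumption about what goes wrong once
the waiter has already left. Only last_seq_id mirrors the real field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-device WMI state. */
struct wmi_model {
    uint16_t last_seq_id;  /* id of the command in flight, 0 = none */
    bool waiter_alive;     /* assumption: true while the waiter's
                            * response buffer is still valid */
};

/* Receive path, step 1: does the event match the pending command? */
static bool rx_seq_matches(const struct wmi_model *w, uint16_t seq)
{
    return w->last_seq_id != 0 && w->last_seq_id == seq;
}

/* Command path on timeout: clear the pending id and leave; after
 * this the waiter's response buffer must not be touched. */
static void cmd_timeout_exit(struct wmi_model *w)
{
    w->last_seq_id = 0;
    w->waiter_alive = false;
}

/*
 * Replay one fixed ordering of the two paths. Returns 1 if the
 * receive path would copy a response for a waiter that is gone.
 *
 * copy_under_lock == false: the lock is dropped between the check
 * and the copy, so the timeout exit can slip in between them.
 * copy_under_lock == true: check and copy form one critical section,
 * so the timeout exit can only run after the copy.
 */
int replay_timeout_race(bool copy_under_lock)
{
    struct wmi_model w = { .last_seq_id = 7, .waiter_alive = true };
    bool matched = rx_seq_matches(&w, 7);
    int stale;

    if (!copy_under_lock)
        cmd_timeout_exit(&w);       /* timeout wins the window */

    stale = matched && !w.waiter_alive;

    if (copy_under_lock)
        cmd_timeout_exit(&w);       /* timeout had to wait */

    assert(w.last_seq_id == 0);     /* timeout exit always ran */
    return stale;
}
```

In this model replay_timeout_race(false) reports the stale copy and
replay_timeout_race(true) does not. If the timeout exit takes the lock
first instead, the id is already zero and the check simply fails. Keeping
the check and the copy in one critical section is just one way to close
the window; what the patch actually does is in v2.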

Well, the issue seems to be rare and I don't know how to properly test it
on real hardware.

I've run some checks on the basic driver workflow: there weren't any
stalls or explicit failures, and the patch seems to close that tiny race
condition window. But, anyway, it requires more discussion.