Re: [PATCH 2/2] scsi: ufs: Protect PM ops and err_handler from user access through sysfs

From: Can Guo
Date: Mon Jan 11 2021 - 04:24:30 EST


On 2021-01-11 16:23, Bean Huo wrote:
On Mon, 2021-01-11 at 09:27 +0800, Can Guo wrote:
>
> If a sysfs node access triggers a UFS UPIU request to read/write UFS
> device descriptors during the shutdown flow, the only issue is that
> the sysfs node access fails, since the UFS device and link have been
> shut down. Strictly speaking, the failure comes after
> ufshcd_set_dev_pwr_mode().
>
> __ufshcd_query_descriptor: opcode 0x01 for idn 0 failed, index 0, err = -11

You misunderstood it again. You are expecting a simple query cmd error.
But what really matters are NoC issues[1] and OCP[2]. And while or after
UFS is shutting down, either of them may happen.

[1] When an un-clocked register access issue happens, we call it a NoC
issue, meaning you are trying to access a register when clocks are
disabled. This leads to system CRASH.


OK, let's keep it simple: share a crash log of this kind, caused by
accessing a sysfs node in the shutdown flow.


[2] OCP is over-current protection. While UFS is shutting down, you may
have put the UFS regulators into LPM. After that, if you are still
trying to talk to UFS, OCP can happen on VCCQ/VCCQ2. This leads to
system CRASH too.

The same as above: share the crash log.


If you had hands-on experience with NoC and/or OCP issues, you wouldn't
ask for the crash log. The tricky part about critical NoC and OCP issues
is that they don't print many logs (maybe no logs at all) on the UART,
which is why they are hard to debug and why I added another flag to help
debug.

Take OCP as an example: when OCP happens on VCCQ/VCCQ2, the PMIC will do
a hard reset and put the system into crash dump mode (this is a general
design among our mutual customers, but it may vary platform by platform).

And if you have a crash dump tool to collect the dump, then after the
dump is parsed, the best you can count on is the last call stacks and
the related flags and variables in hba. The call stack is pretty much
the same as the one I shared with my debug patch applied during the
weekend.

Stanley can correct me if I am wrong.

Thanks,
Can Guo.


>
> Since the shutdown is a one-way process, this failure is not a big
> issue. If you meant to avoid this failure for unsafe shutdown, I agree
> with you. But for the race issue, I don't know.
>

Easy for you to say. A system crash is a big issue for any SoC vendor,
I believe.


Indeed, a crash is a serious issue; share the log.


Thanks,
Bean


Can Guo.