Re: [RFC] msr: judge the return val of function rdmsrl_on_cpu() by WARN_ON

From: lizhe . 67
Date: Thu Jun 29 2023 - 22:46:31 EST


On Fri, 30 Jun 2023 at 01:03:22 +0200, tglx@xxxxxxxxxxxxx wrote:
>Li!
>
>On Thu, Jun 29 2023 at 15:27, lizhe.67@xxxxxxxxxxxxx wrote:
>
>> There are ten places call rdmsrl_on_cpu() in the current code without
>> judging the return value. This may introduce a potential bug. For example,
>> inj_bank_set() may return -EINVAL, show_base_frequency() may show an error
>> freq value, intel_pstate_hwp_set() may write an error value to the related
>> msr register and so on. But rdmsrl_on_cpu() do rarely returns an error, so
>> it seems that add a WARN_ON is enough for debugging.
>
>Can you please structure your changelogs as documented in:
>
> https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#changelog
>
>instead of providing a big lump of words?
>
>> There are ten places call rdmsrl_on_cpu() in the current code without
>> judging the return value.
>
>Return values are not judged. They are either ignored or checked/evaluated.
>
>> This may introduce a potential bug.
>
>Sure. Anything which does not check a return value from a function might
>be a bug, but you have to look at each instance whether its a bug or
>not.
>
>> For example, inj_bank_set() may return -EINVAL, show_base_frequency()
>> may show an error freq value, intel_pstate_hwp_set() may write an
>> error value to the related msr register and so on. But
>> rdmsrl_on_cpu() do rarely returns an error, so it seems that add a
>> WARN_ON is enough for debugging.
>
>This is hillarious at best.
>
> 1) It does not matter at all whether that function returns an error rarely
> or not.
>
> 2) Adding WARN_ON() without justification at each call site is not
> enough. Neither for debugging nor for real world usage.
>
>You have to come up with individual patches for each callsite to add the
>WARN_ON() and in each patch you have to explain why it is justified and
>why there is no other solution, e.g. taking an error exit path.
>
>Just slapping WARN_ON()'s into the code without any deeper analysis is
>worse than the current state of the code.
>
>If you have identified a real world problem at any of these call sites
>then adding a WARN_ON() does not solve it at all.
>
>I'm looking forward to your profound anlysis of each of these "problems".
>
>Thanks,
>
> tglx

Thanks for all your advice. I will analysis each of these "problems".