Re: [PATCH v2 2/3] hwmon: (pmbus/core): Add regulator event support

From: Matti Vaittinen
Date: Tue Apr 11 2023 - 02:10:23 EST


On 4/10/23 20:47, Guenter Roeck wrote:
On Mon, Apr 10, 2023 at 07:53:46PM +0300, Matti Vaittinen wrote:
On 4/10/23 17:40, Guenter Roeck wrote:
On 4/10/23 01:19, Matti Vaittinen wrote:
to 6. huhtik. 2023 klo 16.43 Mark Brown (broonie@xxxxxxxxxx) kirjoitti:

On Thu, Apr 06, 2023 at 11:00:02AM +0300, Matti Vaittinen wrote:
ke 5. huhtik. 2023 klo 18.19 Mark Brown (broonie@xxxxxxxxxx) kirjoitti:
On Wed, Apr 05, 2023 at 07:18:32AM -0700, Guenter Roeck wrote:

It can also try to avoid
interacting with hardware if that might not work.

It'd be great to have documentation / specification for sending and/or
handling the regulator events. I don't think we currently have such.
As far as I understand, the notifications can be picked up by all
consumers of a regulator. I am a bit worried about:
a) Situations where notification handlers 'collide' by doing 'actions'
which are unexpected by other handlers

I'm not sure what you're expecting there?  A device working with itself
shouldn't disrupt any other users.

I have no concrete idea, just a vague uneasy feeling knowing that
devices tend to interact with each other. I guess it is more about the
amount of uncertainty caused by my lack of knowledge regarding what
could be done by these handlers. So, as I already said - if no one
else is bothered by this then I definitely don't want to block the
series. Still, if the error handling should be kept internal to PMBus
- then we should probably either say that consumer drivers must not
(forcibly) turn off the supply when receiving these notifications - or
not send these notifications from PMBus and allow PMBus to decide
error handling internally. (Again, I don't know if any in-tree
consumer drivers do turn off the supply regulator in error handlers -
but I don't think it is actually forbidden). Or am I just making  a
problem that does not exist?


For my part I (still) don't understand why this is considered a problem
for this driver but not for all the other drivers reporting various
error conditions to the regulator subsystem. At least some of them
also have programmable reaction to such error conditions.

I may not know the drivers you're referring to. And, as I said, maybe there
is no problem.

To explain why I asked this question for this driver:

What caught my attention was use of the REGULATOR_EVENT_*_WARN flags, which
were originally added so regulators could be flagging 'not yet
critical'-errors Vs. the other, older REGULATOR_EVENT_* flags. From the
discussions I have understood the older flags were informing severe hardware
failures where system is typically thought to be no longer usable. I have
understood that the most likely handling for such notification is shutting
off the regulator. I have further understood that there are not many
consumers doing handling of these events. (This is all just my understanding
based on discussions - take it with grain of salt).

I was the one who added these warning level notifications. Thus, I have been
following (only) use of those warnings. I have no proper insight to existing
drivers using all these flags.

When grepping for the WARNING level regulator events I can find following
drivers:
drivers/regulator/bd9576-regulator.c
drivers/regulator/max597x-regulator.c
drivers/regulator/tps65219-regulator.c

The difference (in my head) between these and PMBus-core is that these are
very specific PMIC ICs. The board designer should have a good understanding
which of the power-rails may have 'warning level' failures, and which errors
are 'critical'. They should select and set the IRQ limits and error
severities in the device-tree accordingly.

PMBus core (again, in my head) is more generic purpose system. This is why I
originally asked if the 'error severity' in PMBus specifies also the error
handling - and if these regulator notifications map to intended handling.
Now, after this discussion I think that:

PMBus has it's own error-handling which is implemented independently from
these notifications. This handling should not be messed-up by regulator
consumers, for example by touching the regulator state.

This is what made me think sending regulator notifications might not be the
best approach - (but as I said, I may be wrong. I am no longer sure what
kind of handling there is expected for these events. Furthermore, as we see
below, I did not find in-tree consumers taking "radical" actions when
receiving the notifications).

Yep, I didn't find other in-tree consumers for these notifications except:
drivers/usb/host/ohci-da8xx.c

(I was not thorough so may have missed some, but seems there is not many
in-tree consumers.)

I did only a very very brief shallow peek but it seems to me that even there
driver only sets a flag - which is used in debug message. (I may have missed
something here as well).

Judging this it seems to me that, at the moment, these notifications are
mostly ignored by consumers - and they are sent by only a handful of
devices. There probably exist some downstream users for those, but I have
not heard of them. Maybe they are only used on very specific systems. This
could explain why there has been no problems.

I know, I know. Lot's of guessing, assuming and hand waving. Sorry :/


Oh, now the problem (though I still don't understand what the problem
actually is) is extended to warnings. I thought you were concerned
about errors.

Yes. I am concerned about errors - as I thought handling errors might have an impact to the regulator supply state. I don't see this as a problem with warnings.

It was the use of warnings that got this series in my mailbox. Hence this series caught my attention.

The only thing I could be concerned about with warnings is whether they are sent for problems that are actually errors. But even if they were, that would not really cause any issues outside the specific use-case. So, no. I am not really concerned about warnings but I tried to explain why I was commenting on this PMBus series as you asked that.

Personally I think you are concerned about a non-issue, but without
explicit guidance from regulator maintainers (and no clear definition if
and when regulator notifications should or may be sent) I won't be able
to apply this series.

Thanks Guenter. I value your opinion on this. (And there was _no_ sarcasm. I've learned that I am often discussing with people who have much broader understanding than I do, so I do truly appreciate hearing your opinion).

I think part of the problem is that we lack of insight as to how these notifications / errors are used. I am trying to clarify this. As a matter of fact, I today received an email that my talk proposal to ELCE was accepted. I am going to go through these notifications and error flags there, and I hope to be able to collect some feed-back as to how they are / can be utilized.

Having said this, I find the whole discussion kind of surreal,

Sorry about that. What do you think I should've done differently? Do you think I should not have brought-up these questions? I have openly admitted I may not understand all the details and thus You as a maintainer are free to act as if I never asked a thing. I've explained why I think there is a potential problem - but if you or others do not see the issue, then it's your call to decide.

especially since the PMBus drivers already report error states to the
regulator subsystem using the get_error_flags callback, but it is what
it is. Also, no, I won't revert that code without a clear explanation
of the _actual_ (not perceived) problem that such a revert is supposed
to solve.

I agree with you that these notifications and error flags go hand-by-hand. Actually, the use of (warning) notifications does more or less require using also the error flags (as we don't have a notification when things get back to normal). The flags do not require using notifications though, some consumers can do polling of flags.

Let's hope no problems with these error flags surface.

Yours,
-- Matti

--
Matti Vaittinen
Linux kernel developer at ROHM Semiconductors
Oulu Finland

~~ When things go utterly wrong vim users can always type :help! ~~