Re: [PATCH v1 0/3] introduce priority-based shutdown support

From: Matti Vaittinen
Date: Mon Nov 27 2023 - 09:49:58 EST


On 11/27/23 15:08, Greg Kroah-Hartman wrote:
On Mon, Nov 27, 2023 at 02:54:21PM +0200, Matti Vaittinen wrote:
On Fri, 24 Nov 2023 at 19:26, Greg Kroah-Hartman
(gregkh@xxxxxxxxxxxxxxxxxxx) wrote:

On Fri, Nov 24, 2023 at 05:32:34PM +0100, Oleksij Rempel wrote:
On Fri, Nov 24, 2023 at 03:56:19PM +0000, Greg Kroah-Hartman wrote:
On Fri, Nov 24, 2023 at 03:49:46PM +0000, Mark Brown wrote:
On Fri, Nov 24, 2023 at 03:27:48PM +0000, Greg Kroah-Hartman wrote:
On Fri, Nov 24, 2023 at 03:21:40PM +0000, Mark Brown wrote:

This came out of some discussions about trying to handle emergency power
failure notifications.

I'm sorry, but I don't know what that means. Are you saying that the
kernel is now going to try to provide a hard guarantee that some devices
are going to be shut down in X number of seconds when asked? If so, why
not do this in userspace?

No, it was initially (or when I initially saw it anyway) handling of
notifications from regulators that they're in trouble and we have some
small amount of time to do anything we might want to do about it before
we expire.

So we are going to guarantee a "time" in which we are going to do
something? Again, if that's required, why not do it in userspace using
a RT kernel?

For the HW in question I have only 100ms before power loss. By
doing it via user space we will have even less time to react.

Why can't userspace react that fast? Why will the kernel be somehow
faster? Speed should be the same, just get the "power is cut" signal
and have userspace flush and unmount the disk before power is gone. Why
can the kernel do this any differently?

In fact, this is not a new requirement. It has existed in different flavors
of automotive Linux for about 10 years. Linux in cars should be able to
handle voltage drops, for example on ignition and so on. The only new thing
is the attempt to mainline it.

But your patch is not guaranteeing anything, it's just doing a "I want
this done before the other devices are handled", that's it. There is no
chance that 100ms is going to be a requirement, or that some other
device type is not going to come along and demand to be ahead of your
device in the list.

So you are going to have a constant fight among device types over the
years, and people complaining that the kernel is now somehow going to
guarantee that a device is shutdown in a set amount of time, which
again, the kernel can not guarantee here.

This might work as a one-off for a specific hardware platform, which is
odd, but not anything you really should be adding for anyone else to use
here as your reasoning for it does not reflect what the code does.

I was (am) interested in knowing how/where the regulator error
notifications are utilized - hence I asked about this at ELCE last summer.
Replies indeed mostly pointed to automotive and handling of
under-voltage events.

As to what has changed (I think this was asked in another mail on this
topic) - I understood from the discussions that the demand for running
systems with as little power as possible has become even more
important/desirable. Hence, under-voltage events are more common
than they were when cars used to work by burning flammable
liquids :)

Anyways, what I thought I'd comment on is that the severity of the
regulator error notifications can be given from device-tree. Rationale
behind this is that figuring out whether a certain detected problem is
fatal or not (in embedded systems) should be done by the board
designers, per board. Maybe the understanding which hardware should
react first is also a property of hardware and could come from the
device-tree? Eg, instead of having a "DEVICE_SHUTDOWN_PRIO_STORAGE"
set unconditionally for EMMC, systems could set shutdown priority per
board and per device explicitly using device-tree?

Yes, using device tree would be good, but now you have created something
that is device-tree-specific and not all the world is device tree :(

True. However, my understanding is that the regulator subsystem is largely written to work with DT-based systems. Hence, supporting a DT-based solution would probably fit this specific use-case, as the source of the problem notifications is the regulator subsystem.

Also, many devices are finally moving out to non-device-tree busses,
like PCI and USB, so how would you handle them in this type of scheme?

I do readily admit I don't have [all ;) ] the answers. I also think that if we add support for prioritized shutdown on device-tree-based systems, people may eventually want to use this on non-device-tree setups too. There may also be other use-cases for prioritized shutdown (I don't know what they would be, though).

For now I would leave that as a problem for the folks who need non-device-tree systems, when (if) this need materializes. Assuming the handling of priorities was in place, the missing piece would then be finding a place to store this hardware-specific priority information. If this is solved for the non-DT cases, then the DT-based and non-DT-based solutions can co-exist.
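
To illustrate the co-existence idea, here is a rough userspace sketch in C. It is not kernel code and not what the patch series does. Every name in it (struct prio_source, resolve_priorities(), SHUTDOWN_PRIO_DEFAULT, the convention that a lower value means "shut down earlier") is made up for illustration. The only point is that the shutdown ordering does not need to know where a priority came from: a DT-backed table and some other board-specific table can both sit behind the same lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical value, not a kernel constant. Lower = shut down earlier. */
#define SHUTDOWN_PRIO_DEFAULT 100

/* Stand-in for a device; 'prio' is filled in by resolve_priorities(). */
struct sketch_device {
	const char *name;
	int prio;
};

/* One (device name -> priority) record from some source of HW info. */
struct prio_entry {
	const char *name;
	int prio;
};

/*
 * A source of priority information. One instance could be backed by
 * device-tree data, another by whatever a non-DT platform provides.
 */
struct prio_source {
	const struct prio_entry *entries;
	size_t count;
};

/* Returns 1 and sets *prio if 'name' is listed in 'src', else returns 0. */
static int prio_source_lookup(const struct prio_source *src,
			      const char *name, int *prio)
{
	for (size_t i = 0; i < src->count; i++) {
		if (strcmp(src->entries[i].name, name) == 0) {
			*prio = src->entries[i].prio;
			return 1;
		}
	}
	return 0;
}

/*
 * Ask each source in order. The first source that knows the device wins;
 * devices that no source mentions keep the default priority.
 */
void resolve_priorities(struct sketch_device *devs, size_t ndevs,
			const struct prio_source *srcs, size_t nsrcs)
{
	assert(devs != NULL || ndevs == 0);

	for (size_t d = 0; d < ndevs; d++) {
		devs[d].prio = SHUTDOWN_PRIO_DEFAULT;
		for (size_t s = 0; s < nsrcs; s++) {
			if (prio_source_lookup(&srcs[s], devs[d].name,
					       &devs[d].prio))
				break;
		}
	}
}

/* qsort() comparator: ascending priority value. */
static int cmp_prio(const void *a, const void *b)
{
	const struct sketch_device *da = a;
	const struct sketch_device *db = b;

	return (da->prio > db->prio) - (da->prio < db->prio);
}

/*
 * Put the devices into shutdown order. Note that qsort() is not stable,
 * so devices sharing a priority end up in unspecified relative order.
 */
void order_for_shutdown(struct sketch_device *devs, size_t ndevs)
{
	qsort(devs, ndevs, sizeof(*devs), cmp_prio);
}
```

With one table listing "mmc0" at priority 0 and a second table listing "nvme0" at 10, an unlisted "eth0" would fall back to the default and be handled last. In a real implementation the ordering would of course also have to respect parent/child and supplier/consumer dependencies between devices, which this sketch ignores completely.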

Just a suggestion though. I am not working on under-voltage "stuff" right now.

Yours,
-- Matti

--
Matti Vaittinen
Linux kernel developer at ROHM Semiconductors
Oulu Finland

~~ When things go utterly wrong vim users can always type :help! ~~