Re: [PATCH] gpiolib: Avoid side effects in gpio_is_visible()

From: Chris Packham
Date: Tue May 16 2023 - 21:07:34 EST



On 17/05/23 12:47, Kent Gibson wrote:
> On Tue, May 16, 2023 at 11:50:42PM +0000, Chris Packham wrote:
>> Hi Kent,
>>
>> On 17/05/23 10:47, Kent Gibson wrote:
>>> On Tue, May 16, 2023 at 10:19:14PM +0000, Chris Packham wrote:
>>>> On 17/05/23 01:57, Linus Walleij wrote:
>>>>> On Mon, May 15, 2023 at 12:27 AM Chris Packham
>>>>> <Chris.Packham@xxxxxxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>>> In my original case which is a kernel module that exports a GPIO for
>>>>>> userspace using gpiod_export()
>>>>> We should not add new users for that API as it increase the usage
>>>>> of the sysfs ABI but if it's an existing in-tree usecase I buy it.
>>>>>
>>>>>> The crux of the problem is that the irq_desc is created when it hasn't
>>>>>> been requested.
>>>>> The right solution to me seems to be to not use gpiod_export()
>>>>> and not use sysfs TBH.
>>>> That's not really a feasible solution. I'm dealing with application code
>>>> that has been happily using the sysfs interface for many years.
>>>>
>>>> I actually did look at getting that code updated to use libgpio earlier
>>>> this year but found the API was in a state of flux and I wasn't going to
>>>> recommend re-writing the application code to use libgpio if I knew the
>>>> API was going to change and we'd have to re-write it again.
>>>>
>>> Its 'libgpiod'.
>>>
>>>> Even now with the 2.0.1 libgpio there doesn't seem to be a way of asking
>>>> about just GPIO lines in the system, application code would still need
>>>> to open every /dev/gpiochipN device and ask what lines are on the chip
>>>> and keep the fds open for the chips that have lines the application
>>>> cares about but make sure to close the fd for the ones that don't. So
>>>> now the application code has to care about GPIO chips in addition to the
>>>> GPIO lines.
>>>>
>>> Trying to better understand your use case - how does your application
>>> identify lines of interest - just whatever lines pop up in
>>> /sys/class/gpio?
>> Thanks for taking an interest. We actually have 2 applications that make
>> use of this functionality
>>
>> The first is a userspace driver for a Power Over Ethernet Controller+PSE
>> chipset (I'll refer to this as an MCU since the thing we talk to is
>> really a micro controller with a vendor supplied firmware on it that
>> does most of the PoE stuff). Communication to the MCU is based around
>> commands sent via i2c. But there are a few extra GPIOs that are used to
>> reset the MCU as well as provide a mechanism for quickly dropping power
>> on certain events (e.g. if the temperature monitoring detects a problem).
>>
>> We do have a small kernel module that grabs the GPIOs based on the
>> device tree and exports them with a known names (e.g. "poe-reset",
>> "poe-dis") that the userspace driver can use. Back when that code was
>> written we did consider not exporting the GPIOs and instead having some
>> other sysfs/ioctl interface into this kernel module but that seemed more
>> work than just calling gpiod_export() for little gain. This is where
>> adding the gpio-names property in our .dts would allow libgpiod to do
>> something similar.
>>
> Ah, so you use gpio_export_link() to provide the well known name?

Yes correct. I did wonder at one point about proposing a dts property to
automagically export the gpio with a better name than gpio1234 (like a
gpio-hog but allowing userspace to poke at it) but I'm pretty sure that
would have been rebuffed.

>> Having the GPIOs in sysfs is also convenient as we can have a systemd
>> ExecStopPost script that can drop power and/or reset the MCU if our
>> application crashes. I'm not sure if the GPIO chardev interface deals
>> with releasing the GPIO lines if the process that requested them exits
>> abnormally (I assume it does) and obviously our ExecStopPost script
>> would need updating to use some of the libgpiod applications to do what
>> it currently does with a simple 'echo 1 >.../poe-reset'
>>
> Ironically, the usual complaint wrt power/reset lines is that users
> don't want it to be reset back to default when their app crashes.

Yeah it's impossible to please everyone. Might be the kind of thing that
could be set by the application when requesting the line (e.g make this
high on close(), leave it as-is).

> What happens when the line is released is driver dependent.
> The uAPI can't make any guarantees, as it releases the line back to the
> device driver. Typically is it set back to its default state, so that
> might do exactly what you want out of the box - no ExecStopPost required.
> But you would need to confirm on your hardware.
With a pca9555 I think it'd just stay in whatever state was last driven.
It might go back to an input which would let the pull-ups take over.
> There was also some discussion on making the default state configurable
> via dts[1], but I'm not sure what happened to that.
>
>> The second application is a userspace driver for a L3 network switch
>> (actually two of them for different silicon vendors). Again this needs
>> to deal with resets for PHYs connected to the switch that the kernel has
>> no visibility of as well as the GPIOs for the SFP cages. Again we have a
>> slightly less simple kernel module that grabs all these GPIOs and
>> exports them with known names. This time there are considerably more of
>> these GPIOs (our largest system currently has 96 SFP+ ports with 4 GPIOs
>> per port) so we're much more reliant on being able to do things like
>> `for x in port*tx-dis; do echo 1 >$x; done`
>>
> Given appropriate line names, that is already something you can do with
> the libgpiod v2 tools. Something like:
>
> `for x in gpiochip*; do gpioset -c x tx-dis=1; done`
Would that deal with the fact the GPIO lines are port1-tx-dis,
port2-tx-dis, ... port96-tx-dis?
> Behind the scenes gpioset is doing the name to offset mapping, which is
> less efficent than identifying the line by offset, but given you are
> calling from shell performance probably isn't an issue.

Yeah it's a script of last resort so it's performance is not too
critical. Probably on-par with file globing of individual lines anyway.

>
>> I'm sure both of these applications could be re-written around libgpiod
>> but that would incur a significant amount of regression testing on
>> existing platforms. And I still consider dealing with GPIO chips an
>> extra headache that the applications don't need (particularly with the
>> sheer number of them the SFP case).
>>
> Strictly speaking you have regression testing to deal with which ever
> way you go. Though wouldn't regression testing for a kernel change be more
> work than the app alone?

Well there's testing the existing app on new hardware vs testing the
re-written app on all existing hardware and the new hardware. But that's
always the trade off with pretty much any system wide improvement.