Re: [PATCH v5 2/2] counter: add IRQ or GPIO based event counter

From: William Breathitt Gray
Date: Tue Feb 23 2021 - 21:35:00 EST


On Tue, Feb 23, 2021 at 06:45:16PM +0100, Oleksij Rempel wrote:
> Hello William,
>
> Here is cooled down technical answer. Excuse me for over reacting.

Hello Oleksij,

Let me apologize too if I offended you in some way with my previous
response; I assure you that was not my intention. I truly do believe
this is a useful driver to have in the kernel and I want to make that
happen. My concerns with your patch are purely technical in nature and
I'm certain we can find a solution working together.

> On Tue, Feb 23, 2021 at 11:06:56AM +0100, Oleksij Rempel wrote:
> > On Mon, Feb 22, 2021 at 10:43:00AM +0900, William Breathitt Gray wrote:
> > > On Mon, Feb 15, 2021 at 10:17:37AM +0100, Oleksij Rempel wrote:
> > > > > > +static irqreturn_t event_cnt_isr(int irq, void *dev_id)
> > > > > > +{
> > > > > > + struct event_cnt_priv *priv = dev_id;
> > > > > > +
> > > > > > + atomic_inc(&priv->count);
> > > > >
> > > > > This is just used to count the number of interrupts right? I wonder if
> > > > > we can do this smarter. For example, the kernel already keeps track of
> > > > > number of interrupts that has occurred for any particular IRQ line on a
> > > > > CPU (see the 'kstat_irqs' member of struct irq_desc, and the
> > > > > show_interrupts() function in kernel/irq/proc.c). Would it make sense to
> > > > > simply store the initial interrupt count on driver load or enablement,
> > > > > and then return the difference during a count_read() callback?
> > > >
> > > > This driver do not makes a lot of sense without your chardev patches. As
> > > > soon as this patches go mainline, this driver will be able to send
> > > > event with a timestamp and counter state to the user space.
> > > >
> > > > With other words, we will need an irq handler anyway. In this case we
> > > > can't save more RAM or CPU cycles by using system irq counters.
> > >
> > > It's true that this driver will need an IRQ handler when the timestamp
> > > functionality is added, but deriving the count value is different matter
> > > regardless. There's already code in the kernel to retrieve the number of
> > > interrupts, so it makes sense that we use that rather than rolling our
> > > own -- at the very least to ensure the value we provide to users is
> > > consistent with the ones already provided by other areas of the kernel.
>
> The value provided by the driver is consistent only if it is not
> overwritten by user. The driver provides an interface to reset/overwrite it.
> At least after this step the value is not consistent.

I wasn't clear here so I apologize. What I would like is for this driver
to maintain its own local count value derived from kstat_irqs_usr(). So
for example, you can use the "count" member of your struct
interrupt_cnt_priv to maintain this value (it can be unsigned int
instead of atomic_t):

	static int interrupt_cnt_read(struct counter_device *counter,
				      struct counter_count *count,
				      unsigned long *val)
	{
		struct interrupt_cnt_priv *priv = counter->priv;

		*val = kstat_irqs_usr(priv->irq) - priv->count;

		return 0;
	}

	static int interrupt_cnt_write(struct counter_device *counter,
				       struct counter_count *count,
				       const unsigned long val)
	{
		struct interrupt_cnt_priv *priv = counter->priv;

		/* kstat_irqs_usr() returns unsigned int */
		if (val != (unsigned int)val)
			return -ERANGE;

		priv->count = val;

		return 0;
	}
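
The offset arithmetic behind this idea can be tried outside the kernel.
What follows is only a minimal user-space sketch under stated
assumptions: fake_irq_total is a hypothetical stand-in for the value
kstat_irqs_usr() would return, and struct sim_cnt_priv, sim_cnt_read()
and sim_cnt_write() are illustrative names that are not part of the
patch. Note that this variant stores "total - val" as the baseline so
that a written value reads back unchanged, whereas the sketch above
stores the written value directly.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for kstat_irqs_usr(): interrupts seen so far. */
static unsigned int fake_irq_total;

/* Illustrative private data; not the driver's real structure. */
struct sim_cnt_priv {
	unsigned int baseline;	/* offset subtracted from the running total */
};

static int sim_cnt_read(const struct sim_cnt_priv *priv, unsigned long *val)
{
	/* Unsigned subtraction wraps modulo UINT_MAX + 1, so the
	 * difference stays valid even after the total overflows. */
	*val = fake_irq_total - priv->baseline;

	return 0;
}

static int sim_cnt_write(struct sim_cnt_priv *priv, unsigned long val)
{
	/* The simulated system total is an unsigned int */
	if (val > UINT_MAX)
		return -ERANGE;

	/* Choose the baseline so that an immediate read returns val. */
	priv->baseline = fake_irq_total - (unsigned int)val;

	return 0;
}
```

With this arrangement the system-wide total is never modified; writing 0
simply records the current total as the new baseline, and later reads
report only the interrupts that arrived after the write.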

> > We are talking about one or two code lines. If we will take some
> > duplication search engine, it will find that major part of the kernel
> > is matching against it.
> >
> > Newer the less, this driver provides a way to reset the counter. Why
> > should we drop this functionality no advantage?
> >
> > > To that end, I'd like to see your cnt_isr() function removed for this
> > > patchset (you can bring it back once timestamp support is added).
>
> It make no sense to request an interrupt without interrupt service
> routine.
>
> https://elixir.bootlin.com/linux/latest/source/kernel/irq/manage.c#L2072
> if (!handler) {
> if (!thread_fn)
> return -EINVAL;
>
> As you can see, requesting an irq need at least handler or thread_fn.
>
> enable_irq: this will explode at least here:
> https://elixir.bootlin.com/linux/latest/source/kernel/irq/manage.c#L778
>
> If he have no IRQ handler and some how was able to enable it, at
> some point this IRQ will be disabled by this code:
> https://elixir.bootlin.com/linux/latest/source/kernel/irq/spurious.c#L410
> if (unlikely(desc->irqs_unhandled > 99900)) {
> /*
> * The interrupt is stuck
> */
> __report_bad_irq(desc, action_ret);
> /*
> * Now kill the IRQ
> */
> printk(KERN_EMERG "Disabling IRQ #%d\n", irq);
> desc->istate |= IRQS_SPURIOUS_DISABLED;
> desc->depth++;
> irq_disable(desc);
>
> With current code, we can't request or enable IRQ without cnt_isr(). Not
> that it is not possible, but it make no sense to me.

What I'm requesting is to remove the interrupt code from this driver for
now, including the cnt_enable_write() callback. Yes, we will need it when
timestamp functionality is added, but currently the Counter subsystem
does not have that functionality yet. Once the Counter character device
interface is merged, then it makes sense to add the interrupt service
routine to push timestamps to the user.

It is still useful to have this driver mainlined even without the
interrupt code: getting the body of this driver merged means a much
easier review of the timestamp code in the future, and users can start
using the current Counter sysfs interface to track their GPIO interrupts.

> > Are you suggesting to enable IRQ without interrupt handler? May be i'm
> > missing some thing.. I do not understand it.
> >
> > > Reimplement your cnt_read/cnt_write() functions to instead use
> > > kstat_irqs_usr() from <linux/kernel_stat.h> to get the current number of
> > > interrupts the IRQ line and use it to derive your count value for this
> > > driver.
>
> irq descriptor has 3 counters:
> - irq_count: this value can be reset any time by the kernel at least by
> the note_interrupt()
> - irqs_unhandled: this value is increased in case of missing irq
> handler. Or if handler has returns IRQ_NONE.
> - tot_count: this value should not be reset.
>
> Non of this values is suitable for cnt_read() and cnt_write(). Only
> tot_count would be suitable if cnt_write() is removed. I do not see it
> as acceptable option.
>
> For this driver, we still need extra counter, where only this driver is
> responsible for writing to it.

Yes, I'm sorry for not being clear before. Please use priv->count for
this; there's no need to directly adjust the system IRQ count.

> > I can follow the counter read way, but overwriting system wide counter
> > for local use is bad idea.
> >
> > > > > > +static struct counter_signal event_cnt_signals[] = {
> > > > > > + {
> > > > > > + .id = 0,
> > > > > > + .name = "Channel 0 signal",
> > > > >
> > > > > You should choose a more description name for this Signal;
> > > > > "Channel 0 signal" isn't very useful information for the user. Is this
> > > > > signal the respective GPIO line state?
> > > >
> > > > Sounds plausible. How about "Channel 0, GPIO line state"?
> > >
> > > Ideally, this would match the GPIO name (or I suppose the IRQ number if
> > > not a GPIO line). So in your probe() function you can do something like
> > > this I believe:
> > >
> > > cnt_signals[0].name = priv->gpio->name;
> >
>
> > to make this possible, i would need hack gpiolib framework and add
> > name/label exporter. But after endless rounds of pingponging me for
> > renaming the driver and removing interrupt handler, i feel like we are
> > not having serious discussion for mainlining this driver.
>
> Probably for good reason, struct gpio_desc was made local and is located
> in the drivers/gpio/gpiolib.h. It feels like additional hack to include
> it. I assume, it should be done properly so there is a function to
> provide gpio name or label.
>
> @Linus Walleij are there any good way to get the GPIO name? And which
> name will be actually used? A label provided over devicetree?

Perhaps one of the GPIO subsystem maintainers can provide more guidance
here, but I was under the impression that this name was provided
statically by the respective GPIO driver via their struct gpio_chip. I
think you can see the array of names via priv->gpio->gdev->chip->names.

Alternatively, we can take a more generic approach: ignore the GPIO
names and focus solely on the IRQ lines; because the GPIO lines will
always be tied to respective IRQ lines here, using the IRQ as the basis
of the name should always be valid. The "name" member of the struct
irq_chip can work for this. I haven't tested this, but I think something
like this would work:

	cnt_signals[0].name = irq_get_chip(priv->irq)->name;

> If I see it correctly, it would need more work to make the kernel infrastructure
> suitable for this suggestions. Some of them are only needed before
> chardev support will go mainline and , in long term, not worth to
> spend time on it.

I disagree, I think there is benefit in getting this driver merged
even if we don't have the interrupt service routine. Although I
recommend we keep this initial patch simple to introduce the driver,
later on you can for example add support for other Counter sysfs
attributes such as "ceiling" and "floor" if users want to specify count
limits, or perhaps alternative count functions (maybe a user wants
the count to decrease instead of increase with every interrupt).

These other functionalities are tangential to your timestamp interest
for this driver, but I believe they will be useful to users at large as
a convenient way to evaluate, track, and express the interrupt counts on
their system.
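
To make the "ceiling", "floor", and decreasing-count ideas a bit more
concrete, here is a rough user-space sketch. Everything in it is an
assumption made for illustration: struct sim_limits, enum sim_direction,
and sim_apply_events() are hypothetical names, and saturating at the
limits is just one possible policy, not behavior defined by the Counter
subsystem or by this patch.

```c
/* Hypothetical count direction selected by the user. */
enum sim_direction { SIM_COUNT_UP, SIM_COUNT_DOWN };

/* Hypothetical per-count limits; floor <= ceiling is assumed. */
struct sim_limits {
	int floor;
	int ceiling;
	enum sim_direction dir;
};

/*
 * Apply `events` new interrupts to `count` in the configured direction
 * and saturate the result at the floor or ceiling.
 */
static int sim_apply_events(const struct sim_limits *lim, int count,
			    unsigned int events)
{
	/* long long is wide enough that int +/- unsigned int cannot overflow */
	long long next = count;

	if (lim->dir == SIM_COUNT_UP)
		next += events;
	else
		next -= events;

	if (next > lim->ceiling)
		next = lim->ceiling;
	if (next < lim->floor)
		next = lim->floor;

	return (int)next;
}
```

A driver following this approach would feed the difference between two
successive interrupt totals in as `events`; whether saturation or
wrap-around is the better policy is left open here.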

William Breathitt Gray

> Probably I do not understand you and i missing some thing?
>
> Regards,
> Oleksij
> --
> Pengutronix e.K. | |
> Steuerwalder Str. 21 | http://www.pengutronix.de/ |
> 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
> Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
