Re: [PATCH 1/4] genirq: Provide basic NMI management for interrupt lines

From: Ricardo Neri
Date: Wed Aug 01 2018 - 22:08:34 EST


On Wed, Aug 01, 2018 at 04:09:59PM +0100, Julien Thierry wrote:
> >>+static bool irq_supports_nmi(struct irq_desc *desc)
> >>+{
> >>+ struct irq_data *d = irq_desc_get_irq_data(desc);
> >>+
> >>+#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
> >>+ /* Only IRQs directly managed by the root irqchip can be set as NMI */
> >>+ if (d->parent_data)
> >>+ return false;
> >>+#endif
> >
> >I don't think this works in x86 because the local APIC irq_chip is the
> >root of the hierarchy; the rest of the irq_chips are children of it.
> >Furthermore, the delivery mode (i.e., regular interrupt vs NMI) is set in
> >these children controllers, either MSI or interrupt remapping.
> >
> >Instead, could it be possible to determine if NMI mode is supported by
> >checking if any of the parents of the irq_data supports NMI mode? If it
> >has to be the root, perhaps a flag can indicate irq_supports_nmi() about
> >this restriction.
> >
>
> I see, I'm not very sure how to deal with the semantics here. Does it make
> sense for an irqchip to be able to deliver NMIs if its parent doesn't?

I think it does not make sense.

>
> If we want to handle NMI in and irqchip hierachy, shouldn't all irqchips
> between the one delivering the NMI and the root irqchip also be able to
> deliver NMIs? (Or is it the other way around and an irqchip can deliver an
> NMI if all its children irqchip are also able to deliver an NMI?)

At least in APIC, all the irq_chips support NMI delivery. However only one
of them will generate the NMI interrupt message, the rest (e.g., the local
APIC) will only relay it to the CPU.

So yes, I agree that the whole hierarchy should support NMI delivery, but
the chip generating the NMI is not necessarily the root chip.

One question that I have is that whether the new flag IRQCHIP_SUPPORTS_NMI
should be added to irq_chips that only relay NMIs or only to those from
which an NMI can be requested.

>
> >>+ /* Don't support NMIs for chips behind a slow bus */
> >>+ if (d->chip->irq_bus_lock || d->chip->irq_bus_sync_unlock)
> >>+ return false;
> >>+
> >>+ return d->chip->flags & IRQCHIP_SUPPORTS_NMI;
> >>+}
> >>+
> >>+static int irq_nmi_setup(struct irq_desc *desc)
> >>+{
> >>+ struct irq_data *d = irq_desc_get_irq_data(desc);
> >>+ struct irq_chip *c = d->chip;
> >>+
> >>+ return c->irq_nmi_setup ? c->irq_nmi_setup(d) : -EINVAL;
> >
> >Continuing with my comment above, the child can decide whether to configure
> >NMI mode by itself or let it be done by a chip located higher in the
> >hierarchy. This is already done in other situations. For instance, in MSI
> >interrupts with interrupt remapping: the message data is populated by the
> >remapping controller.
> >
>
> Yes, if we want to handle the chip hierarchy, this will need modification.
>
> >>+}
> >>+
> >>+static void irq_nmi_teardown(struct irq_desc *desc)
> >>+{
> >>+ struct irq_data *d = irq_desc_get_irq_data(desc);
> >>+ struct irq_chip *c = d->chip;
> >>+
> >>+ if (c->irq_nmi_teardown)
> >>+ c->irq_nmi_teardown(d);
> >>+}
> >>+
> >> static int
> >> setup_irq_thread(struct irqaction *new, unsigned int irq, bool secondary)
> >> {
> >>@@ -1282,9 +1352,17 @@ static void irq_release_resources(struct irq_desc *desc)
> >> * fields must have IRQF_SHARED set and the bits which
> >> * set the trigger type must match. Also all must
> >> * agree on ONESHOT.
> >>+ * Interrupt lines used for NMIs cannot be shared.
> >> */
> >> unsigned int oldtype;
> >>+ if (desc->istate & IRQS_NMI) {
> >>+ pr_err("Invalid attempt to share NMI for %s (irq %d) on irqchip %s.\n",
> >>+ new->name, irq, desc->irq_data.chip->name);
> >>+ ret = -EINVAL;
> >>+ goto out_unlock;
> >>+ }
> >>+
> >> /*
> >> * If nobody did set the configuration before, inherit
> >> * the one provided by the requester.
> >>@@ -1733,6 +1811,59 @@ const void *free_irq(unsigned int irq, void *dev_id)
> >> }
> >> EXPORT_SYMBOL(free_irq);
> >>+/* This function must be called with desc->lock held */
> >>+static const void *__cleanup_nmi(unsigned int irq, struct irq_desc *desc)
> >>+{
> >>+ const char *devname = NULL;
> >>+
> >>+ desc->istate &= ~IRQS_NMI;
> >>+
> >>+ if (!WARN_ON(desc->action == NULL)) {
> >>+ irq_pm_remove_action(desc, desc->action);
> >>+ devname = desc->action->name;
> >>+ unregister_handler_proc(irq, desc->action);
> >>+
> >>+ kfree(desc->action);
> >>+ desc->action = NULL;
> >>+ }
> >>+
> >>+ irq_settings_clr_disable_unlazy(desc);
> >>+ irq_shutdown(desc);
> >>+
> >>+ irq_release_resources(desc);
> >>+
> >>+ irq_chip_pm_put(&desc->irq_data);
> >>+ module_put(desc->owner);
> >>+
> >>+ return devname;
> >>+}
> >>+
> >>+const void *free_nmi(unsigned int irq, void *dev_id)
> >>+{
> >>+ struct irq_desc *desc = irq_to_desc(irq);
> >>+ unsigned long flags;
> >>+ const void *devname;
> >>+
> >>+ if (!desc || WARN_ON(!(desc->istate & IRQS_NMI)))
> >>+ return NULL;
> >>+
> >>+ if (WARN_ON(irq_settings_is_per_cpu_devid(desc)))
> >>+ return NULL;
> >>+
> >>+ /* NMI still enabled */
> >>+ if (WARN_ON(desc->depth == 0))
> >>+ disable_nmi_nosync(irq);
> >>+
> >>+ raw_spin_lock_irqsave(&desc->lock, flags);
> >>+
> >>+ irq_nmi_teardown(desc);
> >>+ devname = __cleanup_nmi(irq, desc);
> >>+
> >>+ raw_spin_unlock_irqrestore(&desc->lock, flags);
> >>+
> >>+ return devname;
> >>+}
> >>+
> >> /**
> >> * request_threaded_irq - allocate an interrupt line
> >> * @irq: Interrupt line to allocate
> >>@@ -1902,6 +2033,101 @@ int request_any_context_irq(unsigned int irq, irq_handler_t handler,
> >> }
> >> EXPORT_SYMBOL_GPL(request_any_context_irq);
> >>+/**
> >>+ * request_nmi - allocate an interrupt line for NMI delivery
> >>+ * @irq: Interrupt line to allocate
> >>+ * @handler: Function to be called when the IRQ occurs.
> >>+ * Threaded handler for threaded interrupts.
> >>+ * @flags: Interrupt type flags
> >>+ * @name: An ascii name for the claiming device
> >>+ * @dev_id: A cookie passed back to the handler function
> >>+ *
> >>+ * This call allocates interrupt resources and enables the
> >>+ * interrupt line and IRQ handling. It sets up the IRQ line
> >>+ * to be handled as an NMI.
> >>+ *
> >>+ * An interrupt line delivering NMIs cannot be shared and IRQ handling
> >>+ * cannot be threaded.
> >>+ *
> >>+ * Interrupt lines requested for NMI delivering must produce per cpu
> >>+ * interrupts and have auto enabling setting disabled.
> >>+ *
> >>+ * Dev_id must be globally unique. Normally the address of the
> >>+ * device data structure is used as the cookie. Since the handler
> >>+ * receives this value it makes sense to use it.
> >>+ *
> >>+ * If the interrupt line cannot be used to deliver NMIs, function
> >>+ * will fail and return a negative value.
> >>+ */
> >>+int request_nmi(unsigned int irq, irq_handler_t handler,
> >>+ unsigned long irqflags, const char *name, void *dev_id)
> >>+{
> >>+ struct irqaction *action;
> >>+ struct irq_desc *desc;
> >>+ unsigned long flags;
> >>+ int retval;
> >>+
> >>+ if (irq == IRQ_NOTCONNECTED)
> >>+ return -ENOTCONN;
> >>+
> >>+ /* NMI cannot be shared, used for Polling */
> >>+ if (irqflags & (IRQF_SHARED | IRQF_COND_SUSPEND | IRQF_IRQPOLL))
> >>+ return -EINVAL;
> >>+
> >>+ if (!(irqflags & IRQF_PERCPU))
> >>+ return -EINVAL;
> >>+
> >>+ if (!handler)
> >>+ return -EINVAL;
> >
> >It would not be possible to use this handler in x86. All the NMI handlers are
> >put in a list populated by register_nmi_handler(). Could it be possible to
> >have a wrapper function that does not take a handler?
> >
>
> If it is just a matter of having a dummy handler, x86 NMI requesters could
> provide it (and you maintain your list on the side). I'm not sure there is a
> necessity for it to be in the core code.
>
> I guess the x86 flow handler for NMI simply doesn't make use of the action
> structure.
>
> I can add the dummy handler (and the request_nmi_no_handler) option, I'm
> just unsure it makes sense to include it in the irq core code. (I can always
> put it in a separate patch and see if it gets accepted).

Hmm, either alternatives would look ugly but I don't see a cleaner solution.
My suggestion would be to keep the handler as an argument of request_nmi()
and have callers to implement their dummy handlers. In this way, the core
code can be kept clean.

>
> >>+
> >>+ desc = irq_to_desc(irq);
> >>+
> >>+ if (!desc || irq_settings_can_autoenable(desc)
> >>+ || !irq_settings_can_request(desc)
> >
> >Is this a hard requirement for ARM? I guess it is, because you added
> >enable_nmi() for this very purpose. I could not find any irq in the x86
> >core code that uses the IRQ_NOAUTOEN flag.
> >
>
> On Arm64, we do use IRQ_NOAUTOEN and it is used for one of the interrupts
> we'd like to handle as NMI. So we do need IRQ_NOAUTOEN to be supported for
> NMIs.

I am not sure what would be the changes to add IRQ_NOAUTOEN to a specific
interrupt in my use case and whether these changes would be regarded as a
hack for a special case. I'll investigate and get back to you.

>
> However, the rationale for forcing it for NMIs was:
> - I wanted to avoid too many things being done automatically for NMIs, this
> way most of the responsability of managing the NMI is on the NMI requester
> rather than the core code.

But doesn't the NMI requester already have some control? It can control at
least the source of the NMIs (e.g., it can control when to start the timer
that would generate the NMI).

>
> - There would be a need to pass down the information that we are requesting
> an NMI to __setup_irq and do the proper setup when enabling depending on
> whether we have an NMI or not.
>
> I'm not completely against supporting autoenable for NMIs, but I'd like to
> have the opinion of others on the matter

+1

Thanks and BR,
Ricardo