Re: [PATCH 1/3] irq_work: Implement remote queueing

From: Peter Zijlstra
Date: Wed May 14 2014 - 07:54:20 EST


On Wed, May 14, 2014 at 01:38:14PM +0200, Frederic Weisbecker wrote:
> > > +bool irq_work_queue_on(struct irq_work *work, int cpu)
> > > +{
> > > + /* Only queue if not already pending */
> > > + if (!irq_work_claim(work))
> > > + return false;
> > > +
> > > + /* All work should have been flushed before going offline */
> > > + WARN_ON_ONCE(cpu_is_offline(cpu));
> >
> > WARN_ON_ONCE(in_nmi());
>
> Well... I think it's actually NMI-safe.

I don't think it is, most apic calls do apic_wait_icr_idle() then the
apic op, if an NMI happens in between and writes to the APIC, the return
context will see a !idle icr and fail.

This is why arch_irq_work_raise() again idles the icr after sending the
IPI.

Also, I think, seeing what benh said earlier, its unsafe for other archs
too.

> > > + llist_add(&work->llnode, &per_cpu(irq_work_list, cpu));
> > > + native_send_call_func_single_ipi(cpu);
> >
> > At the very leastestest make that:
> >
> > if (llist_add(&work->llnode, &per_cpu(irq_work_list, cpu)))
> > native_send_call_func_single_ipi(cpu);
>
> So yeah the issue is that we may have IRQ_WORK_LAZY in the queue. And
> if we have only such work in the queue, nobody has raised before us.
>
> So we can't just test with llist_add(). Or if we do, we must then
> separate raised and lazy list.

Then do the remote irq_work_raised thing. But it really stinks you broke
this very nice and simple thing.

> Also note that nohz is the only user for now and irq_work_claim() thus
> prevents from double IPI. Of course if more users come up the issue arise
> again.

DANGER, half arsed engineering at work, seriously? Just write proper
code already.

There's no fucking way the next user will check the implementation to
make sure its 'sane'.

> Maybe I was paranoid but I was worried about the overhead of printk() wakeups
> on boot if implemented with IPIs.
>
> Of course if I can be proven that it won't bring much damage to use an IPI, I'd
> be very happy to remove it.

That's the wrong fucking way around, first proof its needed then do
something about it. As is I think the LAZY thing is horrid.


Attachment: pgpfxLip115NI.pgp
Description: PGP signature