Re: [PATCH 3/3] ring-buffer: make cpu buffer entries counter atomic

From: Frederic Weisbecker
Date: Fri May 01 2009 - 18:15:41 EST


On Fri, May 01, 2009 at 10:28:13AM -0400, Steven Rostedt wrote:
>
>
>
> On Fri, 1 May 2009, Frederic Weisbecker wrote:
>
> > On Fri, May 01, 2009 at 01:50:47PM +0200, Ingo Molnar wrote:
> > >
> > > * Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > >
> > > > From: Steven Rostedt <srostedt@xxxxxxxxxx>
> > > >
> > > > The entries counter in cpu buffer is not atomic. Although it only
> > > > gets updated by a single CPU, interrupts may come in and update
> > > > the counter too. This would cause missing entries to be added.
> > >
> > > > - unsigned long entries;
> > > > + atomic_t entries;
> > >
> > > Hm, that's not really good as atomics can be rather expensive and
> > > this is the fastpath.
> > >
> > > This is the upteenth time or so that the fact that we do not disable
> > > irqs while generating trace entries bites us in one way or another.
> > > IRQs can come in and confuse function trace output, etc. etc.
> > >
> > > Please lets do what i suggested a long time ago: disable irqs _once_
> > > in any trace point and run atomically from that point on, and enable
> > > them once, at the end.
> > >
> > > The cost is very small and it turns into a win immediately by
> > > elimination of a _single_ atomic instruction. (even on Nehalem they
> > > cost 20 cycles. More on older CPUs.) We can drop the preempt-count
> > > disable/enable as well and a lot of racy code as well. Please.
> > >
> > > Ingo
> >
> >
> > I also suspect one other good effect on doing this.
> >
> > As you know, between a lock_reserve and a discard, several interrupts
> > can trigger some traces. It means that if some rooms have already been
> > reserved, the discard will really create a discarded entry and we
> > can't reuse it.
> >
> > For example in the case of filters with lock tracing, we rapidly run
> > into entries overriden, making the lock events tracing about useless
> > because we rapidly lose everything.
>
> If you want, we can disable interrupts from the event tracer, not the ring
> buffer.
>
> We would have to go back to the original ring buffer code that passed in
> flags.



Or may be just copy the entry on the stack and filter on it,
then only reserve if it is eligible.



> >
> > At least that's an effect I observed. I'm not sure the discard is the
> > real cause but it seems to make sense.
> >
> > That's a pity because believe me it is very useful to hunt a softlockup.
> >
> > Of course it doesn't prevent from NMI tempest, but we already have
> > protections for that.
>
> If we do not allow interrupts to be traced, we can not allow NMIs either.
> If we do not let the ring buffer be re-entrant, then we will not be able
> to trace any NMI (to be safe).
>
> Going this route, there would be no need to make a lockless ring buffer
> either.


Hm that's really not what we want, indeed :-)
Just forget about what I said.


>
> -- Steve
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/