Re: [PATCH] printk: avoid livelock if another CPU printks continuously

From: Petr Mladek
Date: Thu Feb 11 2016 - 06:47:17 EST


On Thu 2016-02-11 17:21:12, Sergey Senozhatsky wrote:
> Hello,
> Thanks for Cc-ing, and sorry for long reply, I'm traveling now.
>
> On (02/10/16 11:25), Steven Rostedt wrote:
> > On Wed, 10 Feb 2016 17:10:16 +0100
> > Petr Mladek <pmladek@xxxxxxxx> wrote:
> >
> > > > Note, it's not that performance critical, and the loop only happens if
> > > > someone else is adding to the console, which hopefully, should be rare.
> > >
> > > I probably used too strong words. It is possible that the performance
> > > impact will not be critical. But the behavior is non-deterministic.
> > > I think that the approach taken by Jack is more promising.
> > > I mean the offloading of the console stuff to a workqueue.
> >
> > My worry about that is that it never comes out. The point about printk,
> > is that it should pretty much be guaranteed to print. If the system is
> > dying, and we push it off to a work queue, and that workqueue never
> > runs, then we lose critical data.
>
> correct, IIRC Jan agreed to switch to 'direct' (current behaviour) printk when
> one of the CPUs calls panic() (we still can use that approach even with
> workqueue based printk)
> http://marc.info/?l=linux-kernel&m=145200464309562

Yup.
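
To make the fallback idea concrete, here is a small userspace model of
"deferred console output, but direct output once panic() was called".
It is only a sketch under stated assumptions: it is not kernel code and
not taken from any posted patch, and every name in it (model_printk,
model_panic, model_worker_run, console_flush) is hypothetical.

```c
/* Userspace model only; none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define LOG_SLOTS 8
#define LOG_LINE  64

static char log_buf[LOG_SLOTS][LOG_LINE];
static unsigned int log_next;      /* next slot to store a message in */
static unsigned int con_next;      /* next slot the console still has to print */
static unsigned int con_printed;   /* number of messages that reached the console */
static bool panic_in_progress;     /* set once model_panic() was called */
static bool flush_pending;         /* a deferred flush was requested */

/* Push everything stored so far to the "console" (stdout here). */
static void console_flush(void)
{
	while (con_next != log_next) {
		fputs(log_buf[con_next % LOG_SLOTS], stdout);
		con_next++;
		con_printed++;
	}
	flush_pending = false;
}

/* Store the message; print it directly only when panicking. */
static void model_printk(const char *msg)
{
	/* The model does not handle overwriting of unprinted messages. */
	assert(log_next - con_next < LOG_SLOTS);

	snprintf(log_buf[log_next % LOG_SLOTS], LOG_LINE, "%s\n", msg);
	log_next++;

	if (panic_in_progress)
		console_flush();        /* direct mode: do not rely on a worker */
	else
		flush_pending = true;   /* deferred mode: leave it to the worker */
}

/* Switch to direct mode first, then print the panic message. */
static void model_panic(const char *msg)
{
	panic_in_progress = true;
	model_printk(msg);
}

/* Stands in for the work item that would run from a workqueue. */
static void model_worker_run(void)
{
	if (flush_pending)
		console_flush();
}
```

In this model, messages stored before the panic are flushed together
with the panic message, even if the worker never runs again.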

> the other thing with workqueues based approach is that all of them can be 'blocked'
> in some OOM cases, so sort of fallback mechanism is also needed here
> http://marc.info/?l=linux-kernel&m=145251885502488

If this proves to be a problem, we could always use a workqueue with a
rescue worker.
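
For illustration, a workqueue with a rescue worker is what you get when
the WQ_MEM_RECLAIM flag is passed to alloc_workqueue(). The fragment
below is a minimal sketch written as a standalone module, assuming that
a console_lock()/console_unlock() pair is enough to flush the pending
messages. The names printk_wq, printk_flush_fn and printk_flush_work
are hypothetical and do not come from any posted patch.

```c
#include <linux/console.h>
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/workqueue.h>

/* Hypothetical dedicated workqueue for console output. */
static struct workqueue_struct *printk_wq;

/* Assumption: console_unlock() pushes the pending messages to consoles. */
static void printk_flush_fn(struct work_struct *work)
{
	console_lock();
	console_unlock();
}
static DECLARE_WORK(printk_flush_work, printk_flush_fn);

static int __init printk_wq_init(void)
{
	/*
	 * WQ_MEM_RECLAIM guarantees at least one execution context
	 * (a rescuer thread) regardless of memory pressure.
	 */
	printk_wq = alloc_workqueue("printk_wq", WQ_MEM_RECLAIM, 0);
	if (!printk_wq)
		return -ENOMEM;

	/* Request one deferred flush as a demonstration. */
	queue_work(printk_wq, &printk_flush_work);
	return 0;
}

static void __exit printk_wq_exit(void)
{
	cancel_work_sync(&printk_flush_work);
	destroy_workqueue(printk_wq);
}

module_init(printk_wq_init);
module_exit(printk_wq_exit);
MODULE_LICENSE("GPL");
```

The rescuer only helps when worker creation is blocked by memory
pressure. It does not replace the direct printk fallback for panic().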

Regarding the patch from Pan Xinhui: my main problem with it is that
it adds many handshakes and twists to the already complicated printk
code. Also, it does not solve the problem if the flood of messages
comes entirely from IRQ context.

The workqueue code is not trivial, but it is mature. And the usage of
workqueues in printk is quite straightforward.

Best Regards,
Petr