Re: [PATCH 0/4 v6] Avoid softlockups in console_unlock()

From: Andrew Morton
Date: Thu Aug 22 2013 - 15:49:18 EST


On Thu, 22 Aug 2013 00:59:15 +0200 Jan Kara <jack@xxxxxxx> wrote:

> On Wed 21-08-13 14:27:23, Andrew Morton wrote:
> > On Wed, 21 Aug 2013 10:08:28 +0200 Jan Kara <jack@xxxxxxx> wrote:
> >
> > > These patches avoid softlockups when a CPU gets stuck in console_unlock() for
> > > a long time during heavy printing from other CPUs. As discussed in patch 3/4,
> > > it isn't enough to just silence the watchdog: if a CPU spends too long in
> > > console_unlock(), RCU will also complain, other CPUs can be blocked waiting for
> > > the printing CPU to process an IPI, and a disk can even be offlined because
> > > commands couldn't be delivered to it for too long.
> > >
> > > This patch series solves the problem by stopping printing in console_unlock()
> > > after 1000 characters and postponing the rest of the printing to irq work. To
> > > avoid hogging a single CPU (irq work is processed on the CPU where it was
> > > queued, so it doesn't really reduce the printing load on that CPU), we
> > > introduce a new type of lazy irq work, IRQ_WORK_UNBOUND, which can be
> > > processed by any CPU.
> >
> > I still hate the patchset :(
> >
> > Remind us why we need this? Whose kernel is spewing so much logging and
> > why?
> We have customers (quite a few of them, actually) who have machines with
> lots of SCSI disks attached (due to multipath etc.). During boot, when
> these disks are discovered and partitions are set up, quite a lot of printing
> happens; multiplied by the number of devices (1000+), it is too much for a
> serial console to handle quickly enough. So these machines aren't able to
> boot with serial console enabled.

It sounds like rather a corner case, not worth mucking up the critical
core logging code.

Desperately seeking alternatives...

I suppose there's some reason why we can't just make those drivers shut
up? If the messages are in the log buffer but aren't displayed,
they're still accessible after boot?

Or how about passing those messages over to a kernel thread, to be
printed out at a lower rate? A linked list and schedule_work() would
suffice.
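
A rough user-space sketch of that last idea, for illustration only. It
assumes a singly linked FIFO of copied messages and a drain routine that
emits at most a fixed number of messages per call. All names here
(deferred_msg, msg_queue, queue_msg, drain_some) are hypothetical and are
not kernel APIs. In the kernel, the producer would presumably take a lock
and call schedule_work(), and the work function would do the bounded drain
and requeue itself while messages remain; none of that is shown.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space analogue; these are not kernel structures. */
struct deferred_msg {
	struct deferred_msg *next;
	char text[];			/* copy of the message, NUL-terminated */
};

struct msg_queue {
	struct deferred_msg *head;	/* oldest message, printed first */
	struct deferred_msg *tail;	/* newest message */
};

/*
 * Producer side: append a copy of the message to the tail of the list.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int queue_msg(struct msg_queue *q, const char *text)
{
	size_t len = strlen(text) + 1;
	struct deferred_msg *m = malloc(sizeof(*m) + len);

	if (!m)
		return -1;
	memcpy(m->text, text, len);
	m->next = NULL;
	if (q->tail)
		q->tail->next = m;
	else
		q->head = m;
	q->tail = m;
	return 0;
}

/*
 * Worker side: print at most max_msgs messages, oldest first, then stop.
 * Returns the number of messages printed, so the caller can decide
 * whether another pass is needed.
 */
static size_t drain_some(struct msg_queue *q, size_t max_msgs, FILE *out)
{
	size_t n = 0;

	while (n < max_msgs && q->head) {
		struct deferred_msg *m = q->head;

		q->head = m->next;
		if (!q->head)
			q->tail = NULL;
		fputs(m->text, out);
		free(m);
		n++;
	}
	return n;
}
```

A caller would start from an empty queue (both pointers NULL), call
queue_msg() for each message, and have the worker call drain_some() with a
small budget on each pass. The budget is what keeps any single pass short,
which is the property the thread is after; the right value and the locking
are left open here.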