Re: [PATCH v3 1/4] printk/nmi: Generic solution for safe printk in NMI

From: Petr Mladek
Date: Thu Dec 10 2015 - 10:26:19 EST


On Wed 2015-12-09 15:50:07, Andrew Morton wrote:
> On Wed, 9 Dec 2015 14:21:02 +0100 Petr Mladek <pmladek@xxxxxxxx> wrote:
>
> > printk() takes some locks and could not be used a safe way in NMI
> > context.
> >
> > The chance of a deadlock is real especially when printing
> > stacks from all CPUs. This particular problem has been addressed
> > on x86 by the commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack
> > trace on all CPUs").
> >
> > This patch reuses most of the code and makes it generic. It is
> > useful for all messages and architectures that support NMI.
> >
> > The patch is heavily based on the draft from Peter Zijlstra,
> > see https://lkml.org/lkml/2015/6/10/327
> >
>
> I guess this code is useful even on CONFIG_SMP=n: to avoid corruption
> of the printk internal structures whcih the problematic locking
> protects.

Yup and it is used even on CONFIG_SMP=n if I am not missing
something. At least, CONFIG_PRINTK_NMI stays enabled here.


> > +#define NMI_LOG_BUF_LEN (4096 - sizeof(atomic_t) - sizeof(struct irq_work))
> > +
> > +struct nmi_seq_buf {
> > + atomic_t len; /* length of written data */
> > + struct irq_work work; /* IRQ work that flushes the buffer */
> > + unsigned char buffer[NMI_LOG_BUF_LEN];
>
> When this buffer overflows, which characters get lost? Most recent or
> least recent?

The most recent messages are lost when the buffer overflows. The other
way would require to use a ring-buffer instead the seq_buf. We would need
a lock-less synchronization for both, begin and end, pointers. It
would add quite some complications.


> I'm not sure which is best, really. For an oops trace you probably
> want to preserve the least recent output: the stuff at the start of the
> output.

I agree. Fortunately, this is easier and it works this way.


> > +static void __printk_nmi_flush(struct irq_work *work)
> > +{
> > + static raw_spinlock_t read_lock =
> > + __RAW_SPIN_LOCK_INITIALIZER(read_lock);
> > + struct nmi_seq_buf *s = container_of(work, struct nmi_seq_buf, work);
> > + int len, size, i, last_i;
> > +
> > + /*
> > + * The lock has two functions. First, one reader has to flush all
> > + * available message to make the lockless synchronization with
> > + * writers easier. Second, we do not want to mix messages from
> > + * different CPUs. This is especially important when printing
> > + * a backtrace.
> > + */
> > + raw_spin_lock(&read_lock);
> > +
> > + i = 0;
> > +more:
> > + len = atomic_read(&s->len);
> > +
> > + /*
> > + * This is just a paranoid check that nobody has manipulated
> > + * the buffer an unexpected way. If we printed something then
> > + * @len must only increase.
> > + */
> > + WARN_ON(i && i >= len);
>
> hm, dumping a big backtrace in this context seems a poor idea. Oh
> well, shouldn't happen.

I see and the backtrace probably would not help much because "len"
might be manipulated also from NMI context. I am going to change
this to:

if (i && i >= len)
pr_err("printk_nmi_flush: internal error: i=%d >=
len=%d)\n", i, len);



> > + if (!len)
> > + goto out; /* Someone else has already flushed the buffer. */
> > +
> > + /* Make sure that data has been written up to the @len */
> > + smp_rmb();
> > +
> > + size = min_t(int, len, sizeof(s->buffer));
>
> len and size should have type size_t.

OK

> > --- /dev/null
> > +++ b/kernel/printk/printk.h
>
> I find it a bit irritating to have duplicated filenames. We could
> follow convention and call this "internal.h".

No problem. I am going to send an updated patchset soon.

Thanks a lot for review,
Petr
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/