Re: [PATCH v4 2/2] printk: console: Support console-specific loglevels

From: Petr Mladek
Date: Tue Apr 25 2023 - 10:57:59 EST


On Thu 2023-04-20 13:39:32, Chris Down wrote:
> Consoles can have vastly different latencies and throughputs. For
> example, writing a message to the serial console can take on the order
> of tens of milliseconds to get the UART to successfully write a message.
> While this might be fine for a single, one-off message, this can cause
> significant application-level stalls in situations where the kernel
> writes large amounts of information to the console.
>
> This means that while you might want to send at least INFO level
> messages to (for example) netconsole, which is relatively fast, you may
> only want to send at least WARN level messages to the serial console.
> Such an implementation would permit debugging using the serial console
> in cases that netconsole doesn't receive messages during particularly
> bad system issues, while still keeping the noise low enough to avoid
> inducing latency in userspace applications. This patch adds such an
> interface, extending the existing console loglevel controls to allow
> each console to have its own loglevel.
>
> One can't just disable the serial console, because one may actually need
> it in situations where the machine is in a bad enough state that nothing
> is received on netconsole. One also can't just bump the loglevel at
> runtime after the issue, because usually the machine is already so
> wedged by this point that it isn't responsive to such requests.
>
> In terms of technical implementation, this patch embeds a device pointer
> in the console struct, and registers each console using it so we can
> expose attributes in sysfs. We currently expose the following
> attributes:
>
> % ls -l /sys/class/console/ttyS/
> total 0
> lrwxrwxrwx 1 root root 0 Jul 20 17:37 subsystem -> ../../../../class/console/
> -r--r--r-- 1 root root 4096 Jul 20 17:38 effective_loglevel
> -r--r--r-- 1 root root 4096 Jul 20 17:38 effective_loglevel_source
> -r--r--r-- 1 root root 4096 Jul 20 17:38 enabled
> -rw-r--r-- 1 root root 4096 Jul 20 17:38 loglevel
> -rw-r--r-- 1 root root 4096 Jul 20 17:37 uevent
>
> The lifecycle of this classdev looks like this on registration:
>
> register_console(con)/printk_late_init()
> console_register_device(con)
> device_initialize(con->classdev) # refcount++
> device_add(con->classdev) # refcount++
>
> At stable state, the refcount is two.
>
> Console unregistration looks like this:
>
> [con->classdev refcount drops to 0]
> console_classdev_release(con->classdev)
> kfree(con->classdev)
>
> unregister_console(con)
> device_unregister(con->classdev)
> device_del(con->classdev) # refcount--
> device_remove_class_symlinks()
> kernfs_remove_by_name_ns()
> kernfs_drain()
> kernfs_drain_open_files() # wait for close()
> put_device(con->classdev) # refcount--
>
> We also deprecate the kernel.printk sysctl as it doesn't know about
> per-console loglevels, and is generally pretty confusing.
>
> For information on the precedence and application of the new controls,
> see Documentation/ABI/testing/sysfs-class-console and
> Documentation/admin-guide/per-console-loglevel.rst.
>
> Signed-off-by: Chris Down <chris@xxxxxxxxxxxxxx>
> ---
> Documentation/ABI/testing/sysfs-class-console | 47 ++
> Documentation/admin-guide/index.rst | 1 +
> .../admin-guide/kernel-parameters.txt | 28 +-
> .../admin-guide/per-console-loglevel.rst | 92 ++++
> Documentation/admin-guide/serial-console.rst | 17 +-
> Documentation/core-api/printk-basics.rst | 35 +-
> Documentation/networking/netconsole.rst | 17 +
> MAINTAINERS | 3 +
> drivers/tty/sysrq.c | 15 +
> include/linux/console.h | 3 +
> include/linux/printk.h | 9 +
> kernel/printk/console_cmdline.h | 2 +
> kernel/printk/printk.c | 410 +++++++++++++++++-
> kernel/printk/sysctl.c | 57 ++-
> 14 files changed, 688 insertions(+), 48 deletions(-)
> create mode 100644 Documentation/ABI/testing/sysfs-class-console
> create mode 100644 Documentation/admin-guide/per-console-loglevel.rst
>
> diff --git a/Documentation/ABI/testing/sysfs-class-console b/Documentation/ABI/testing/sysfs-class-console
> new file mode 100644
> index 000000000000..35a117d521cf
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-class-console
> @@ -0,0 +1,47 @@
> +What: /sys/class/console/
> +Date: April 2023
> +Contact: Chris Down <chris@xxxxxxxxxxxxxx>
> +Description: Interface for viewing and setting per-console attributes, like
> + the per-console loglevel. For a high-level document describing
> + the motivations for this interface and related non-sysfs
> + controls, see
> + Documentation/admin-guide/per-console-loglevel.rst.
> +
> +What: /sys/class/console/<C>/effective_loglevel
> +Date: April 2023
> +Contact: Chris Down <chris@xxxxxxxxxxxxxx>
> +Description: Read only. The currently effective loglevel for this console.
> + All messages emitted with a loglevel below the effective value
> + will be emitted to the console.
> +
> +What: /sys/class/console/<C>/effective_loglevel_source
> +Date: April 2023
> +Contact: Chris Down <chris@xxxxxxxxxxxxxx>
> +Description: Read only. The currently effective loglevel source for this
> + console -- for example, whether it was set globally, or whether
> + it was set locally for this console.
> +
> + Possible values are:
> + =============== ============================================
> + local The loglevel comes from the per-console
> + loglevel.
> + global The loglevel comes from the global loglevel.
> + ignore_loglevel Both the per-console loglevel and global
> + loglevels are ignored as ignore_loglevel is
> + present on the kernel command line.
> + =============== ============================================
> +
> +What: /sys/class/console/<C>/enabled
> +Date: April 2023
> +Contact: Chris Down <chris@xxxxxxxxxxxxxx>
> +Description: Read only. "1" if the console is enabled, "0" otherwise.
> +
> +What: /sys/class/console/<C>/loglevel
> +Date: April 2023
> +Contact: Chris Down <chris@xxxxxxxxxxxxxx>
> +Description: Read write. The current per-console loglevel, which will take
> + effect if not overridden by other non-sysfs controls (see
> + Documentation/admin-guide/per-console-loglevel.rst). Bounds are
> + 0 (LOGLEVEL_EMERG) to 8 (LOGLEVEL_DEBUG + 1) inclusive. Also
> + takes the special value "-1" to indicate that no per-console
> + loglevel is set, and we should defer to the global controls.
> diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
> index 09a563bbe3e7..875845b02556 100644
> --- a/Documentation/admin-guide/index.rst
> +++ b/Documentation/admin-guide/index.rst
> @@ -116,6 +116,7 @@ configure specific aspects of kernel behavior to your liking.
> namespaces/index
> numastat
> parport
> + per-console-loglevel
> perf-security
> pm/index
> pnp
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 7016cb12dc4e..902f018a5c27 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -722,13 +722,18 @@
> ttyS<n>[,options]
> ttyUSB0[,options]
> Use the specified serial port. The options are of
> - the form "bbbbpnf", where "bbbb" is the baud rate,
> - "p" is parity ("n", "o", or "e"), "n" is number of
> - bits, and "f" is flow control ("r" for RTS or
> - omit it). Default is "9600n8".
> + the form "bbbbpnf,extra", where "bbbb" is the baud
> + rate, "p" is parity ("n", "o", or "e"), "n" is
> + number of bits, and "f" is flow control ("r" for RTS
> + or omit it). Default is "9600n8".
>
> - See Documentation/admin-guide/serial-console.rst for more
> - information. See
> + At present the only extra option is "loglevel" to
> + set the per-console loglevel. For example:
> +
> + console=ttyS0,9600n8,loglevel:3
> +
> + See Documentation/admin-guide/serial-console.rst for
> + more information. See
> Documentation/networking/netconsole.rst for an
> alternative.
>
> @@ -2880,10 +2885,13 @@
> logibm.irq= [HW,MOUSE] Logitech Bus Mouse Driver
> Format: <irq>
>
> - loglevel= All Kernel Messages with a loglevel smaller than the
> - console loglevel will be printed to the console. It can
> - also be changed with klogd or other programs. The
> - loglevels are defined as follows:
> + loglevel= Sets the global loglevel. All messages with a loglevel
> + smaller than the console loglevel will be printed to
> + the console. Note that this can be overridden
> + per-console, see
> + Documentation/admin-guide/per-console-loglevel.rst.
> +
> + The loglevels are defined as follows:
>
> 0 (KERN_EMERG) system is unusable
> 1 (KERN_ALERT) action must be taken immediately
> diff --git a/Documentation/admin-guide/per-console-loglevel.rst b/Documentation/admin-guide/per-console-loglevel.rst
> new file mode 100644
> index 000000000000..4a7ecce7ba8a
> --- /dev/null
> +++ b/Documentation/admin-guide/per-console-loglevel.rst
> @@ -0,0 +1,92 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +.. _per_console_loglevel:
> +
> +Per-console loglevel support
> +============================
> +
> +Motivation
> +----------
> +
> +Consoles can have vastly different latencies and throughputs. For example,
> +writing a message to the serial console can take on the order of tens of
> +milliseconds to get the UART to successfully write a message. While this might
> +be fine for a single, one-off message, this can cause significant
> +application-level stalls in situations where the kernel writes large amounts of
> +information to the console.
> +
> +This means that while you might want to send at least INFO level messages to
> +(for example) netconsole, which is relatively fast, you may only want to send
> +at least WARN level messages to the serial console. This permits debugging
> +using the serial console in cases that netconsole doesn't receive messages
> +during particularly bad system issues, while still keeping the noise low enough
> +to avoid inducing latency in userspace applications.
> +
> +Tunables
> +--------
> +
> +In order to allow tuning this, the following controls exist:
> +
> +Global
> +~~~~~~
> +
> +The global loglevel is set by the ``kernel.console_loglevel`` sysctl, which can
> +also be set as ``loglevel=`` on the kernel command line.
> +
> +The printk module also takes two parameters which modify this behaviour
> +further:
> +
> +* ``ignore_loglevel`` on the kernel command line or set in printk parameters:
> + Emit all messages. All other controls are ignored if this is present.
> +* ``ignore_per_console_loglevel`` on the kernel command line or set in printk
> + parameters: Ignore all per-console loglevels and use the global loglevel.
> +
> +The default value for ``kernel.console_loglevel`` comes from
> +``CONFIG_CONSOLE_LOGLEVEL_DEFAULT``, or ``CONFIG_CONSOLE_LOGLEVEL_QUIET`` if
> +``quiet`` is passed on the kernel command line.
> +
> +Console attributes
> +~~~~~~~~~~~~~~~~~~
> +
> +Registered consoles are exposed at ``/sys/class/console``. For example, if you
> +are using ``ttyS0``, the console backing it can be viewed at
> +``/sys/class/console/ttyS0/``. The following files are available:
> +
> +* ``effective_loglevel`` (r): The effective loglevel after considering all
> + loglevel authorities. For example, if the console-specific loglevel is 3, but
> + the global minimum console loglevel [*]_ is 5, then the value will be 5.

I would rather avoid this example. It confused me ;-)

I thought that it was talking about the "global console loglevel"
instead of "global minimum console loglevel". I was surprised
that the global value was used when a console-specific one
was defined.

</more details about my confusion>
The console loglevel defines a maximal message loglevel.
The important thing is that it defines a "limit". Most people,
including me, do not remember if the numbers go up or down.
So I ignored the word "minimum". I though that it described
the effect of the value and not a limit for the value.

Also the global minimum loglevel is a really weird setting.
I do not see much practical use for it. I guess that most
people are not aware of this limit. I think that it creates
more harm than good.
</more details about my confusion>

My proposal. I would replace the "confusing" paragraph with
something like:

* ``effective_loglevel`` (r): The effective loglevel after considering all
loglevel authorities. For example, it would show the value of
the console-specific loglevel when it is defined or the global
console loglevel value when the console-specific one is not defined.

> +* ``effective_loglevel_source`` (r): The loglevel authority which resulted in
> + the effective loglevel being set. The following values can be present:
> +
> + * ``local``: The console-specific loglevel is in effect.
> + * ``global``: The global loglevel (``kernel.console_loglevel``) is in
> + effect. Set a console-specific loglevel to override it.
> + * ``ignore_loglevel``: ``ignore_loglevel`` was specified on the kernel
> + command line or at ``/sys/module/printk/parameters/ignore_loglevel``.
> + Disable it to use level controls.
> + * ``ignore_per_console_loglevel``: ``ignore_per_console_loglevel`` was
> + specified on the kernel command line or at
> + ``/sys/module/printk/parameters/ignore_per_console_loglevel``. Disable it
> + to use per-console level controls.
> +
> +* ``enabled`` (r): Whether the console is enabled.
> +* ``loglevel`` (rw): The local, console-specific loglevel for this console.
> + This will be in effect if no other global control overrides it. Look at
> + ``effective_loglevel`` and ``effective_loglevel_source`` to verify that.
> +
> +.. [*] The existence of a minimum console loglevel is generally considered to
> + be a confusing and rarely used interface, and as such is not exposed through
> + the modern printk sysctl APIs that obsoleted ``kernel.printk``. Use the
> + legacy ``kernel.printk`` sysctl to control it if you have a rare use case
> + that requires changing it. The default value is ``CONSOLE_LOGLEVEL_MIN``.
> +
> +Deprecated
> +~~~~~~~~~~
> +
> +* ``kernel.printk`` sysctl: this takes four values, setting
> + ``kernel.console_loglevel``, ``kernel.default_message_loglevel``, the minimum
> + console loglevel, and a fourth unused value. The interface is generally
> + considered to quite confusing, doesn't perform checks on the values given,
> + and is unaware of per-console loglevel semantics.
> +
> +Chris Down <chris@xxxxxxxxxxxxxx>, 27-April-2023
> diff --git a/Documentation/admin-guide/serial-console.rst b/Documentation/admin-guide/serial-console.rst
> index 58b32832e50a..4e204115fe4a 100644
> --- a/Documentation/admin-guide/serial-console.rst
> +++ b/Documentation/admin-guide/serial-console.rst
> @@ -32,15 +32,25 @@ The format of this option is::
> and F is flow control ('r' for RTS). Default is
> 9600n8. The maximum baudrate is 115200.
>
> + One can also specify the per-console loglevel for this
> + console by providing a loglevel parameter, for example
> + "loglevel:4" to set this console's loglevel to 4. The
> + value provided can be from 0 (LOGLEVEL_EMERG) to 8
> + (LOGLEVEL_DEBUG + 1), and messages below that will be
> + emitted onto the console as they become available.
> +
> You can specify multiple console= options on the kernel command line.
> Output will appear on all of them. The last device will be used when
> you open ``/dev/console``. So, for example::
>
> - console=ttyS1,9600 console=tty0
> + console=ttyS1,9600,loglevel:5 console=tty0
>
> defines that opening ``/dev/console`` will get you the current foreground
> -virtual console, and kernel messages will appear on both the VGA
> -console and the 2nd serial port (ttyS1 or COM2) at 9600 baud.
> +virtual console, and kernel messages will appear on both the VGA console and
> +the 2nd serial port (ttyS1 or COM2) at 9600 baud. The optional loglevel "5"
> +indicates that this console will emit messages more serious than
> +LOGLEVEL_NOTICE (that is, LOGLEVEL_WARNING and below, since more serious
> +messages have lower ordering).
>
> Note that you can only define one console per device type (serial, video).
>
> @@ -113,3 +123,4 @@ Replace the sample values as needed.
> the integration of these patches into m68k, ppc and alpha.
>
> Miquel van Smoorenburg <miquels@xxxxxxxxxx>, 11-Jun-2000
> +Chris Down <chris@xxxxxxxxxxxxxx>, 27-April-2023
> diff --git a/Documentation/core-api/printk-basics.rst b/Documentation/core-api/printk-basics.rst
> index 2dde24ca7d9f..bfad359505bb 100644
> --- a/Documentation/core-api/printk-basics.rst
> +++ b/Documentation/core-api/printk-basics.rst
> @@ -54,32 +54,33 @@ string, the log level is not a separate argument). The available log levels are:
>
> The log level specifies the importance of a message. The kernel decides whether
> to show the message immediately (printing it to the current console) depending
> -on its log level and the current *console_loglevel* (a kernel variable). If the
> -message priority is higher (lower log level value) than the *console_loglevel*
> -the message will be printed to the console.
> +on its log level and the current global *console_loglevel* or local per-console
> +loglevel (kernel variables). If the message priority is higher (lower log level
> +value) than the effective loglevel the message will be printed to the console.
>
> If the log level is omitted, the message is printed with ``KERN_DEFAULT``
> level.
>
> -You can check the current *console_loglevel* with::
> +You can check the current console's loglevel -- for example if you want to
> +check the loglevel for serial consoles:
>
> - $ cat /proc/sys/kernel/printk
> - 4 4 1 7
> + $ cat /sys/class/console/ttyS0/effective_loglevel
> + 6
> + $ cat /sys/class/console/ttyS0/effective_loglevel_source
> + local
>
> -The result shows the *current*, *default*, *minimum* and *boot-time-default* log
> -levels.
> +To change the default loglevel for all consoles, simply write the desired level
> +to ``/proc/sys/kernel/console_loglevel``. For example::
>
> -To change the current console_loglevel simply write the desired level to
> -``/proc/sys/kernel/printk``. For example, to print all messages to the console::
> + # echo 5 > /proc/sys/kernel/console_loglevel
>
> - # echo 8 > /proc/sys/kernel/printk
> +This sets the console_loglevel to print KERN_WARNING (4) or more severe
> +messages to console. Consoles with a per-console loglevel set will ignore it
> +unless ``ignore_per_console_loglevel`` is set on the kernel command line or at
> +``/sys/module/printk/parameters/ignore_per_console_loglevel``.
>
> -Another way, using ``dmesg``::
> -
> - # dmesg -n 5
> -
> -sets the console_loglevel to print KERN_WARNING (4) or more severe messages to
> -console. See ``dmesg(1)`` for more information.
> +For more information on per-console loglevels, see
> +Documentation/admin-guide/per-console-loglevel.rst.
>
> As an alternative to printk() you can use the ``pr_*()`` aliases for
> logging. This family of macros embed the log level in the macro names. For
> diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst
> index dd0518e002f6..e2c094a96f16 100644
> --- a/Documentation/networking/netconsole.rst
> +++ b/Documentation/networking/netconsole.rst
> @@ -67,6 +67,23 @@ Built-in netconsole starts immediately after the TCP stack is
> initialized and attempts to bring up the supplied dev at the supplied
> address.
>
> +You can also set a loglevel at boot time on the kernel command line::
> +
> + console=netcon0,loglevel:2
> +
> +This can also be changed at runtime::
> +
> + $ ls -l /sys/class/console/netcon0/
> + total 0
> + lrwxrwxrwx 1 root root 0 May 18 13:28 subsystem -> ../../../../class/console/
> + -r--r--r-- 1 root root 4096 May 18 13:28 effective_loglevel
> + -r--r--r-- 1 root root 4096 May 18 13:28 effective_loglevel_source
> + -r--r--r-- 1 root root 4096 May 18 13:28 enabled
> + -rw-r--r-- 1 root root 4096 May 18 13:28 loglevel
> + -rw-r--r-- 1 root root 4096 May 18 13:28 uevent
> +
> +See Documentation/admin-guide/per-console-loglevel.rst for more information.
> +
> The remote host has several options to receive the kernel messages,
> for example:
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0e64787aace8..74a9a31264d2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16808,6 +16808,9 @@ R: Steven Rostedt <rostedt@xxxxxxxxxxx>
> R: John Ogness <john.ogness@xxxxxxxxxxxxx>
> S: Maintained
> T: git git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git
> +F: Documentation/ABI/testing/sysfs-class-console
> +F: Documentation/admin-guide/per-console-loglevel.rst
> +F: Documentation/core-api/printk-basics.rst
> F: include/linux/printk.h
> F: kernel/printk/
>
> diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
> index b6e70c5cfa17..f62beafd5d49 100644
> --- a/drivers/tty/sysrq.c
> +++ b/drivers/tty/sysrq.c
> @@ -51,6 +51,7 @@
> #include <linux/syscalls.h>
> #include <linux/of.h>
> #include <linux/rcupdate.h>
> +#include <linux/console.h>
>
> #include <asm/ptrace.h>
> #include <asm/irq_regs.h>
> @@ -101,12 +102,26 @@ __setup("sysrq_always_enabled", sysrq_always_enabled_setup);
> static void sysrq_handle_loglevel(int key)
> {
> int i;
> + int cookie;
> + int warned = 0;
> + struct console *con;
>
> i = key - '0';
> console_loglevel = CONSOLE_LOGLEVEL_DEFAULT;
> pr_info("Loglevel set to %d\n", i);
> console_loglevel = i;
> +
> + cookie = console_srcu_read_lock();
> + for_each_console_srcu(con) {
> + if (!warned && per_console_loglevel_is_set(con)) {
> + warned = 1;
> + pr_warn("Overriding per-console loglevel from sysrq\n");
> + }
> + con->level = -1;
> + }
> + console_srcu_read_unlock(cookie);
> }
> +
> static const struct sysrq_key_op sysrq_loglevel_op = {
> .handler = sysrq_handle_loglevel,
> .help_msg = "loglevel(0-9)",
> diff --git a/include/linux/console.h b/include/linux/console.h
> index d3195664baa5..77052f863571 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -17,6 +17,7 @@
> #include <linux/atomic.h>
> #include <linux/bits.h>
> #include <linux/rculist.h>
> +#include <linux/device.h>
> #include <linux/types.h>
>
> struct vc_data;
> @@ -203,6 +204,8 @@ struct console {
> unsigned long dropped;
> void *data;
> struct hlist_node node;
> + int level;
> + struct device *classdev;
> };
>
> #ifdef CONFIG_LOCKDEP
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 8ef499ab3c1e..25ba0873dee8 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -69,6 +69,8 @@ extern int console_printk[];
>
> extern void console_verbose(void);
>
> +int clamp_loglevel(int level);
> +
> /* strlen("ratelimit") + 1 */
> #define DEVKMSG_STR_MAX_SIZE 10
> extern char devkmsg_log_str[];
> @@ -139,6 +141,7 @@ void early_printk(const char *s, ...) { }
> #endif
>
> struct dev_printk_info;
> +struct console;
>
> #ifdef CONFIG_PRINTK
> asmlinkage __printf(4, 0)
> @@ -192,6 +195,8 @@ void show_regs_print_info(const char *log_lvl);
> extern asmlinkage void dump_stack_lvl(const char *log_lvl) __cold;
> extern asmlinkage void dump_stack(void) __cold;
> void printk_trigger_flush(void);
> +
> +bool per_console_loglevel_is_set(const struct console *con);
> #else
> static inline __printf(1, 0)
> int vprintk(const char *s, va_list args)
> @@ -271,6 +276,10 @@ static inline void dump_stack(void)
> static inline void printk_trigger_flush(void)
> {
> }
> +static inline bool per_console_loglevel_is_set(const struct console *con)
> +{
> + return false;
> +}
> #endif
>
> #ifdef CONFIG_SMP
> diff --git a/kernel/printk/console_cmdline.h b/kernel/printk/console_cmdline.h
> index 3ca74ad391d6..40f1a1ff0965 100644
> --- a/kernel/printk/console_cmdline.h
> +++ b/kernel/printk/console_cmdline.h
> @@ -6,6 +6,8 @@ struct console_cmdline
> {
> char name[16]; /* Name of the driver */
> int index; /* Minor dev. to use */
> + int level; /* Log level to use */
> + short flags; /* Initial flags */
> bool user_specified; /* Specified by command line vs. platform */
> char *options; /* Options for the driver */
> #ifdef CONFIG_A11Y_BRAILLE_CONSOLE
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 06f16a5f1516..dbc686dfe15e 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -44,6 +44,7 @@
> #include <linux/irq_work.h>
> #include <linux/ctype.h>
> #include <linux/uio.h>
> +#include <linux/device.h>
> #include <linux/sched/clock.h>
> #include <linux/sched/debug.h>
> #include <linux/sched/task_stack.h>
> @@ -199,6 +200,12 @@ static int __init control_devkmsg(char *str)
> }
> __setup("printk.devkmsg=", control_devkmsg);
>
> +int clamp_loglevel(int level)
> +{
> + return clamp(level, minimum_console_loglevel,
> + CONSOLE_LOGLEVEL_MOTORMOUTH);
> +}
> +
> char devkmsg_log_str[DEVKMSG_STR_MAX_SIZE] = "ratelimit";
> #if defined(CONFIG_PRINTK) && defined(CONFIG_SYSCTL)
> int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
> @@ -466,6 +473,14 @@ static struct latched_seq clear_seq = {
> .val[1] = 0,
> };
>
> +static struct class *console_class;
> +
> +enum loglevel_source {
> + LLS_GLOBAL,
> + LLS_LOCAL,
> + LLS_IGNORE_LOGLEVEL,
> +};
> +
> #define LOG_LEVEL(v) ((v) & 0x07)
> #define LOG_FACILITY(v) ((v) >> 3 & 0xff)
>
> @@ -696,7 +711,8 @@ static ssize_t msg_print_ext_body(char *buf, size_t size,
> return len;
> }
>
> -static bool printk_get_next_message(struct printk_message *pmsg, u64 seq,
> +static bool printk_get_next_message(struct printk_message *pmsg,
> + struct console *con, u64 seq,
> bool is_extended, bool may_supress);
>
> /* /dev/kmsg - userspace message inject/listen interface */
> @@ -799,7 +815,8 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
> if (ret)
> return ret;
>
> - if (!printk_get_next_message(&pmsg, atomic64_read(&user->seq), true, false)) {
> + if (!printk_get_next_message(&pmsg, NULL, atomic64_read(&user->seq),
> + true, false)) {
> if (file->f_flags & O_NONBLOCK) {
> ret = -EAGAIN;
> goto out;
> @@ -816,8 +833,8 @@ static ssize_t devkmsg_read(struct file *file, char __user *buf,
> * This pairs with __wake_up_klogd:A.
> */
> ret = wait_event_interruptible(log_wait,
> - printk_get_next_message(&pmsg, atomic64_read(&user->seq), true,
> - false)); /* LMM(devkmsg_read:A) */
> + printk_get_next_message(&pmsg, NULL, atomic64_read(&user->seq),
> + true, false)); /* LMM(devkmsg_read:A) */
> if (ret)
> goto out;
> }
> @@ -1261,9 +1278,119 @@ module_param(ignore_loglevel, bool, S_IRUGO | S_IWUSR);
> MODULE_PARM_DESC(ignore_loglevel,
> "ignore loglevel setting (prints all kernel messages to the console)");
>
> -static bool suppress_message_printing(int level)
> +static bool __read_mostly ignore_per_console_loglevel;
> +
> +static int __init ignore_per_console_loglevel_setup(char *str)
> +{
> + ignore_per_console_loglevel = true;
> + return 0;
> +}
> +
> +early_param("ignore_per_console_loglevel", ignore_per_console_loglevel_setup);
> +module_param(ignore_per_console_loglevel, bool, 0644);
> +MODULE_PARM_DESC(
> + ignore_per_console_loglevel,
> + "ignore per-console loglevel setting (only respect global console loglevel)");
> +
> +bool per_console_loglevel_is_set(const struct console *con)
> +{
> + return !ignore_per_console_loglevel && con && (con->level > 0);
> +}
> +
> +/*
> + * Hierarchy of loglevel authority:
> + *
> + * 1. con->level. The locally set, console-specific loglevel. Optional, only
> + * valid if >0.
> + * 2. console_loglevel. The default global console loglevel, always present.
> + *
> + * The behaviour can be further changed by the following printk module
> + * parameters:
> + *
> + * 1. ignore_loglevel. Can be set at boot or at runtime with
> + * /sys/module/printk/parameters/ignore_loglevel. Overrides absolutely
> + * everything since it's used to debug.
> + * 2. ignore_per_console_loglevel. Existing per-console loglevel values are left
> + * intact, but are ignored in favour of console_loglevel as long as this is
> + * true. Also manipulated through syslog(SYSLOG_ACTION_CONSOLE_{ON,OFF}).
> + */
> +static enum loglevel_source
> +console_effective_loglevel_source(const struct console *con)
> +{
> + if (WARN_ON_ONCE(!con))
> + return LLS_GLOBAL;
> +
> + if (ignore_loglevel)
> + return LLS_IGNORE_LOGLEVEL;
> +
> + if (per_console_loglevel_is_set(con))
> + return LLS_LOCAL;
> +
> + return LLS_GLOBAL;
> +}
> +
> +static int console_effective_loglevel(const struct console *con)
> {
> - return (level >= console_loglevel && !ignore_loglevel);
> + enum loglevel_source source;
> + int level;
> +
> + if (WARN_ON_ONCE(!con))
> + return default_console_loglevel;
> +
> + source = console_effective_loglevel_source(con);
> +
> + switch (source) {
> + case LLS_IGNORE_LOGLEVEL:
> + level = CONSOLE_LOGLEVEL_MOTORMOUTH;
> + break;
> + case LLS_LOCAL:
> + level = con->level;
> + break;
> + case LLS_GLOBAL:
> + level = console_loglevel;
> + break;
> + default:
> + pr_warn("Unhandled console loglevel source: %d", source);
> + level = default_console_loglevel;
> + break;
> + }
> +
> + return level;
> +}
> +
> +static const char *
> +console_effective_loglevel_source_str(const struct console *con)
> +{
> + enum loglevel_source source;
> + const char *str;
> +
> + if (WARN_ON_ONCE(!con))
> + return "unknown";
> +
> + source = console_effective_loglevel_source(con);
> +
> + switch (source) {
> + case LLS_IGNORE_LOGLEVEL:
> + str = "ignore_loglevel";
> + break;
> + case LLS_LOCAL:
> + str = "local";
> + break;
> + case LLS_GLOBAL:
> + str = "global";
> + break;
> + default:
> + pr_warn("Unhandled console loglevel source: %d", source);
> + str = "unknown";
> + break;
> + }
> +
> + return str;
> +}
> +
> +static bool suppress_message_printing(int level, struct console *con)
> +{
> + return level >= console_effective_loglevel(con);
> }
>
> #ifdef CONFIG_BOOT_PRINTK_DELAY
> @@ -1718,6 +1845,7 @@ int do_syslog(int type, char __user *buf, int len, int source)
> struct printk_info info;
> bool clear = false;
> static int saved_console_loglevel = LOGLEVEL_DEFAULT;
> + static int saved_ignore_per_console_loglevel;
> int error;
>
> error = check_syslog_permissions(type, source);
> @@ -1758,19 +1886,28 @@ int do_syslog(int type, char __user *buf, int len, int source)
> break;
> /* Disable logging to console */
> case SYSLOG_ACTION_CONSOLE_OFF:
> - if (saved_console_loglevel == LOGLEVEL_DEFAULT)
> + if (saved_console_loglevel == LOGLEVEL_DEFAULT) {
> saved_console_loglevel = console_loglevel;
> + saved_ignore_per_console_loglevel =
> + ignore_per_console_loglevel;
> + }
> console_loglevel = minimum_console_loglevel;
> + ignore_per_console_loglevel = true;
> break;
> /* Enable logging to console */
> case SYSLOG_ACTION_CONSOLE_ON:
> if (saved_console_loglevel != LOGLEVEL_DEFAULT) {
> console_loglevel = saved_console_loglevel;
> + ignore_per_console_loglevel =
> + saved_ignore_per_console_loglevel;
> saved_console_loglevel = LOGLEVEL_DEFAULT;
> }
> break;
> /* Set level of messages printed to console */
> case SYSLOG_ACTION_CONSOLE_LEVEL:
> + if (!ignore_per_console_loglevel)
> + pr_warn_once(
> + "SYSLOG_ACTION_CONSOLE_LEVEL is ignored by consoles with an explicitly set per-console loglevel, see Documentation/admin-guide/per-console-loglevel\n");
> if (len < 1 || len > 8)
> return -EINVAL;
> if (len < minimum_console_loglevel)
> @@ -2060,11 +2197,8 @@ static u8 *__printk_recursion_counter(void)
>
> int printk_delay_msec __read_mostly;
>
> -static inline void printk_delay(int level)
> +static inline void printk_delay(void)
> {
> - if (suppress_message_printing(level))
> - return;
> -
> boot_delay_msec();
>
> if (unlikely(printk_delay_msec)) {
> @@ -2278,7 +2412,9 @@ asmlinkage int vprintk_emit(int facility, int level,
> const char *fmt, va_list args)
> {
> int printed_len;
> + int cookie;
> bool in_sched = false;
> + struct console *con;
>
> /* Suppress unimportant messages after panic happens */
> if (unlikely(suppress_printk))
> @@ -2293,7 +2429,14 @@ asmlinkage int vprintk_emit(int facility, int level,
> in_sched = true;
> }
>
> - printk_delay(level);
> + cookie = console_srcu_read_lock();
> + for_each_console_srcu(con) {
> + if (!suppress_message_printing(level, con)) {
> + printk_delay();
> + break;
> + }
> + }
> + console_srcu_read_unlock(cookie);
>
> printed_len = vprintk_store(facility, level, dev_info, fmt, args);
>
> @@ -2370,7 +2513,10 @@ static ssize_t msg_print_ext_body(char *buf, size_t size,
> struct dev_printk_info *dev_info) { return 0; }
> static void console_lock_spinning_enable(void) { }
> static int console_lock_spinning_disable_and_check(int cookie) { return 0; }
> -static bool suppress_message_printing(int level) { return false; }
> +static bool suppress_message_printing(int level, struct console *con)
> +{
> + return false;
> +}
> static bool pr_flush(int timeout_ms, bool reset_on_progress) { return true; }
> static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress) { return true; }
>
> @@ -2410,11 +2556,78 @@ static void set_user_specified(struct console_cmdline *c, bool user_specified)
> console_set_on_cmdline = 1;
> }
>
> +static bool find_and_remove_console_option(char *options, const char *key,
> + char *val_buf, size_t val_buf_size)
> +{
> + bool found = false, first = true;
> + char *option, *next = options;
> +
> + while ((option = strsep(&next, ","))) {
> + char *value;
> +
> + value = strchr(option, ':');
> + if (value)
> + *(value++) = '\0';
> +
> + if (strcmp(option, key) == 0) {
> + found = true;
> + if (value) {
> + if (strlen(value) > val_buf_size - 1) {
> + pr_warn("Can't copy console option value for %s:%s: not enough space (%zu)\n",
> + option, value, val_buf_size);
> + found = false;
> + } else {
> + strscpy(val_buf, value, val_buf_size);
> + }
> + } else
> + *val_buf = '\0';
> + }
> +
> + if (found)
> + break;
> +
> + if (next)
> + *(next - 1) = ',';
> + if (value)
> + *(value - 1) = ':';
> +
> + first = false;
> + }
> +
> + if (found) {
> + if (next)
> + memmove(option, next, strlen(next) + 1);
> + else if (first)
> + *option = '\0';
> + else
> + *--option = '\0';
> + }
> +
> + return found;
> +}
> +
> +static int find_and_remove_loglevel_option(char *options)
> +{
> + char val[3];
> + int loglevel;
> +
> + if (!find_and_remove_console_option(options, "loglevel", val,
> + sizeof(val)))
> + return -ENOENT;
> +
> + if (kstrtoint(val, 10, &loglevel)) {
> + pr_warn("Invalid console loglevel, ignoring: %s\n", val);
> + return -EINVAL;
> + }
> +
> + return clamp_loglevel(loglevel);
> +}
> +
> static int __add_preferred_console(char *name, int idx, char *options,
> char *brl_options, bool user_specified)
> {
> struct console_cmdline *c;
> - int i;
> + int i, ret;
>
> /*
> * See if this tty is not yet registered, and
> @@ -2435,6 +2648,11 @@ static int __add_preferred_console(char *name, int idx, char *options,
> if (!brl_options)
> preferred_console = i;
> strscpy(c->name, name, sizeof(c->name));
> +
> + ret = find_and_remove_loglevel_option(options);
> + if (ret >= 0)
> + c->level = ret;
> +
> c->options = options;
> set_user_specified(c, user_specified);
> braille_set_options(c, brl_options);
> @@ -2764,7 +2982,8 @@ static void console_prepend_dropped(struct printk_message *pmsg, unsigned long d
> * of @pmsg are valid. (See the documentation of struct printk_message
> * for information about the @pmsg fields.)
> */
> -static bool printk_get_next_message(struct printk_message *pmsg, u64 seq,
> +static bool printk_get_next_message(struct printk_message *pmsg,
> + struct console *con, u64 seq,
> bool is_extended, bool may_suppress)
> {
> static int panic_console_dropped;
> @@ -2808,7 +3027,7 @@ static bool printk_get_next_message(struct printk_message *pmsg, u64 seq,
> }
>
> /* Skip record that has level above the console loglevel. */
> - if (may_suppress && suppress_message_printing(r.info->level))
> + if (may_suppress && suppress_message_printing(r.info->level, con))
> goto out;
>
> if (is_extended) {
> @@ -2851,7 +3070,7 @@ static bool console_emit_next_record(struct console *con, bool *handover, int co
>
> *handover = false;
>
> - if (!printk_get_next_message(&pmsg, con->seq, is_extended, true))
> + if (!printk_get_next_message(&pmsg, con, con->seq, is_extended, true))
> return false;
>
> con->dropped += pmsg.dropped;
> @@ -3199,6 +3418,144 @@ static int __init keep_bootcon_setup(char *str)
>
> early_param("keep_bootcon", keep_bootcon_setup);
>
> +#ifdef CONFIG_PRINTK
> +static ssize_t loglevel_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct console *con = dev_get_drvdata(dev);
> +
> + return sysfs_emit(buf, "%d\n", con->level);
> +}
> +
> +static ssize_t loglevel_store(struct device *dev, struct device_attribute *attr,
> + const char *buf, size_t size)
> +{
> + struct console *con = dev_get_drvdata(dev);
> + ssize_t ret;
> + int level;
> +
> + ret = kstrtoint(buf, 10, &level);
> + if (ret < 0)
> + return ret;
> +
> + if (level == -1) {
> + con->level = level;
> + return size;
> + }
> +
> + if (clamp_loglevel(level) != level)
> + return -ERANGE;
> +
> + con->level = level;
> +
> + return size;
> +}
> +
> +static DEVICE_ATTR_RW(loglevel);
> +
> +static ssize_t effective_loglevel_source_show(struct device *dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct console *con = dev_get_drvdata(dev);
> +
> + return sysfs_emit(buf, "%s\n",
> + console_effective_loglevel_source_str(con));
> +}
> +
> +static DEVICE_ATTR_RO(effective_loglevel_source);
> +
> +static ssize_t effective_loglevel_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct console *con = dev_get_drvdata(dev);
> +
> + return sysfs_emit(buf, "%d\n", console_effective_loglevel(con));
> +}
> +
> +static DEVICE_ATTR_RO(effective_loglevel);
> +
> +static ssize_t enabled_show(struct device *dev, struct device_attribute *attr,
> + char *buf)
> +{
> + struct console *con = dev_get_drvdata(dev);
> +
> + return sysfs_emit(buf, "%d\n", !!(con->flags & CON_ENABLED));
> +}
> +
> +static DEVICE_ATTR_RO(enabled);
> +
> +static struct attribute *console_sysfs_attrs[] = {
> + &dev_attr_loglevel.attr,
> + &dev_attr_effective_loglevel_source.attr,
> + &dev_attr_effective_loglevel.attr,
> + &dev_attr_enabled.attr,
> + NULL,
> +};
> +
> +ATTRIBUTE_GROUPS(console_sysfs);
> +
> +static void console_classdev_release(struct device *dev)
> +{
> + kfree(dev);
> +}
> +
> +static void console_register_device(struct console *con)
> +{
> + /*
> + * We might be called from register_console() before the class is
> + * registered. If that happens, we'll take care of it in
> + * printk_late_init.
> + */
> + if (IS_ERR_OR_NULL(console_class))
> + return;
> +
> + if (WARN_ON(con->classdev))
> + return;
> +
> + con->classdev = kzalloc(sizeof(struct device), GFP_KERNEL);
> + if (!con->classdev)
> + return;
> +
> + device_initialize(con->classdev);
> + dev_set_name(con->classdev, "%s%d", con->name, con->index);
> + dev_set_drvdata(con->classdev, con);
> + con->classdev->release = console_classdev_release;
> + con->classdev->class = console_class;
> + if (device_add(con->classdev))
> + put_device(con->classdev);
> +}
> +
> +static void console_setup_class(void)
> +{
> + struct console *con;
> + int cookie;
> +
> + /*
> + * printk exists for the lifetime of the kernel, it cannot be unloaded,
> + * so we should never end up back in here.
> + */
> + if (WARN_ON(console_class))
> + return;
> +
> + console_class = class_create(THIS_MODULE, "console");
> + if (!IS_ERR(console_class))
> + console_class->dev_groups = console_sysfs_groups;
> +
> + cookie = console_srcu_read_lock();
> + for_each_console_srcu(con)
> + console_register_device(con);
> + console_srcu_read_unlock(cookie);
> +}
> +#else /* CONFIG_PRINTK */
> +static void console_register_device(struct console *new)
> +{
> +}
> +static void console_setup_class(void)
> +{
> +}
> +#endif
> +
> /*
> * This is called by register_console() to try to match
> * the newly registered console with any of the ones selected
> @@ -3231,6 +3588,14 @@ static int try_enable_preferred_console(struct console *newcon,
> if (newcon->index < 0)
> newcon->index = c->index;
>
> + if (c->level > 0)
> + newcon->level = c->level;
> + else
> + newcon->level = -1;
> +
> + newcon->flags |= c->flags;
> + newcon->classdev = NULL;
> +
> if (_braille_register_console(newcon, c))
> return 0;
>
> @@ -3456,6 +3821,7 @@ void register_console(struct console *newcon)
> * register_console() completes.
> */
>
> + console_register_device(newcon);
> console_sysfs_notify();
>
> /*
> @@ -3523,6 +3889,9 @@ static int unregister_console_locked(struct console *console)
> */
> synchronize_srcu(&console_srcu);
>
> + if (console->classdev)
> + device_unregister(console->classdev);
> +
> console_sysfs_notify();
>
> if (console->exit)
> @@ -3627,6 +3996,10 @@ void __init console_init(void)
> * To mitigate this problem somewhat, only unregister consoles whose memory
> * intersects with the init section. Note that all other boot consoles will
> * get unregistered when the real preferred console is registered.
> + *
> + * Early consoles will also have been registered before we had the
> + * infrastructure to put them into /sys/class/console, so make sure they get
> + * set up now that we're ready.
> */
> static int __init printk_late_init(void)
> {
> @@ -3664,6 +4037,9 @@ static int __init printk_late_init(void)
> console_cpu_notify, NULL);
> WARN_ON(ret < 0);
> printk_sysctl_init();
> +
> + console_setup_class();
> +
> return 0;
> }
> late_initcall(printk_late_init);
> diff --git a/kernel/printk/sysctl.c b/kernel/printk/sysctl.c
> index c228343eeb97..688e1ebb7066 100644
> --- a/kernel/printk/sysctl.c
> +++ b/kernel/printk/sysctl.c
> @@ -7,10 +7,14 @@
> #include <linux/printk.h>
> #include <linux/capability.h>
> #include <linux/ratelimit.h>
> +#include <linux/console.h>
> #include "internal.h"
>
> static const int ten_thousand = 10000;
>
> +static int min_loglevel = LOGLEVEL_EMERG;
> +static int max_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;
> +
> static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
> void *buffer, size_t *lenp, loff_t *ppos)
> {
> @@ -20,13 +24,48 @@ static int proc_dointvec_minmax_sysadmin(struct ctl_table *table, int write,
> return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
> }
>
> +static int printk_sysctl_deprecated(struct ctl_table *table, int write,
> + void *buffer, size_t *lenp, loff_t *ppos)
> +{
> + int res = proc_dointvec(table, write, buffer, lenp, ppos);
> +
> + if (write)
> + pr_warn_once(
> + "printk: The kernel.printk sysctl is deprecated. Consider using kernel.console_loglevel or kernel.default_message_loglevel instead.\n");
> +
> + return res;
> +}
> +
> +static int printk_console_loglevel(struct ctl_table *table, int write,
> + void *buffer, size_t *lenp, loff_t *ppos)
> +{
> + struct ctl_table ltable = *table;
> + int ret, level;
> +
> + if (!write)
> + return proc_dointvec(table, write, buffer, lenp, ppos);
> +
> + ltable.data = &level;
> +
> + ret = proc_dointvec(&ltable, write, buffer, lenp, ppos);
> + if (ret)
> + return ret;
> +
> + if (level != -1 && level != clamp_loglevel(level))
> + return -ERANGE;
> +
> + console_loglevel = level;
> +
> + return 0;
> +}
> +
> static struct ctl_table printk_sysctls[] = {
> {
> .procname = "printk",
> .data = &console_loglevel,
> .maxlen = 4*sizeof(int),
> .mode = 0644,
> - .proc_handler = proc_dointvec,
> + .proc_handler = printk_sysctl_deprecated,
> },
> {
> .procname = "printk_ratelimit",
> @@ -76,6 +115,22 @@ static struct ctl_table printk_sysctls[] = {
> .extra1 = SYSCTL_ZERO,
> .extra2 = SYSCTL_TWO,
> },
> + {
> + .procname = "console_loglevel",
> + .data = &console_loglevel,
> + .maxlen = sizeof(int),
> + .mode = 0644,
> + .proc_handler = printk_console_loglevel,
> + },
> + {
> + .procname = "default_message_loglevel",
> + .data = &default_message_loglevel,
> + .maxlen = sizeof(int),
> + .mode = 0644,
> + .proc_handler = proc_dointvec_minmax,
> + .extra1 = &min_loglevel,
> + .extra2 = &max_loglevel,
> + },
> {}
> };
>
> --
> 2.40.0