Re: [PATCH] init: Don't proxy console= to earlycon

From: Petr Mladek
Date: Fri Jul 14 2023 - 14:35:32 EST


First, I am sorry I sent the first mail too early by mistake.
(Friday evening effect).

On Fri 2023-07-14 11:21:09, Raul Rangel wrote:
> On Fri, Jul 14, 2023 at 10:38 AM Petr Mladek <pmladek@xxxxxxxx> wrote:
> >
> > On Mon 2023-07-10 09:30:19, Raul Rangel wrote:
> > > On Sun, Jul 9, 2023 at 8:43 PM Randy Dunlap <rdunlap@xxxxxxxxxxxxx> wrote:
> > > >
> > > >
> > > >
> > > > On 7/9/23 18:15, Mario Limonciello wrote:
> > > > > On 7/9/23 18:46, Randy Dunlap wrote:
> > > > >>
> > > > >>
> > > > >> On 7/7/23 18:17, Raul E Rangel wrote:
> > > > >>> Right now we are proxying the `console=XXX` command line args to the
> > > > >>> param_setup_earlycon. This is done because the following are
> > > > >>> equivalent:
> > > > >>>
> > > > >>> console=uart[8250],mmio,<addr>[,options]
> > > > >>> earlycon=uart[8250],mmio,<addr>[,options]
> > > > >>>
> > > > >>> In addition, when `earlycon=` or just `earlycon` is specified on the
> > > > >>> command line, we look at the SPCR table or the DT to extract the device
> > > > >>> options.
> > > > >>>
> > > > >>> When `console=` is specified on the command line, it's intention is to
> > > > >>> disable the console. Right now since we are proxying the `console=`
> > > > >>
> > > > >> How do you figure this (its intention is to disable the console)?
> > > > >
> > >
> > > https://www.kernel.org/doc/html/v6.1/admin-guide/kernel-parameters.html
> > > says the following:
> > > console=
> > > { null | "" }
> > > Use to disable console output, i.e., to have kernel
> > > console messages discarded.
> > > This must be the only console= parameter used on the
> > > kernel command line.
> > >
> > > earlycon= [KNL] Output early console device and options.
> > >
> > > When used with no options, the early console is
> > > determined by stdout-path property in device tree's
> > > chosen node or the ACPI SPCR table if supported by
> > > the platform.
> >
> > Sigh, I wasn't aware of this when we discussed the console= handling.
>
> It took a bit of digging to figure out what the actual intention was :)
>
> >
> > > The reason this bug showed up is that ChromeOS has set `console=` for a
> > > very long time:
> > > https://chromium.googlesource.com/chromiumos/platform/crosutils/+/main/build_kernel_image.sh#282
> > > I'm not sure on the exact history, but AFAIK, we don't have the ttyX devices.
> > >
> > > Coreboot recently added support for the ACPI SPCR table which in
> > > combination with the
> > > `console=` arg, we are now seeing earlycon enabled when it shouldn't be.
> >
> > But this happens only when both "earlycon" and "console=" parameters
> > are used together. Do I get it correctly?
>
> The bug shows up when an SPCR table is present and the `console=`
> parameter is set. No need to specify `earlycon` on the command line.

Strange, see below.

> > This combination is ambiguous on its own. Why would anyone add
> > "earlycon" parameter and wanted to keep it disabled?
>
> This is not the case I'm hitting. I'm honestly not sure what the
> behavior should be in the `earlycon console=` case?
>
> >
> > > > >>> diff --git a/init/main.c b/init/main.c
> > > > >>> index aa21add5f7c54..f72bf644910c1 100644
> > > > >>> --- a/init/main.c
> > > > >>> +++ b/init/main.c
> > > > >>> @@ -738,8 +738,7 @@ static int __init do_early_param(char *param, char *val,
> > > > >>> for (p = __setup_start; p < __setup_end; p++) {
> > > > >>> if ((p->early && parameq(param, p->str)) ||
> > > > >>> (strcmp(param, "console") == 0 &&
> > > > >>> - strcmp(p->str, "earlycon") == 0)
> > > > >>> - ) {
> > > > >>> + strcmp(p->str, "earlycon") == 0 && val && val[0])) {
> > > > >>> if (p->setup_func(val) != 0)
> > > > >>> pr_warn("Malformed early option '%s'\n", param);
> > > > >>> }

My understanding is that this code in do_early_param() allows calling
param_setup_earlycon() with the @val defined via console=val.
It reduces cut&paste on the kernel command line.

It should never enable an early console when "earlycon" is not defined
on the command line. Otherwise, console=uart[8250],mmio,<addr>[,options]
would always enable earlycon as well.

If "earlycon" is not defined on the command line then
we should never call param_setup_earlycon() in the first place.

Or the behavior is even more crazy than I thought.
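
To make the matching rule concrete, here is a minimal user-space sketch of
the condition in the quoted hunk. It is a simplified model, not the kernel
code: early_handler_matches() and its arguments are hypothetical names,
strcmp() stands in for parameq(), and the "patched" flag only selects
between the old condition and the one proposed in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified model of the match in do_early_param(); not the kernel code.
 * handler_name stands in for p->str and handler_is_early for p->early.
 * All names here are hypothetical and exist only for illustration.
 */
static bool early_handler_matches(const char *param, const char *val,
                                  const char *handler_name,
                                  bool handler_is_early, bool patched)
{
	/* Regular case: an early handler registered under this name. */
	if (handler_is_early && strcmp(param, handler_name) == 0)
		return true;

	/* Proxy case: "console=..." is also handed to the "earlycon" handler. */
	if (strcmp(param, "console") == 0 &&
	    strcmp(handler_name, "earlycon") == 0) {
		/* The proposed patch skips the proxy for an empty value. */
		if (patched)
			return val != NULL && val[0] != '\0';
		return true;
	}

	return false;
}
```

With the old condition, an empty "console=" still reaches the earlycon
handler, and it does so even when "earlycon" itself is not on the command
line. The patched condition only filters out the empty value.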

> >
> > + "console" enables the default console which might be overridden
> > by ACPI SPCR and devicetree
>
> That's what this patch fixes. You need to specify `earlycon` in order
> to get the ACPI SPCR or DT console.

It sounds strange. earlycon is needed only for debugging, while
ACPI SPCR or DT should define the console preferred by the platform.

There are three levels of preference:

+ console= parameter defines the console preferred by the user.
It overrides everything.

+ ACPI SPCR or DT should define the console preferred by the
platform. It will be used when there is no user preference.

+ Kernel registers the first initialized console with tty driver
when there is no console preferred by the user, ACPI SPCR, or DT.
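
The three levels can be sketched as a simple fallback chain. This is only
an illustration of the ordering described above, not how the printk code
is structured: pick_console() and its arguments are hypothetical, and a
NULL pointer means "no preference from this source".

```c
#include <stddef.h>

/*
 * Hypothetical helper illustrating the order of preference only.
 * Each argument is a console name, or NULL when that source has no
 * preference. An empty user string (from "console=") still counts as
 * a user preference and therefore wins.
 */
static const char *pick_console(const char *user_pref,
                                const char *platform_pref,
                                const char *first_registered)
{
	if (user_pref != NULL)		/* console= on the command line */
		return user_pref;
	if (platform_pref != NULL)	/* ACPI SPCR or devicetree */
		return platform_pref;
	return first_registered;	/* first console with a tty driver */
}
```

Note that none of the three levels involves the early console. In this
model it stays a separate, opt-in debugging aid.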

As I said, I would expect that early console is enabled only when
earlycon parameter is defined on the command line.

In any case, it seems that acpi_parse_spcr() and of_console_check()
call add_preferred_console() even when earlycon is not defined
on the command line.

> I don't see the `console` (without the =) documented:
> https://www.kernel.org/doc/html/v6.1/admin-guide/kernel-parameters.html.
> I'm guessing this is an undocumented "feature" that snuck in while the
> `earlycon` stuff was being added.

Honestly, I do not see where the "console" without '=' is handled.
console_setup() does not check if the @str parameter is NULL.


Anyway, the behavior is already complicated. But it might still
make some sense when:

+ "earlycon" parameter would try to call param_setup_earlycon()
with @val from "console=val" parameter. It reduces cut&paste.

+ "console=" causes the "ttynull" driver to be preferred, which might
result in no console driver being registered at all. [*]

But it seems to be yet another level of craziness when "console" or
"console=" would affect whether the early console gets
defined via ACPI SPCR or not.

I believe that this patch solves the problem. But it looks
like a workaround which makes the logic even more tricky/hacky.


IMHO, the right fix is to make sure that param_setup_earlycon()
gets called only when "earlycon" is defined on the command line.
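
A rough user-space sketch of that rule, only to show the intended
condition. It is not a proposed patch: cmdline_has_param() and
should_setup_earlycon() are hypothetical helpers, and the tokenizer
splits on spaces only, ignoring the quoting that the real command line
parser handles.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true when "name" or "name=..." appears as a
 * space-separated token in cmdline. Quoting is deliberately ignored.
 */
static bool cmdline_has_param(const char *cmdline, const char *name)
{
	size_t name_len = strlen(name);
	const char *p = cmdline;

	while (*p) {
		size_t tok_len;

		while (*p == ' ')
			p++;
		tok_len = strcspn(p, " ");
		if (tok_len >= name_len &&
		    strncmp(p, name, name_len) == 0 &&
		    (tok_len == name_len || p[name_len] == '='))
			return true;
		p += tok_len;
	}
	return false;
}

/*
 * Hypothetical condition: the earlycon setup is reached for "earlycon"
 * itself, and for "console" only when "earlycon" is also present.
 */
static bool should_setup_earlycon(const char *cmdline, const char *param)
{
	if (strcmp(param, "earlycon") == 0)
		return true;
	return strcmp(param, "console") == 0 &&
	       cmdline_has_param(cmdline, "earlycon");
}
```

Under this condition a command line with "console=" and no "earlycon"
never reaches the earlycon setup, regardless of whether an SPCR table
is present.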

Best Regards,
Petr