Re: Patch 0727d35de ("Make initramfs honor CONFIG_DEVTMPFS_MOUNT") breaks boot

From: Rob Landley
Date: Tue May 23 2017 - 13:40:21 EST


On 05/23/2017 03:01 AM, Yury Norov wrote:
> On Mon, May 22, 2017 at 09:07:54PM -0500, Rob Landley wrote:
>> Your userspace mounted a tmpfs over /dev when it couldn't mount a second
>> identical instance of devtmpfs over itself. If you had a static /dev in
>> initramfs but didn't configure _in_ devtmpfs to your kernel, your broken
>> error path would have taken that out too with a pointless tmpfs mount.
>
> CONFIG_DEVTMPFS_MOUNT is enabled on my machine, so I think your
> suggestion is correct. But I didn't do that specifically - I run
> almost default kernel based on Ubuntu 14.04 config and environment.

I.E. ubuntu has a bug: they enabled CONFIG_DEVTMPFS_MOUNT and then
launchd an initramfs instead (which didn't do the automount they
requested so why request it), but if CONFIG_DEVTMPFS_MOUNT actually
starts working in initramfs they have an insane error path that breaks
the system, and does nothing _except_ break the system.

> Grepping the kernel code shows that arc, arm, arm64, m86k, metag,
> mips, nios2, openrisc, parisc, powerpc, sh, tile, um, x86 and xetensa
> enable it by default.

Most of which Ubuntu doesn't support, so none of them could trigger the
broken error path in ubuntu's init script.

Wait, are you saying you're doing a "make defconfig" on x86-64 and
booting ubuntu from the result? (Or is this arm?) Is _that_ the config
you still haven't specified in this conversation? I thought you were
using the /boot/config-4.4.0-78-generic and friends ubuntu installs.
(Which yes, also switch this symbol on.)

I can add a "default n" line to drivers/base/Kconfig if "make defconfig"
is what you're building from. (This symbol never specified a default in
the first place, so I dunno which way it falls, but it's repeated in a
gazillion defconfig files and not present in others... meaning I still
dunno which way the default goes. When I do a "make defconfig" it uses
arch/x86/configs/x86_64_defconfig because the kernel has multiple
codepaths to accomplish the same thing. I'm not sure the built-in
"default y" lines are used at all anymore? What a mess...)

But again, I'm just guessing what config you're using because you still
haven't _said_. I'm still trying to guess what you're doing when you hit
Ubuntu's bug.

> So it means for me that (at least) users why run
> Ubuntu 14.04 will have bricked system one day after updating the
> kernel.

Unless when they build their new kernel they open up menuconfig and
switch this symbol off. Which can't be done because...?

Or you could add the devtmpfs.mount=0 argument to your kernel command
line, as documented in the CONFIG_DEVTMPFS_MOUNT menuconfig help text.

The kernel already provides multiple workarounds for Ubuntu's bug, and
the issue only hits people who are manually building a new kernel from
source. If ubuntu provides a new kernel, I assume they'll tweak their
config _and_ fix their initramfs error path (which is just plain wrong).

> If you say that currently CONFIG_DEVTMPFS_MOUNT is ignored by kernel,

It's not _me_ saying it, it's the kernel doing it. The patch is
conceptually a straightforward fix on the kernel side to make it _not_
ignore that symbol in that context.

> I think you cannot relay on it anymore because people may have it
> enabled or disabled randomly.

I expected configs would have it randomly set, but the bug here is a
broken error path that does something actively harmful rather than going
"oh, we got a static /dev from somewhere, let's just leave it alone".
This error path goes out of its way to blank the contents of /dev by
mounting an empty tmpfs over it and leaving it empty, and then
complaining that /dev is blank _because_it_blanked_it_.

If Ubuntu meant to intentionally halt the proceedings the script could
have done that explicitly. What did the author of that error path think
would happen, exactly?

> So the proper way is to remove broken
> config option and introduce new one. BTW, I see it is used once in
> drivers/base/devtmpfs.c.

How does removing the broken config option (or renaming it to
CONFIG_DEFTMPFS_UBUNTU_IS_BROKEN) _not_ impact systems that were
previously happily using it in the contexts where it already worked?

If it's too much to ask people to switch it off when it was previously
on (but shouldn't have been), how is asking them to manually switch it
back on when it was previously on and needs to stay on better? (And if
you arrange it so "make oldconfig" migrates the old symbol to the new
one automatically, how would that work around the broken error path in
ubuntu's initramfs script? The rename becomes a NOP.)

If you're saying it should default to "n" I can send a patch. If you
want me to tweak every arch/*/configs file that redundantly includes the
same darn symbol, I can do that too. (Makes the patch big but it's just
a sed invocation to do it.)

>> By the way, _why_ are you mounting a tmpfs over /dev on _initramfs_?
>> That can already be tmpfs. (Commits 137fdcc18a59 through 6e19eded3684.)
>
> Have no idea. This is how default Ubuntu works.

Can you switch this symbol off in your kernel config and see if it boots
then? (Or are you using some sort of ubuntu kernel build tool that can
build -rc2 but not let you change config symbols? I still have no idea
how you're building this because you haven't said.)

I don't know how the kernel higher-ups want to deal with this because
it's more a political issue than a technical one. Enabling
DEVTMPFS_MOUNT for initramfs triggers a pair of bugs in Ubuntu: their
config was wrong and their error checking was wrong. Fixing _either_
would let the system boot, and there's also the command line workaround.
The issue only affects people who manually install new kernels on an old
system using the old (broken) config verbatim, and the simplest fix is
to flip the config symbol in menuconfig.

If you're building a defconfig the kernel supplies, we can edit that to
work around Ubuntu's bug. If you're building from a config ubuntu
supplies, _they_ have to edit it and then maybe bumping it a release
until Ubuntu can fix its breakage sounds reasonable. But so does letting
people manually adjust their configs when they manually upgrade the
kernel outside the normal ubuntu kernel uprade process. (When open
source breaks, you get to keep the pieces.)

> Yury

Rob