Re: [PATCH 0/1] RFC: Revamp admin-guide/tainted-kernels.rst to make it more comprehensible

From: Thorsten Leemhuis
Date: Mon Jan 07 2019 - 13:56:54 EST


On 03.01.19 at 19:12, Jonathan Corbet wrote:
> On Fri, 21 Dec 2018 16:26:31 +0100
> Thorsten Leemhuis <linux@xxxxxxxxxxxxx> wrote:
>>> Here's an idea if you feel like improving this: rather than putting an
>>> inscrutable program inline, add a taint_status script to scripts/ that
>>> prints out the status in fully human-readable form, with the explanation
>>> for every set bit.
>> I posted the script earlier today and noticed now that it prints only
>> the fully human-readable form, not whether a bit is set or unset. Would you
>> prefer if it did that as well?
> Not sure I have an opinion; perhaps if it can be done in a readable way
> putting more information is better than less.

I think I found a way; the script output now looks like this:

Kernel is Tainted for following reasons:
* Proprietary module was loaded (#0)
* Kernel issued warning (#9)
* Externally-built ('out-of-tree') module was loaded (#12)
For a more detailed explanation of the various taint flags see
Documentation/admin-guide/tainted-kernels.rst in the Linux kernel sources
or https://kernel.org/doc/html/latest/admin-guide/tainted-kernels.html
Raw taint value as int/string: 4609/'P W O
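
For illustration only (this is not the script from the patch series): the
decoding shown above can be sketched in a few lines of Python. The name
decode_taint and the TAINT_FLAGS list are hypothetical; the letters and short
descriptions are copied from the table in the kernel.txt patch at the end of
this mail, and bit numbers are zero-based.

```python
# Illustrative sketch only; NOT the taint_status script from the patch series.
# TAINT_FLAGS and decode_taint are hypothetical names. The list index is the
# zero-based bit number; letters and reasons follow the kernel.txt table.
TAINT_FLAGS = [
    ("P", "proprietary module was loaded"),
    ("F", "module was force loaded"),
    ("S", "SMP kernel oops on an officially SMP incapable processor"),
    ("R", "module was force unloaded"),
    ("M", "processor reported a Machine Check Exception (MCE)"),
    ("B", "bad page referenced or some unexpected page flags"),
    ("U", "taint requested by userspace application"),
    ("D", "kernel died recently, i.e. there was an OOPS or BUG"),
    ("A", "an ACPI table was overridden by user"),
    ("W", "kernel issued warning"),
    ("C", "staging driver was loaded"),
    ("I", "workaround for bug in platform firmware applied"),
    ("O", "externally-built ('out-of-tree') module was loaded"),
    ("E", "unsigned module was loaded"),
    ("L", "soft lockup occurred"),
    ("K", "kernel has been live patched"),
    ("X", "auxiliary taint, defined for and used by distros"),
    ("T", "kernel was built with the struct randomization plugin"),
]


def decode_taint(value):
    """Return a list of (bit, letter, reason) for every bit set in value."""
    return [
        (bit, letter, reason)
        for bit, (letter, reason) in enumerate(TAINT_FLAGS)
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # 4609 is the raw value from the example output above.
    for bit, letter, reason in decode_taint(4609):
        print(f"* {reason} (#{bit}, '{letter}')")
```

On a running system the integer would normally be read from
/proc/sys/kernel/tainted; the sketch takes it as an argument so it can be tried
on any machine.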

>>>> +=== === ====== ========================================================
>>>> +Bit Log Int Reason that got the kernel tainted
>>>> +=== === ====== ========================================================
>>>> + 1) G/P 0 proprietary module got loaded
>>> I'd s/got/was/ throughout. Also, this is the kernel, we start counting at
>>> zero! :)
>> Hehe, yeah :-D At first I actually started at zero, but that looked
>> odd as the old explanations (those already in the file) start to count at one.
>> Having an off-by-one within one document is just confusing, that's why I
>> decided against starting at zero here.
>> Another reason that came to my mind when reading your comment: Yes, this
>> is the kernel, but the document should be easy to understand even for
>> inexperienced users (e.g. people that know how to open and use command
>> line tools, but never learned programming). That's why I'm leaning towards
>> starting with one everywhere. But yes, that can be confusing, that's
>> why I added a note, albeit I'm not really happy with it yet:
>> """
>> Note: This document is aimed at users and thus starts to count at one here and
>> in other places. Use ``seq 0 17`` instead to start counting at zero, as it's
>> normal for developers.
>> """
>> See below for full context. Anyway: I can change the text to start at zero if
>> you prefer it.
> This is a kernel document in the end, so I do really think that we should
> be consistent with kernel conventions.

Okay. I still don't like it, but well, maybe you are right. And in the
end we can change it easily later if we want to.
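
To make the zero-based convention concrete, here is a minimal sketch (the
helper name taint_value is hypothetical and not part of the patch): with
zero-based numbering, bit n corresponds directly to the value 1 << n in
/proc/sys/kernel/tainted, whereas a one-based position needs an extra "minus
one".

```python
# Minimal sketch; taint_value is a hypothetical helper, not part of the patch.
def taint_value(bit):
    """Map a zero-based taint bit number to its integer value."""
    return 1 << bit


if __name__ == "__main__":
    # Bits 0, 9 and 12 are the ones set in the example output (raw value 4609).
    for bit in (0, 9, 12):
        print(f"bit {bit:2d} -> {taint_value(bit)}")
    print("sum:", sum(taint_value(b) for b in (0, 9, 12)))
```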

> [...]
>> 3) ``S`` if the oops occurred on an SMP kernel running on hardware that
>> hasn't been certified as safe to run multiprocessor.
>> Currently this occurs only on various Athlons that are not
>> SMP capable.
> I wonder if any such hardware has ever run anything remotely resembling a
> current kernel. In any case, a quick grep suggests that this taint can be
> set in a number of other places as well.

I looked into this and...

> [...]
>> 11) ``C`` if a staging driver has been loaded.
> There's a couple of other situations where this one is set as well; not
> sure if it's worth the trouble to try to describe them.

...this, but decided that it would take things too far for now. So I'll leave
those entries as they are, but will take a closer look later and start a
dedicated discussion with the relevant parties that use those flags.

>> 17) ``X`` Auxiliary taint, defined for and used by Linux distributors.
> Do we know anything about whether anybody uses this?

Seems SUSE does: https://www.suse.com/de-de/support/kb/doc/?id=3582750
Or at least did in the not too distant past, which I'd say is good enough
for now.

> [...]
> Overall, just nits except for the start-with-zero thing.

All the other nits, which I stripped from this reply, are fixed; I will send
out an updated patch series tomorrow.

Side note FYI: While at it I decided to update the tainted section in
Documentation/sysctl/kernel.txt and reuse the short descriptions
used in the table of the revamped tainted-kernels.rst, which results
in the patch at the end (sigh, this patch is slowly getting too big):

Ciao, Thorsten

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 1b8775298cf7..8e1c21e1fdf6 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -93,7 +93,7 @@ show up in /proc/sys/kernel:
- stop-a [ SPARC only ]
- sysrq ==> Documentation/admin-guide/sysrq.rst
- sysctl_writes_strict
-- tainted
+- tainted ==> Documentation/admin-guide/tainted-kernels.rst
- threads-max
- unknown_nmi_panic
- watchdog
@@ -1005,36 +1005,31 @@ compilation sees a 1% slowdown, other systems and workloads may vary.

==============================================================

-tainted:
+tainted

Non-zero if the kernel has been tainted. Numeric values, which can be
ORed together. The letters are seen in "Tainted" line of Oops reports.

- 1 (P): A module with a non-GPL license has been loaded, this
- includes modules with no license.
- Set by modutils >= 2.4.9 and module-init-tools.
- 2 (F): A module was force loaded by insmod -f.
- Set by modutils >= 2.4.9 and module-init-tools.
- 4 (S): Unsafe SMP processors: SMP with CPUs not designed for SMP.
- 8 (R): A module was forcibly unloaded from the system by rmmod -f.
- 16 (M): A hardware machine check error occurred on the system.
- 32 (B): A bad page was discovered on the system.
- 64 (U): The user has asked that the system be marked "tainted". This
- could be because they are running software that directly modifies
- the hardware, or for other reasons.
- 128 (D): The system has died.
- 256 (A): The ACPI DSDT has been overridden with one supplied by the user
- instead of using the one provided by the hardware.
- 512 (W): A kernel warning has occurred.
- 1024 (C): A module from drivers/staging was loaded.
- 2048 (I): The system is working around a severe firmware bug.
- 4096 (O): An out-of-tree module has been loaded.
- 8192 (E): An unsigned module has been loaded in a kernel supporting module
- signature.
- 16384 (L): A soft lockup has previously occurred on the system.
- 32768 (K): The kernel has been live patched.
- 65536 (X): Auxiliary taint, defined and used by for distros.
-131072 (T): The kernel was built with the struct randomization plugin.
+ 1 (P): proprietary module was loaded
+ 2 (F): module was force loaded
+ 4 (S): SMP kernel oops on an officially SMP incapable processor
+ 8 (R): module was force unloaded
+ 16 (M): processor reported a Machine Check Exception (MCE)
+ 32 (B): bad page referenced or some unexpected page flags
+ 64 (U): taint requested by userspace application
+ 128 (D): kernel died recently, i.e. there was an OOPS or BUG
+ 256 (A): an ACPI table was overridden by user
+ 512 (W): kernel issued warning
+ 1024 (C): staging driver was loaded
+ 2048 (I): workaround for bug in platform firmware applied
+ 4096 (O): externally-built ("out-of-tree") module was loaded
+ 8192 (E): unsigned module was loaded
+ 16384 (L): soft lockup occurred
+ 32768 (K): kernel has been live patched
+ 65536 (X): Auxiliary taint, defined for and used by distros
+131072 (T): The kernel was built with the struct randomization plugin
+
+See Documentation/admin-guide/tainted-kernels.rst for more information.

==============================================================