Re: [RFC PATCH v2] x86/boot: add .sbat section to the bzImage

From: Luca Boccassi
Date: Wed Jul 12 2023 - 18:32:28 EST


On Wed, 12 Jul 2023 at 22:22, Willy Tarreau <w@xxxxxx> wrote:
>
> Hello,
>
> On Wed, Jul 12, 2023 at 09:41:23PM +0100, Luca Boccassi wrote:
> > > Also note that "single identifiers for individual issues" do NOT work
> > > for kernel fixes (and arguably do not work for any other software
> > > project either) as they fail to properly describe things.
> > >
> > > Think about Meltdown, one "identifier" of a CVE, and hundreds of
> > > patches. What if you happened to not backport one of them?
> > >
> > > Same goes for the issue reported last week or so, tens of fixes, over
> > > multiple stable kernel releases, for one "identifier", how would you
> > > have classified that?
> > >
> > > Anyway, I've been over this loads before, giving whole talks about this,
> > > there's a reason the kernel developers don't mess with CVEs (i.e.
> > > individual identifiers), they fail to work.
> >
> > There is no 'single identifier for individual issues' nor CVE involved
> > here. The purpose of the generation id (which is per EFI component,
> > not per bug) is to let the boot process know whether an EFI component
> > should be accepted or rejected, in a way that doesn't exhaust nvram.
> > Issues are not individually singled out, and there is no direct
> > correlation with CVEs. It doesn't matter how many fixes there are, or
> > how many bugs, if a generation of a component is vulnerable in any way
> > that matters, then it gets denied.
>
> I refrained from chiming in but I'm really reading shocking stuff here,
> so please let me make a few comments based on some old experience.
>
> Several times in this thread you seemed to imply that there is "someone" or
> "something" that knows whether or not a kernel is absolutely vulnerable
> or absolutely trustable regarding a certain bug, when developers
> themselves only have an estimate about it, whose probability quickly
> fades away with the depth of backports.

There is no such implication. This is about _known_ good state,
nothing absolute about that.
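
To make "known good" concrete, here is a rough sketch of the kind of
check I mean. It is illustrative only, not shim's actual code: the
function name and the dict layout are made up, and the rule assumed is
simply that a component is denied when its generation is lower than
the minimum currently required for that component name.

```python
# Hypothetical sketch: names and data layout are illustrative only.
def is_accepted(component_generations, required_minimums):
    """Return True unless some component is below its required minimum.

    component_generations: {component name: generation found in the binary}
    required_minimums:     {component name: lowest generation still allowed}
    """
    for name, generation in component_generations.items():
        minimum = required_minimums.get(name)
        # Components with no revocation entry are not restricted.
        if minimum is not None and generation < minimum:
            return False
    return True


# One entry per component, no matter how many bugs led to the bump.
required = {"grub": 3}
old_binary = {"grub": 2, "linux": 1}
new_binary = {"grub": 3, "linux": 1}
```

Nothing in there claims that a given generation is free of bugs; it
only records what is currently known to be bad.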

> When I was in charge of extended 2.6.32 many years ago, the Debian kernel
> team helped me by occasionally sending me series of backports of fixes
> for issues I had missed, and fixes for backports I had failed. That's the
> principle of maintenance: adding incremental fixes that make the whole
> code better.
>
> With your process (OK you said it's not yours, but then why adopt it when
> it doesn't match the workflow of the software it tries to adapt to), how
> would we proceed ? "Let's bump this ID now that the new 2.6.32.233 has
> everything fixed". Or rather "let's *not* bump it because nobody knows
> how to backport this other stuff that's blocking the ID increment". Then
> once finally bumped, one month later we figure that the fixes were still
> incorrect due to important differences in the older branches, and have
> to be fixed again, so according to what I understand, we must then
> immediately revoke the current ID, that is also shared by upstream and
> all correctly fixed maintenance branches, and have to emit a new one
> for all branches at once even if the code didn't change, just because
> myself incompetent stable maintainer of the week-end failed to fix
> something non-obvious at once ? If so, I'm sorry but this is non-sense.
> There must be another approach to this or it was designed by someone
> having never met a bug in person!

The other approach is fine-grained revocations, but as already
explained that's the status quo and demonstrably cannot work for this
problem. Coarse-grained revocations have some drawbacks, and yes if
you screw up hard enough it might need a re-roll, and if the screw-up
is really bad and in an upstream component, then everyone gets a
do-over. Tough luck! Screw-ups happen, but they are not the end of
the world. Guess what, that happened in Debian - the downstream Grub
generation id was bumped but due to tooling issues the actual binaries
did not have the required patches applied, so another bump was needed
immediately after. I can confirm that the sky did not fall as a
consequence of that. Besides, the vast majority of the work involved
here is with the people doing tracking and coordination, not with
kernel developers, that's the good news for you.

You could devise a scheme, still allowed by the protocol, where each
branch gets its own component name, so they all have separate
generation ids. Here's one of the problems with that: instead of being
(for all intents and purposes) fixed in size, the revocation list
would instead grow by (number of releases) * (number of distributions)
lines every year, forever. It would also require a lot more
coordination and tracking work. Is it possible? Yes. Is it worth it
just to answer a strawman case? No.
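
As a back-of-the-envelope sketch of that growth (the figures used in
the example are invented for illustration, not real counts of releases
or distributions, and the function names are made up):

```python
def shared_scheme_lines(components):
    # One revocation line per component name, however many years pass.
    return components


def per_branch_scheme_lines(years, releases_per_year, distributions):
    # Each branch in each distribution gets its own component name,
    # so every year adds releases_per_year * distributions new lines.
    return years * releases_per_year * distributions


# Invented example figures: 6 releases a year, 10 distributions, 5 years.
shared = shared_scheme_lines(components=1)
per_branch = per_branch_scheme_lines(years=5, releases_per_year=6,
                                     distributions=10)
```

The first number stays put; the second only ever goes up, and the
list has to fit in nvram.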

> What I'm also wondering is, if in the end it turns out that only the
> distro has the skills to decide which kernel version is fixed and which
> one isn't (after all, it's the distro who chooses the config and the
> compiler, both are as much involved in bugs as the code itself), then
> why not make sort of a wrapper or an envelope around an existing kernel
> image, which provides this ID that the distro can freely choose, then
> transfer the control to the embedded kernel image ? This might give
> the distro the freedom to proceed as it wants with no cross-dependency
> on kernel branches.

The issue there is tracking - revocations are global, not per-distro.
So you need to ensure that the upstream component generation id is the
same everywhere. What's the best way to ensure this coherency, if not
by storing it in the upstream component tree directly? Of course it
can be patched downstream, but that imposes a lot of busywork on
everyone, since they'll need out-of-band management. Whereas if you
don't care about security, secure boot and/or sbat upstream, having it
in the tree costs nothing, apart from applying patches when they are sent.
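
For illustration, a sketch of what that coherency looks like when the
upstream line comes from the tree and each distro only appends its own
line. The entries are hypothetical (component names, versions and URLs
are invented), and the assumed layout is the CSV-style one where the
first two fields are the component name and its generation:

```python
import csv


def parse_sbat(text):
    """Map component name -> generation from CSV-style .sbat lines."""
    entries = {}
    for row in csv.reader(text.strip().splitlines()):
        if len(row) >= 2:
            entries[row[0].strip()] = int(row[1])
    return entries


# Hypothetical contents: the first line is shared with upstream,
# the second is the distro's own downstream component.
DISTRO_A = """\
linux,1,Example Upstream,linux,6.5,https://upstream.example
linux.distro_a,4,Distro A,linux,6.5-1,https://distro-a.example
"""
DISTRO_B = """\
linux,1,Example Upstream,linux,6.5,https://upstream.example
linux.distro_b,2,Distro B,linux,6.5-3,https://distro-b.example
"""

# A global revocation on "linux" means the same thing for both
# only because the upstream generation is identical.
coherent = parse_sbat(DISTRO_A)["linux"] == parse_sbat(DISTRO_B)["linux"]
```

If each distro had to patch that first line in by hand, keeping
`coherent` true would be exactly the out-of-band busywork I mean.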

> > > Pointing to an external document that is thousands of lines long,
> > > talking about bootloaders, is NOT a good way to get people to want to
> > > accept a kernel patch :)
> >
> > Then how about just asking for that? "Hello submitter, please send a
> > v2 with a detailed summary of the problem being solved for those of us
> > who are not familiar with it, thank you"
>
> Probably that there was a problem with process in the first place by
> which someone asks some maintainers to accept to merge, maintain and
> become responsible for breaking changes they disagree with, without
> having even been presented to them before being developed ?

I don't think that's what [RFC] means.