Re: [PATCH 00/45] C++: Convert the kernel to C++

From: Michael de Lang
Date: Fri Jan 12 2024 - 16:58:53 EST

Next message: Ben Wolsieffer: "Re: [PATCH 1/2] clk: stm32: initialize syscon after clocks are registered"
Previous message: Joel Granados: "Re: [PATCH v2 0/6] IOMMUFD: Deliver IO page faults to user space"
In reply to: Michael de Lang: "Re: [PATCH 00/45] C++: Convert the kernel to C++"
Next in thread: John Hubbard: "Re: [PATCH 00/45] C++: Convert the kernel to C++"

Thanks for your reply.

Namely, to prevent stagnation for the Kernel as well as continue to be interesting to new developers.

Which stagnation are you talking about, exactly ?

While I do not know exactly what Linus was thinking about when he mentioned stagnation, I assume he was looking at it through the lens of long-term maintainers. I'm basing this on the 2021 discussion on LWN: https://lwn.net/Articles/870581/. Obviously there are plenty of contributors to every kernel release, and while I don't have any numbers, I don't think the number of contributors or contributions is the issue.

Still, the idea of C discouraging people from contributing resonates with me. That is largely subjective, so feel free to ignore it.

While I've got a long list of ideas for modernizing the kernel
(which I'm lacking time to actually work on), I'm unsure whether
C++ really would be of much benefit. Especially considering that for
many things there's no way to know / define how things will really
look like on binary level.

Do you have any examples of what exactly in C++ obfuscates the resulting binary? Everything I can think of also applies to C: anything implementation-defined, e.g. struct layout or high-order bit propagation for shift operations.
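As a rough illustration of the struct layout and shift points, here is a minimal sketch of how both can be pinned down at compile time. The same checks are available in C11 via _Static_assert. The struct dma_desc, its fields and the helper top_bits() are made up for this example and do not come from the kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical hardware descriptor; the name and fields are made up.
struct dma_desc {
    std::uint32_t addr_lo;
    std::uint32_t addr_hi;
    std::uint16_t len;
    std::uint16_t flags;
};

// Padding and member offsets are decided by the ABI in C and C++ alike,
// so check them at compile time instead of trusting the compiler.
static_assert(sizeof(dma_desc) == 12, "unexpected padding in dma_desc");
static_assert(offsetof(dma_desc, len) == 8, "len is not at offset 8");
static_assert(offsetof(dma_desc, flags) == 10, "flags is not at offset 10");

// Right-shifting a negative signed value is implementation-defined in C
// (and was in C++ before C++20). Shifting an unsigned value always fills
// with zeros, so use unsigned types when the bit pattern matters.
// Precondition: count is in the range 1..32.
constexpr std::uint32_t top_bits(std::uint32_t value, unsigned count)
{
    return value >> (32u - count);
}

inline void layout_demo()
{
    assert(top_bits(0xA0000000u, 4) == 0xAu);
}
```

If a compiler or ABI lays the struct out differently, the build fails instead of producing a silently different binary.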

There are things in the STL that are implementation defined, but the proposal excludes the STL.

Personally, the opposite had been one of my primary reasons.
Because it's so simple to understand - in contrast to the usual C++
monsters I've seen so often in the wild. (I usually try to keep far
away from C++ projects).

I have never understood the sentiment that C is supposedly simple. Looking at the macros used in the kernel is one obvious big argument against using C, as macros can be considered their own language-inside-a-language. Another big argument against the sentiment is the loose type system, where void* casts are everywhere you want to do anything remotely type-generic, losing type information and making it harder to grok the original intent.
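To make the void* point concrete, here is a small sketch that puts a type-erased lookup next to a templated one. All names (find_erased, cmp_int, find_typed) are hypothetical and exist only for this example; this is not kernel code.

```cpp
#include <cassert>
#include <cstddef>

// C style: the element type is erased. The caller has to pass the right
// element size and a matching comparator; the compiler cannot check it.
inline const void *find_erased(const void *base, std::size_t count,
                               std::size_t elem_size, const void *key,
                               int (*cmp)(const void *, const void *))
{
    const unsigned char *p = static_cast<const unsigned char *>(base);
    for (std::size_t i = 0; i < count; ++i, p += elem_size)
        if (cmp(p, key) == 0)
            return p;
    return nullptr;
}

inline int cmp_int(const void *a, const void *b)
{
    int x = *static_cast<const int *>(a);
    int y = *static_cast<const int *>(b);
    return (x > y) - (x < y);
}

// Template style: element type and array length are deduced, so a
// mismatched key type or element size is a compile error.
template <typename T, std::size_t N>
const T *find_typed(const T (&arr)[N], const T &key)
{
    for (const T &elem : arr)
        if (elem == key)
            return &elem;
    return nullptr;
}

inline void generic_demo()
{
    int values[] = {3, 5, 7};
    int key = 5;
    assert(find_erased(values, 3, sizeof(int), &key, cmp_int) == &values[1]);
    assert(find_typed(values, 5) == &values[1]);
}
```

Passing sizeof(long) or a comparator for a different type to find_erased() compiles without complaint; the equivalent mistake with find_typed() does not compile.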

Creating a compiler for C is 'easier' than creating one for C++ (or Rust for that matter), but coding in it as a user requires years of experience to avoid a lot of the pitfalls. A simple language would be something like golang, with its GC and prescribed coding patterns.

C is a language to be (ab)used like any other, and the same goes for C++. The kernel has shown that it is possible to create maintainable C; I feel confident saying that it is also possible in C++.

> Note that C++ is a very complex language,
> and w/ STL it's even much, much more complex.

Note that the proposal here is to use C++ without the STL as well as apply some other restrictions.

Can't judge what you see as interesting, but frankly, I really don't
have it on my list of interesting things - instead would prefer phasing
C++ out in favour of many other languages.

I could give you concrete examples of C++ language additions, but I don't think that adds much to the discussion. Many languages, including C++, have additions that C does not have and that provide benefits such as reduced cognitive load and standardised ways to do things (preventing NIH syndrome), and that may enthuse more people to contribute to the kernel.

The biggest merit of using C++ in the kernel is that, in comparison to other systems languages (Zig, Rust, Swift to name a few), it requires the least re-skilling of existing contributors. A close second would be the low barrier to integrating C++ and C codebases, especially when taking into account the architectures that the kernel needs to support compared with what the other languages support. Even Rust, with its big push towards being a replacement, isn't there yet today (e.g. PA-RISC).

other languages, unlike C. The aforementioned metaprogramming is one

Metaprogramming can be very interesting indeed - Oberon once made a really good show case, but I wouldn't dare trying that in kernel space.
And it's hard to do that w/o causing extra performance penalties.

I believe this is a case of having to try it first before being able to decisively say anything about the impact. Counter-examples have been mentioned elsewhere in the thread.

such example, but things like RAII, smart pointers and things like gsl::not_null would reduce the chances of kernel bugs, especially memory safety related bugs that are known to lead to security issues.

These are exactly the things I would prefer keeping out of kernel space.
Indeed there're several areas where it could be nice, but there're
others where we really can't take it.

As you mention yourself, there are places where such constructs would be a boon and places where we should not apply them. I have faith in the kernel's processes to weed out uses where they do not belong, as is presumably done already for certain C constructs today.
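For readers less familiar with RAII, a minimal sketch of the pattern follows. The fake_lock type and its acquire/release helpers are stand-ins invented for this example; they are not kernel locking primitives, and a real guard would wrap whatever lock API is in use.

```cpp
#include <cassert>

// Hypothetical stand-in for a lock; only tracks whether it is held.
struct fake_lock {
    bool held = false;
};

inline void fake_lock_acquire(fake_lock &l) { l.held = true; }
inline void fake_lock_release(fake_lock &l) { l.held = false; }

// RAII guard: acquires in the constructor, releases in the destructor.
class lock_guard_sketch {
public:
    explicit lock_guard_sketch(fake_lock &l) : lock_(l)
    {
        fake_lock_acquire(lock_);
    }
    ~lock_guard_sketch() { fake_lock_release(lock_); }

    // A guard owns exactly one acquisition, so copying is disabled.
    lock_guard_sketch(const lock_guard_sketch &) = delete;
    lock_guard_sketch &operator=(const lock_guard_sketch &) = delete;

private:
    fake_lock &lock_;
};

// Every return path releases the lock without an explicit unlock call
// or a goto-based cleanup label.
inline int update_counter(fake_lock &l, int &counter, int limit)
{
    lock_guard_sketch guard(l);
    if (counter >= limit)
        return -1; // early return: the destructor still releases
    ++counter;
    return 0;
}

inline void raii_demo()
{
    fake_lock l;
    int counter = 0;
    assert(update_counter(l, counter, 1) == 0);
    assert(!l.held);
}
```

The guard compiles down to the same acquire and release calls one would write by hand; what changes is that forgetting the release on an error path is no longer possible.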

On the other hand, the benefits I mention can also turn into downsides: if constructs like gsl::not_null are desired, does that mean that there

this seems to be pretty much an assert() - obviously something we really
cannot have in the kernel.

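gsl::not_null rejects construction from a literal nullptr at compile time, and any function that takes a gsl::not_null parameter can rely on the pointer being valid, so an assert() inside that function would be superfluous. A pointer whose value is only known at run time is still checked, but only once, at the point where the not_null is constructed, rather than in every function it is passed to. To me it is an example of a C++ construct with upsides and very little downside.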
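A heavily simplified sketch of the idea behind gsl::not_null is shown here. It is not the GSL implementation: the class not_null_sketch, the device struct and device_id() are hypothetical, and the real gsl::not_null is parameterised on the pointer type rather than the pointee. The sketch rejects a literal nullptr at compile time and checks other pointer values once, at construction; std::abort() stands in for whatever a kernel build would do on a contract violation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

template <typename T>
class not_null_sketch {
public:
    // Pointer values only known at run time are checked once, here.
    explicit not_null_sketch(T *p) : ptr_(p)
    {
        if (ptr_ == nullptr)
            std::abort();
    }

    // A literal nullptr selects this overload and fails to compile.
    not_null_sketch(std::nullptr_t) = delete;

    T *get() const { return ptr_; }
    T &operator*() const { return *ptr_; }
    T *operator->() const { return ptr_; }

private:
    T *ptr_;
};

struct device { // hypothetical example type
    int id;
};

// The signature documents and enforces that dev is never null, so the
// body needs no NULL check of its own.
inline int device_id(not_null_sketch<device> dev)
{
    return dev->id;
}

// Compile-time proof that the nullptr literal is rejected.
static_assert(!std::is_constructible<not_null_sketch<device>,
                                     std::nullptr_t>::value,
              "nullptr must not be accepted");

inline void not_null_demo()
{
    device d{7};
    assert(device_id(not_null_sketch<device>(&d)) == 7);
}
```

Whether the construction-time check is acceptable in a given kernel path is a separate question; the point is that the check lives in one place and callees can drop theirs.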

will be a kernel-specific template library? A KTL instead of STL? That might be yet another thing that increases the steepness of the kernel development learning curve.

Most likely we'd need our own kernel specific library. (we also have one
instead of libc). Some simple pieces might look similar to STL on the
front, but it would have to be very different from userland.

At that point, your previous argument about attracting more people
who're already used to / like C++ breaks down, because it wouldn't be
that C++ as usual C++ devs know it (IIRC, STL is integral part of the
standard), but just the core lang plus some very custom template lib.

It's not that the argument breaks down, it's that it applies to a smaller, but still greater than 0, target audience. There are plenty of C++ programmers out there that disable the STL on purpose: game developers, automotive engineers that I know and so on. You're going to be hard-pressed to find concrete numbers, but the fact that the EASTL and ETL exist shows the proliferation of non-STL C++ and that the STL itself is not an integral part of C++. I recommend you check out ETL specifically, I'm sure you'll be amazed at how much functionality it has, especially geared for the embedded world.

Although compiler-specific, C++20 has enabled implementing RTTI without RTTI as well as (partial) reflection.

You name it: compiler specific.

Is it even specified how this exactly looks at binary level, and methods
to control the exact binary data structures ?

The least that is needed to implement such things is some pointer or tag
inside each struct/object instance - this would change struct layouts!
Note that we often use structs to reflect HW specific data structures,
so we'd need a way to have exact control over this. And then we need to
be very careful on which instances have RTTI and which ones don't.
I see debugging nightmares on the horizon ...

I could be convinced that RTTI of any sort is just a bad idea in the kernel. It is one of the things that is first to be disabled in embedded C++ usage, alongside exceptions. Still, it has its uses even in those areas, but that's outside of the scope of this proposal I think.

On top of increasing the binary size,

That's also a huge problem:

Templates in general have the strong tendency of producing lots of
duplicated code. That's what they're designed for: expressing similar
things (that have to be different on binary level) by the same
generic source code.

It might be possible to write them in a way they don't increase binary
size, but that's not entirely trivial, and so the actual gain of all
of that becomes questionable again.

Hmm, explicit template instantiations are an 'easy' fix to taming the code bloat, but any use of templates is going to mean _some_ extra code generation. I do not have any concrete Kernel examples here, but I'm sure there are switch/case statements somewhere in there that can be optimized away by using templates. For those, the question is: code bloat or run-time performance?
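A small sketch of both halves of that trade-off, assuming C++17 for if constexpr. The enum width and the functions mask_runtime() and mask_static() are invented for this example. The templated version moves the switch to compile time, at the cost of one function body per instantiated value; the explicit instantiation definitions at the end show how the set of generated bodies can be fixed in a single translation unit (paired with extern template declarations in a header, which are only described in a comment here).

```cpp
#include <cassert>
#include <cstdint>

enum class width { w8, w16, w32 };

// Run-time dispatch: one function body, one branch per call.
inline std::uint32_t mask_runtime(width w, std::uint32_t v)
{
    switch (w) {
    case width::w8:
        return v & 0xffu;
    case width::w16:
        return v & 0xffffu;
    case width::w32:
        return v;
    }
    return v;
}

// Compile-time dispatch: the branch disappears, but every distinct W
// that is used produces its own function body.
template <width W>
std::uint32_t mask_static(std::uint32_t v)
{
    if constexpr (W == width::w8)
        return v & 0xffu;
    else if constexpr (W == width::w16)
        return v & 0xffffu;
    else
        return v;
}

// Explicit instantiation definitions: these bodies are emitted here.
// A header would carry matching "extern template" declarations so that
// other translation units reuse them instead of generating duplicates.
template std::uint32_t mask_static<width::w8>(std::uint32_t);
template std::uint32_t mask_static<width::w16>(std::uint32_t);

inline void mask_demo()
{
    assert(mask_runtime(width::w8, 0x1234u) == 0x34u);
    assert(mask_static<width::w8>(0x1234u) == 0x34u);
}
```

Which variant wins depends on how hot the call site is and how many distinct values of W actually occur, which is why this needs measuring rather than guessing.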

this then becomes a discussion on what requirements the kernel puts on compilers, as I'm sure that the kernel needs to be compiled for architectures which have a less than stellar conformance to the C++ specification.

Indeed. Also think about embedded environments, where folks can't easily
upgrade toolchains (e.g. due to regulatory constraints)

This argument also applies against using Rust and is directly opposed to modern security practices. Updating to the latest version for OS/compilers/libraries etc is pretty much a given since UN R155 and UN R156 came into effect. Though those apply only to automotive so far, the Cyber Resilience Act is going to force manufacturers of all kinds to adhere to better security. There is definitely a whole debate we can have just on the impacts of these regulations and what that should mean, but I've already written a lot ;)

Cheers,
Michael de Lang
