Re: [PATCH 00/45] C++: Convert the kernel to C++

From: John Hubbard
Date: Wed Jan 10 2024 - 23:24:37 EST


On 1/9/24 11:57, H. Peter Anvin wrote:
Hi all, I'm going to stir the hornet's nest and make what has become the ultimate sacrilege.

Andrew Pinski recently made me aware of this thread. I realize it was released on April 1, 2018, and either was a joke or might have been taken as one. However, I think there is validity to it, and I'm going to try to motivate my opinion here.


In 2018 it may have been taken as a joke, but in 2024 with Rust for Linux
upon us, C++ sounds just plain brilliant. Thank you so much for this proposal.

Both C and C++ have had a lot of development since 1999, and C++ has in fact, in my personal opinion, finally "grown up" to be a better C for the kind of embedded programming that an OS kernel epitomizes. I'm saying that as the author of a very large number of macro and inline assembly hacks in the kernel.

What really makes me say that is that a lot of things for which we have recently requested gcc-specific extensions are in fact relatively easy to implement in standard C++ and, in many cases, allow for infrastructure improvement *without* global code changes (see below).

C++14 is in my opinion the "minimum" version that has reasonable metaprogramming support without the type hell of earlier versions (C++11 had most of it, but C++14 fills in some key missing pieces).

However, C++20 is really the main game changer in my opinion; although earlier versions could play a lot of SFINAE hacks, they also gave absolutely useless barf as error messages. C++20 adds concepts, which make it possible to actually get reasonable errors.

I was writing a lot of C++ in the late 1990's and early 2000's, and personally
lived through the template error madness in particular. Verity Stob had a
wonderful riff on it in her 2001 "Double Plus Good?" article [1].

But one thing I do wonder about is the template linker bloat that was
endemic: multiple instantiations of templates were not de-duplicated
by the linkers of the day, and things were just huge. 20 years later,
perhaps it is all better now, I hope?


We do a lot of metaprogramming in the Linux kernel, implemented with some often truly hideous macro hacks. These are also virtually impossible to debug. Consider the uaccess.h type hacks, some of which I designed and wrote. In C++, the various casts and case statements can be unwound into separate template instances, and with some cleverness can also strictly enforce things like user space vs kernel space pointers as well as already-verified versus unverified user space pointers, not to mention easily handle the case of 32-bit user space types in a 64-bit kernel and make endianness conversion enforceable.


This sounds glorious.
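
As a rough illustration of the "verified versus unverified user space
pointer" idea quoted above, here is a small sketch. Everything in it
(tagged_ptr, user_ptr, checked_user_ptr, check_user_ptr, read_user) is
a hypothetical name invented for this example, not the kernel's uaccess
API, and the "check" is only a null test standing in for a real range
check. It is written to build as C++14 or later.

```cpp
// Hypothetical tag types: they carry no data, only a compile-time label.
struct unverified_user {};
struct verified_user {};

// Thin wrapper around a raw pointer. It deliberately has no operator*
// or operator->, so it cannot be dereferenced by accident.
template <typename T, typename Tag>
class tagged_ptr {
public:
    constexpr explicit tagged_ptr(T *p) : p_(p) {}
    constexpr T *raw() const { return p_; }
private:
    T *p_;
};

template <typename T> using user_ptr = tagged_ptr<T, unverified_user>;
template <typename T> using checked_user_ptr = tagged_ptr<T, verified_user>;

// Outcome of a check: a success flag plus the re-tagged pointer.
template <typename T>
struct check_result {
    bool ok;
    checked_user_ptr<T> ptr;
};

// Stand-in for a real address-range check (assumption: null test only).
template <typename T>
constexpr check_result<T> check_user_ptr(user_ptr<T> p)
{
    return { p.raw() != nullptr, checked_user_ptr<T>(p.raw()) };
}

// Only accepts a pointer that has been through check_user_ptr().
template <typename T>
constexpr void read_user(T &dst, checked_user_ptr<const T> src)
{
    dst = *src.raw();
}

constexpr int demo_read(int value)
{
    int user_word = value;               // pretend this lives in user space
    user_ptr<const int> up(&user_word);  // arrives unverified
    // read_user(out, up);               // would not compile: wrong tag
    auto checked = check_user_ptr(up);
    int out = 0;
    if (checked.ok)
        read_user(out, checked.ptr);
    return out;
}

static_assert(demo_read(42) == 42, "verified read returns the value");
```

The wrapper is a single pointer with no virtual functions, so one would
expect it to compile down to the same code as the raw pointer; the tags
exist only in the type system. The same trick could presumably be
applied to endianness (a distinct type per byte order).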

Now, "why not Rust"? First of all, Rust uses a different (often, in my opinion, gratuitously so) syntax, and not only would all the kernel developers need to become intimately familiar with it to the level of getting the same kind of "feel" as we have for C, but converting C code to Rust isn't something that can be done piecemeal, whereas with some cleanups the existing C code can be compiled as C++.


Beyond the syntax, which I'm trying to force myself not to focus on, the
compatibility layers are turning out to be quite extensive. This is just
another way of saying that Rust is a deeply, completely different language.
Whereas C++ is closer to a dialect, as far as building and linking anyway.

However, I find that I disagree with some of David's conclusions; in fact I believe David is unnecessarily *pessimistic* at least given modern C++.

Note that no one in their sane mind would expect to use all the features of C++. Just like we have "kernel C" (currently a subset of C11 with a relatively large set of allowed compiler-specific extensions) we would have "kernel C++", which I would suggest to be a strictly defined subset of C++20 combined with a similar set of compiler extensions. I realize C++20 compiler support is still very new for obvious reasons, so at least some of this is forward looking.

There was an effort to address this, and I remember we even tried to use
it: Embedded C++ [2]. This is very simplistic and completely out of date
compared to what is being considered here, but it does show that many
others have had the same reaction: the language is so large that it
wants to be constrained. We actually wrote to Bjarne Stroustrup around
that time and asked about both embedded C++ and coding standards, and
his reaction was, "don't limit the language, just use education instead".

However, in my experience since then, that fails, and you need at least
coding standards. Because people will use *everything* they have available,
unless they can't. :)

Tentatively, coding standards are a better way forward, as opposed to
actually constraining the language (and maybe finding out later that
you wish it was left unconstrained), IMHO.


[1] https://link.springer.com/chapter/10.1007/978-1-4302-0003-1_63
[2] https://en.wikipedia.org/wiki/Embedded_C%2B%2B


thanks,
--
John Hubbard
NVIDIA