Re: [PATCH v2] Makefile.llvm: simplify LLVM build

From: Masahiro Yamada
Date: Sat Mar 28 2020 - 22:03:35 EST


On Sat, Mar 28, 2020 at 1:54 PM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
>
> On Sat, Mar 28, 2020 at 7:42 AM Nathan Chancellor
> <natechancellor@xxxxxxxxx> wrote:
> >
> > Sorry for the delay in review :(
> >
> > On Tue, Mar 17, 2020 at 02:55:15PM -0700, Nick Desaulniers wrote:
> > > Prior to this patch, building the Linux kernel with Clang
> > > looked like:
> > >
> > > $ make CC=clang
> > >
> > > or when cross compiling:
> > >
> > > $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make CC=clang
> > >
> > > which got very verbose and unwieldy when using all of LLVM's substitutes
> > > for GNU binutils:
> > >
> > > $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make CC=clang AS=clang \
> > > LD=ld.lld AR=llvm-ar NM=llvm-nm STRIP=llvm-strip \
> > > OBJCOPY=llvm-objcopy OBJDUMP=llvm-objdump OBJSIZE=llvm-objsize \
> > > READELF=llvm-readelf HOSTCC=clang HOSTCXX=clang++ HOSTAR=llvm-ar \
> > > HOSTLD=ld.lld
> > >
> > > This change adds a new Makefile under scripts/ which will be included in
> > > the top level Makefile AFTER CC and friends are set, in order to make
> > > the use of LLVM utilities when building a Linux kernel more ergonomic.
> > >
> > > With this patch, the above now looks like:
> > >
> > > $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=y
> > >
> > > Then you can "opt out" of certain LLVM utilities explicitly:
> > >
> > > $ ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- make LLVM=y AS=as
> > >
> > > will instead invoke arm-linux-gnueabihf-as in place of clang for AS.
> > >
> > > Or when not cross compiling:
> > >
> > > $ make LLVM=y AS=as
> > >
> > > This would make it more verbose to opt into just one tool from LLVM, but
> > > this patch doesn't actually break the old style; just leave off LLVM=y.
> > > Also, LLVM=y CC=clang would wind up prefixing clang with $CROSS_COMPILE.
> > > In that case, it's recommended to just drop LLVM=y and use the old
> > > style. So LLVM=y can be thought of as default to LLVM with explicit opt
> > > ins for GNU, vs the current case of default to GNU and opt in for LLVM.
> > >
> > > A key part of the design of this patch is to be minimally invasive to
> > > the top level Makefile and not break existing workflows. We could get
> > > more aggressive, but I'd prefer to save larger refactorings for another
> > > day.
> > >
> > > > Finally, some Linux distributions package specific versions of LLVM
> > > > utilities with naming conventions that use the version as a suffix, e.g.
> > > > clang-11. In that case, you can use LLVM=<number> and that number will
> > > > be used as a suffix. Example:
> > >
> > > $ make LLVM=11
> > >
> > > will invoke clang-11, ld.lld-11, llvm-objcopy-11, etc.
> > >
> > > About the script:
> > > The pattern used in the script is in the form:
> > >
> > > ifeq "$(origin $(CC))" "file"
> > > $(CC) := $(clang)
> > > else
> > > override $(CC) := $(CROSS_COMPILE)$(CC)
> > > endif
> > >
> > > "Metaprogramming" (eval) is used to template the above to make it more
> > > concise for specifying all of the substitutions.
> > >
> > > The "origin" of a variable tracks whether a variable was explicitly set
> > > via "command line", "environment", was defined earlier via Makefile
> > > "file", was provided by "default", or was "undefined".
> > >
> > > Variable assignment in GNU Make has some special and complicated rules.
> > >
> > > > If the variable was set earlier explicitly in the Makefile, we can
> > > > simply reassign a new value to it. If a variable was unspecified, then
> > > > earlier assignments were executed and changed the origin to "file".
> > > > Otherwise, the variable was explicitly specified.
> > >
> > > If a variable's "origin" was "command line" or "environment",
> > > non-"override" assignments are not executed. The "override" directive
> > > forces the assignment regardless of "origin".
> > >
> > > Some tips I found useful for debugging for future travelers:
> > >
> > > $(info $$origin of $$CC is $(origin CC))
> > >
> > > at the start of the new script for all of the variables can help you
> > > understand "default" vs "undefined" variable origins.
> > >
> > > $(info $$CC is [${CC}])
> > >
> > > in the top level Makefile after including the new script, for all of the
> > > variables can help you check that they're being set as expected.
> > >
> > > Link: https://www.gnu.org/software/make/manual/html_node/Eval-Function.html
> > > Link: https://www.gnu.org/software/make/manual/html_node/Origin-Function.html
> > > Link: https://www.gnu.org/software/make/manual/html_node/Implicit-Variables.html
> > > Link: https://www.gnu.org/software/make/manual/html_node/Override-Directive.html
> > > Suggested-by: Nathan Chancellor <natechancellor@xxxxxxxxx>
> > > Signed-off-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
> > > ---
> > > Changes V1 -> V2:
> > > * Rather than LLVM=1, use LLVM=y to enable all.
> > > * LLVM=<anything other than y> becomes a suffix, LLVM_SUFFIX.
> > > * strip has to be used on the LLVM_SUFFIX to avoid an extra whitespace.
> > >
> > >
> > > Makefile | 4 ++++
> > > scripts/Makefile.llvm | 30 ++++++++++++++++++++++++++++++
> > > 2 files changed, 34 insertions(+)
> > > create mode 100644 scripts/Makefile.llvm
> > >
> > > diff --git a/Makefile b/Makefile
> > > index 402f276da062..72ec9dfea15e 100644
> > > --- a/Makefile
> > > +++ b/Makefile
> > > @@ -475,6 +475,10 @@ KBUILD_LDFLAGS :=
> > > GCC_PLUGINS_CFLAGS :=
> > > CLANG_FLAGS :=
> > >
> > > +ifneq ($(LLVM),)
> > > +include scripts/Makefile.llvm
> > > +endif
> > > +
> > > export ARCH SRCARCH CONFIG_SHELL BASH HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE AS LD CC
> > > export CPP AR NM STRIP OBJCOPY OBJDUMP OBJSIZE READELF PAHOLE LEX YACC AWK INSTALLKERNEL
> > > export PERL PYTHON PYTHON3 CHECK CHECKFLAGS MAKE UTS_MACHINE HOSTCXX
> > > diff --git a/scripts/Makefile.llvm b/scripts/Makefile.llvm
> > > new file mode 100644
> > > index 000000000000..0bab45a100a3
> > > --- /dev/null
> > > +++ b/scripts/Makefile.llvm
> > > @@ -0,0 +1,30 @@
> > > +LLVM_SUFFIX=
> > > +
> > > +ifneq ($(LLVM),y)
> > > +LLVM_SUFFIX += -$(LLVM)
> > > +endif
> > > +
> > > +define META_set =
> > > +ifeq "$(origin $(1))" "file"
> > > +$(1) := $(2)$(strip $(LLVM_SUFFIX))
> > > +else
> > > +override $(1) := $(CROSS_COMPILE)$($(1))
> > > +endif
> > > +endef
> > > +
> > > +$(eval $(call META_set,CC,clang))
> > > +$(eval $(call META_set,AS,clang))
> > > +$(eval $(call META_set,LD,ld.lld))
> > > +$(eval $(call META_set,AR,llvm-ar))
> > > +$(eval $(call META_set,NM,llvm-nm))
> > > +$(eval $(call META_set,STRIP,llvm-strip))
> > > +$(eval $(call META_set,OBJCOPY,llvm-objcopy))
> > > +$(eval $(call META_set,OBJDUMP,llvm-objdump))
> > > +$(eval $(call META_set,OBJSIZE,llvm-objsize))
> > > +$(eval $(call META_set,READELF,llvm-readelf))
> > > +$(eval $(call META_set,HOSTCC,clang))
> > > +$(eval $(call META_set,HOSTCXX,clang++))
> > > +$(eval $(call META_set,HOSTAR,llvm-ar))
> > > +$(eval $(call META_set,HOSTLD,ld.lld))
> > > +
> > > +## TODO: HOSTAR, HOSTLD in tools/objtool/Makefile
> > > --


I had also planned to provide a single switch to change
all the tool defaults to LLVM.

So, supporting 'LLVM' is fine, but I would rather this
look symmetrical and be easy to understand.

CPP = $(CC) -E
ifneq ($(LLVM),)
CC = $(LLVM_DIR)clang
LD = $(LLVM_DIR)ld.lld
AR = $(LLVM_DIR)llvm-ar
NM = $(LLVM_DIR)llvm-nm
OBJCOPY = $(LLVM_DIR)llvm-objcopy
OBJDUMP = $(LLVM_DIR)llvm-objdump
READELF = $(LLVM_DIR)llvm-readelf
OBJSIZE = $(LLVM_DIR)llvm-size
STRIP = $(LLVM_DIR)llvm-strip
else
CC = $(CROSS_COMPILE)gcc
LD = $(CROSS_COMPILE)ld
AR = $(CROSS_COMPILE)ar
NM = $(CROSS_COMPILE)nm
OBJCOPY = $(CROSS_COMPILE)objcopy
OBJDUMP = $(CROSS_COMPILE)objdump
READELF = $(CROSS_COMPILE)readelf
OBJSIZE = $(CROSS_COMPILE)size
STRIP = $(CROSS_COMPILE)strip
endif
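
To illustrate how per-tool overrides would still work with this layout,
here is a minimal standalone sketch. It is not part of either patch; the
file name sketch.mk and the trimmed two-tool list are only for
illustration.

```make
# sketch.mk: standalone illustration, not part of the kernel tree.
# Only CC and LD are shown; the real list is longer.
ifneq ($(LLVM),)
CC = $(LLVM_DIR)clang
LD = $(LLVM_DIR)ld.lld
else
CC = $(CROSS_COMPILE)gcc
LD = $(CROSS_COMPILE)ld
endif

# Print the resolved tools while the makefile is being parsed.
$(info CC=$(CC) LD=$(LD))

# Empty rule so that make has a default goal.
all: ;
```

Running 'make -f sketch.mk LLVM=1 LD=ld.bfd' should report CC=clang and
LD=ld.bfd, because a command-line assignment takes precedence over an
ordinary assignment in the makefile. No 'override' or $(origin ...)
check is needed for that.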



I attached two patches.
Comments appreciated.

--
Best Regards
Masahiro Yamada
From f3a620abee5e26dda04f7a747a54ab075f45b70d Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <masahiroy@xxxxxxxxxx>
Date: Sat, 28 Mar 2020 18:57:25 +0900
Subject: [PATCH 2/2] kbuild: change the default of HOST{CC,CXX} to cc and c++

I have been thinking about how to cater to those who want to build host
programs with Clang.

We could use the variable 'LLVM' to switch the default of HOST{CC,CXX}:

ifneq ($(LLVM),)
HOSTCC = clang
HOSTCXX = clang++
else
HOSTCC = gcc
HOSTCXX = g++
endif

But, I do not know how many people care about this. There is no tricky
code in userspace programs, and we know we can compile them with GCC,
Clang, or whatever. No matter what you use for host programs, it makes
no difference to the kernel itself.

If you are that committed to LLVM, or if you are using a system with no GCC
installation, you probably already have 'cc' and 'c++' pointing to Clang.

So, another approach is to just leave this up to the system. You can
manually set up symlinks, or maybe your distro provides 'alternatives'.

$ update-alternatives --list cc
/usr/bin/clang
/usr/bin/gcc
$ update-alternatives --list c++
/usr/bin/clang++
/usr/bin/g++

I have no idea what to do for tools/objtool/Makefile, which uses HOSTAR
and HOSTLD, but this is because objtool intentionally opts out of Kbuild.
If objtool is willing to join the Kbuild infrastructure, a patch exists:

https://patchwork.kernel.org/patch/10839051/

Signed-off-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
---
Makefile | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index f192f9bd8343..ecdd34ad0750 100644
--- a/Makefile
+++ b/Makefile
@@ -399,8 +399,8 @@ HOST_LFS_CFLAGS := $(shell getconf LFS_CFLAGS 2>/dev/null)
HOST_LFS_LDFLAGS := $(shell getconf LFS_LDFLAGS 2>/dev/null)
HOST_LFS_LIBS := $(shell getconf LFS_LIBS 2>/dev/null)

-HOSTCC = gcc
-HOSTCXX = g++
+HOSTCC = cc
+HOSTCXX = c++
KBUILD_HOSTCFLAGS := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 \
-fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS) \
$(HOSTCFLAGS)
--
2.17.1

From 6620f13807b466c5e4af08b2d5d33f5a433b1e3f Mon Sep 17 00:00:00 2001
From: Masahiro Yamada <masahiroy@xxxxxxxxxx>
Date: Sat, 28 Mar 2020 15:54:47 +0900
Subject: [PATCH 1/2] kbuild: support 'LLVM' to switch the default tools to
Clang/LLVM

As Documentation/kbuild/llvm.rst implies, building the kernel with a
full set of LLVM tools gets very verbose and unwieldy.

Provide a single switch 'LLVM' to use Clang and LLVM tools instead of
GCC and Binutils. You can pass LLVM=1 from the command line or as an
environment variable. Then, Kbuild will use the LLVM tools found in
your PATH. This may not be convenient if you have multiple versions of
LLVM installed.

CROSS_COMPILE is used to specify not only the tool prefix, but also
the directory path to the tools. For example,

$ make ARCH=arm64 CROSS_COMPILE=/path/to/my/gcc/bin/aarch64-linux-gnu-

To support a similar flow, this commit adds another variable, LLVM_DIR,
to point to the specific installation of LLVM:

$ make LLVM=1 LLVM_DIR=/path/to/my/llvm/bin/

It might be tedious to set two variables. So, the following is the
shorthand:

$ make LLVM=/path/to/my/llvm/bin/
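
A minimal standalone sketch of how the three forms above resolve, using
the same filter-out test as this commit. The file name llvm-dir.mk is
hypothetical and only CC is shown:

```make
# llvm-dir.mk: standalone illustration only, not part of the kernel tree.
# Any LLVM value other than the literal 1 is treated as a directory prefix.
ifneq ($(filter-out 1,$(LLVM)),)
LLVM_DIR := $(LLVM)
endif

ifneq ($(LLVM),)
CC = $(LLVM_DIR)clang
else
CC = $(CROSS_COMPILE)gcc
endif

# Print the resolved compiler while the makefile is being parsed.
$(info CC=$(CC))

# Empty rule so that make has a default goal.
all: ;
```

With 'make -f llvm-dir.mk LLVM=1' this should report CC=clang, while
LLVM=/path/to/my/llvm/bin/ should report CC=/path/to/my/llvm/bin/clang.
The value is prepended verbatim, so the trailing slash matters.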

Please note LLVM=1 does not turn on the LLVM integrated assembler.
You need to pass AS=clang to use it. When the upstream kernel is
ready for the integrated assembler, we can make it the default. We will
then get rid of --no-integrated-as, and CROSS_COMPILE will no longer be
needed. The --target option will be specified by other means.

Signed-off-by: Masahiro Yamada <masahiroy@xxxxxxxxxx>
---
Documentation/kbuild/kbuild.rst | 6 ++++++
Documentation/kbuild/llvm.rst | 5 +++++
Makefile | 29 +++++++++++++++++++++++++----
3 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/Documentation/kbuild/kbuild.rst b/Documentation/kbuild/kbuild.rst
index f1e5dce86af7..39bb3636e4f4 100644
--- a/Documentation/kbuild/kbuild.rst
+++ b/Documentation/kbuild/kbuild.rst
@@ -262,3 +262,9 @@ KBUILD_BUILD_USER, KBUILD_BUILD_HOST
These two variables allow to override the user@host string displayed during
boot and in /proc/version. The default value is the output of the commands
whoami and host, respectively.
+
+LLVM
+----
+If this variable is set to 1, Kbuild will use Clang and LLVM utilities instead
+of GCC and GNU binutils to build the kernel.
+If set to a value other than 1, it is used as the directory path to the LLVM tools.
diff --git a/Documentation/kbuild/llvm.rst b/Documentation/kbuild/llvm.rst
index d6c79eb4e23e..4602369f6a4f 100644
--- a/Documentation/kbuild/llvm.rst
+++ b/Documentation/kbuild/llvm.rst
@@ -55,6 +55,11 @@ additional parameters to `make`.
READELF=llvm-readelf HOSTCC=clang HOSTCXX=clang++ HOSTAR=llvm-ar \\
HOSTLD=ld.lld

+You can use a single switch `LLVM=1` to use LLVM utilities by default (except
+for building host programs).
+
+ make LLVM=1 HOSTCC=clang HOSTCXX=clang++ HOSTAR=llvm-ar HOSTLD=ld.lld
+
Getting Help
------------

diff --git a/Makefile b/Makefile
index a3bc8bc562ee..f192f9bd8343 100644
--- a/Makefile
+++ b/Makefile
@@ -408,17 +408,38 @@ KBUILD_HOSTCXXFLAGS := -O2 $(HOST_LFS_CFLAGS) $(HOSTCXXFLAGS)
KBUILD_HOSTLDFLAGS := $(HOST_LFS_LDFLAGS) $(HOSTLDFLAGS)
KBUILD_HOSTLDLIBS := $(HOST_LFS_LIBS) $(HOSTLDLIBS)

+# LLVM=1 tells Kbuild to use Clang and LLVM utilities by default.
+# You can still override CC, LD, etc. individually if desired.
+#
+# If LLVM is set to a value other than 1, that value is assigned to LLVM_DIR,
+# which is useful to select a specific LLVM installation.
+ifneq ($(filter-out 1,$(LLVM)),)
+LLVM_DIR := $(LLVM)
+endif
+
# Make variables (CC, etc...)
-LD = $(CROSS_COMPILE)ld
-CC = $(CROSS_COMPILE)gcc
CPP = $(CC) -E
+ifneq ($(LLVM),)
+CC = $(LLVM_DIR)clang
+LD = $(LLVM_DIR)ld.lld
+AR = $(LLVM_DIR)llvm-ar
+NM = $(LLVM_DIR)llvm-nm
+OBJCOPY = $(LLVM_DIR)llvm-objcopy
+OBJDUMP = $(LLVM_DIR)llvm-objdump
+READELF = $(LLVM_DIR)llvm-readelf
+OBJSIZE = $(LLVM_DIR)llvm-size
+STRIP = $(LLVM_DIR)llvm-strip
+else
+CC = $(CROSS_COMPILE)gcc
+LD = $(CROSS_COMPILE)ld
AR = $(CROSS_COMPILE)ar
NM = $(CROSS_COMPILE)nm
-STRIP = $(CROSS_COMPILE)strip
OBJCOPY = $(CROSS_COMPILE)objcopy
OBJDUMP = $(CROSS_COMPILE)objdump
-OBJSIZE = $(CROSS_COMPILE)size
READELF = $(CROSS_COMPILE)readelf
+OBJSIZE = $(CROSS_COMPILE)size
+STRIP = $(CROSS_COMPILE)strip
+endif
PAHOLE = pahole
LEX = flex
YACC = bison
--
2.17.1