Re: nolibc patches, still possible for 6.5 ?

From: Paul E. McKenney
Date: Wed Jun 07 2023 - 18:58:08 EST


On Wed, Jun 07, 2023 at 11:19:43PM +0200, Willy Tarreau wrote:
> Hello Paul,
>
> On Sun, Jun 04, 2023 at 03:57:54PM -0700, Paul E. McKenney wrote:
> > On Sun, Jun 04, 2023 at 03:20:11PM +0200, Willy Tarreau wrote:
> > > Hello Paul,
> > >
> > > Thomas and Zhangjin have provided significant nolibc cleanups, and
> > > fixes, as well as preparation work to later support riscv32.
> > >
> > > These consist in the following main series:
> > > - generalization of stackprotector to other archs that were not
> > > previously supported (riscv, mips, loongarch, arm, arm64)
> > >
> > > - general cleanups of the makefile, test report output, deduplication
> > > of certain tests
> > >
> > > - slightly better compliance of some tests performed on certain syscalls
> > > (e.g. no longer pass (void*)1 to gettimeofday() since glibc hates it).
> > >
> > > - add support for nanoseconds in stat() and statx()
> > >
> > > - fixes for some syscalls (e.g. ppoll() has 5 arguments not 4)
> > >
> > > - fixes around limits.h and INT_MAX / INT_FAST64_MAX
> > >
> > > I rebased the whole series on top of your latest dev branch (d19a9ca3d5)
> > > and it works fine for all archs.
> > >
> > > I don't know if you're still planning on merging new stuff in this area
> > > for 6.5 or not (since I know that it involves new series of tests on your
> > > side as well), but given that Zhangjin will engage into deeper changes
> > > later for riscv32 that will likely imply to update more syscalls to use
> > > the time64 ones, I would prefer to split the cleanups from the hard stuff,
> > > but I'll let you judge based on the current state of what's pending for
> > > 6.5.
> > >
> > > In any case I'm putting all this here for now (not for merge yet):
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git 20230604-nolibc-rv32+stkp6
> > >
> > > I'd like Thomas and Zhangjin to perform a last check to confirm they're
> > > OK with this final integration.
> >
> > Given that the testing converges by the end of this week, I can't see
> > any reason why these cannot make v6.5. (There were some kernel test
> > robot complaints as well, valid or not I am not sure.)
>
> After Thomas' and Zhangjin's reviews and checks, I could run a mostly
> complete check:
> - arm64, i386, x86_64 show 100% success
> - arm, mips: 100% success, stackprotector skipped
> - s390x, riscv64: run-user OK, kernel build fails (see below)
> - loongarch: build OK, just not executed (need to upgrade my qemu
> and I hate doing it late when some tests results are needed)

Very good!

> Regarding the build failure affecting s390x and riscv64, it's a regular
> kernel resulting from "make defconfig". For both archs, I'm getting this
> failure:
>
> In file included from kernel/rcu/update.c:649:
> kernel/rcu/tasks.h: In function 'get_rcu_tasks_gp_kthread':
> CC fs/kernfs/dir.o
> CC security/bpf/hooks.o
> kernel/rcu/tasks.h:1939:16: error: 'rcu_tasks' undeclared (first use in this function)
> 1939 | return rcu_tasks.kthread_ptr;
> | ^~~~~~~~~
> kernel/rcu/tasks.h:1939:16: note: each undeclared identifier is reported only once for each function it appears in
> kernel/rcu/tasks.h:1940:1: error: control reaches end of non-void function [-Werror=return-type]
> 1940 | }
> | ^
> cc1: some warnings being treated as errors
>
> I rebased the branch on top of 6.4-rc5 and got the same. I'm building
> with gcc-11.3.0 from kernel.org. I'm not sure whether this comes from
> my build environment or recent changes to the kernel, but I'm sure I
> haven't seen that error during 6.3-rc cycle. However, given that
> Zhangjin seems to have successfully built it for riscv, there might
> be something odd on my side.

That line of code is in rcu/dev but not in mainline yet. In fact, it
is not yet in -next.

But it is a bug. One that my Kconfig laziness hid from me. Easy fix,
but it is clearly time for me to stop being lazy about that part of the
Kconfig setup. :-/
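
For anyone following along, this class of failure can be sketched in
plain C. It is only an illustration under stated assumptions, not the
actual kernel code and not the actual fix: the macro CONFIG_TASKS_RCU
is assumed to be the Kconfig option involved, struct rcu_tasks_sketch
is a hypothetical stand-in for the real structure, and only the names
rcu_tasks, kthread_ptr, and get_rcu_tasks_gp_kthread are taken from the
error output quoted above. The accessor sits under the same guard as
the variable it reads, with a stub for the other case, so the file
builds either way:

```c
#include <stddef.h>

/*
 * Assumption: stand-in for a Kconfig-generated macro. Remove this
 * line to mimic a configuration that leaves the option disabled.
 */
#define CONFIG_TASKS_RCU 1

struct task_struct;			/* opaque here, as in most callers */

/* Hypothetical, heavily reduced stand-in for the real structure. */
struct rcu_tasks_sketch {
	struct task_struct *kthread_ptr;
};

#ifdef CONFIG_TASKS_RCU
/* The variable only exists when the option is enabled. */
static struct rcu_tasks_sketch rcu_tasks;

/* Accessor guarded by the same option as the variable it reads. */
struct task_struct *get_rcu_tasks_gp_kthread(void)
{
	return rcu_tasks.kthread_ptr;
}
#else
/* Stub so that callers still build when the option is disabled. */
struct task_struct *get_rcu_tasks_gp_kthread(void)
{
	return NULL;
}
#endif
```

With the #define removed, the stub is compiled instead and callers get
NULL. Had the accessor been placed outside the #ifdef, the compiler
would complain that 'rcu_tasks' is undeclared, much as in the log above.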

So thank you for reporting it!

> Given that this build issue is not dependent on the selftest, I'm fine
> with the branch getting merged as-is, and can provide feedback on this
> build error if needed:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/nolibc.git 20230606-nolibc-rv32+stkp7a
>
> Just let me know if you prefer that I resend the whole series or need
> more info etc, as usual.

I will pull it from your tree, test it, and if all goes well, rebase it
on my existing nolibc stack.

Longer term, both to avoid you having to deal with RCU bugs and to make
it easier to have multiple administrative nolibc maintainers, it might
work better for you to base your stack on vX.y-rc1. That way, I could
just pull directly from your tree.

This works because you buffer up the commits and test them, which
makes it completely reasonable for me to simply pull your new stack
and merge it in. It also means that if there are multiple nolibc
administrative maintainers, we have exactly the same set of nolibc
commits in our respective trees, right down to the SHA-1 hashes.

This approach is used a lot. For example, back when my RCU patches
went through Ingo Molnar, he pulled from my tree so that mainline's RCU
patches were identical to mine, again right down to the SHA-1 hashes.

This is something to think about for some upcoming cycle, given that
we are already pretty much set up for the upcoming merge window.

Your choice, either way works for me.

Thanx, Paul