Re: [PATCH v2 04/12] tools/nolibc: crt.h: add _start_c

From: Zhangjin Wu
Date: Mon Jul 10 2023 - 05:27:07 EST


Hi, Thomas

>
> On 2023-07-08 23:29:58+0800, Zhangjin Wu wrote:
> > As the environ and _auxv support added for nolibc, the assembly _start
> > function becomes more and more complex and therefore makes the porting
> > of nolibc to new architectures harder and harder.
> >
> > To simplify portability, this c version of _start_c() is added to do
> > most of the assembly start operations in C, which reduces the complexity
> > a lot and will eventually simplify the porting of nolibc to the new
> > architectures.
> >
> > The new _start_c() only requires a stack pointer argument, it will find
> > argv, envp and _auxv for us, and then call main(), finally, it exit()
> > with main's return status. With this new _start_c(), the future new
> > architectures only require to add very few assembly instructions.
>
> I like it!
>
> A quick test indicates that the initialization of the stackprotectors
> could also be moved into the C function.
>

Cool, do you mean directly call __stack_chk_init() at the beginning of
_start_c()?

> It also seems like a good opportunity to add some tests for
> argv/environment variable passing.

Yes, and even further, we can do more on auxv, just like musl does in
src/env/__libc_start_main.c, not that urgent currently:

libc.auxv = auxv = (void *)(envp+i+1);
...
__hwcap = aux[AT_HWCAP];
if (aux[AT_SYSINFO]) __sysinfo = aux[AT_SYSINFO];
...
libc.page_size = aux[AT_PAGESZ];

if (!pn) pn = (void*)aux[AT_EXECFN];
if (!pn) pn = "";
__progname = __progname_full = pn;
for (i=0; pn[i]; i++) if (pn[i]=='/') __progname = pn+i+1;

__init_tls(aux);
__init_ssp((void *)aux[AT_RANDOM]);

if (aux[AT_UID]==aux[AT_EUID] && aux[AT_GID]==aux[AT_EGID]
&& !aux[AT_SECURE]) return;

...
libc.secure = 1;

>
> And as general note to the full series I think that splitting the arch
> files is not necessary and confusing.
>

Ok, welcome to discuss more in this thread:

https://lore.kernel.org/lkml/20230710072340.10798-1-falcon@xxxxxxxxxxx/

and let's choose a better method as possible as we can, Just replied Willy to
explain more.

> > Signed-off-by: Zhangjin Wu <falcon@xxxxxxxxxxx>
> > ---
> > tools/include/nolibc/crt.h | 44 ++++++++++++++++++++++++++++++++++++++
> > 1 file changed, 44 insertions(+)
> >
> > diff --git a/tools/include/nolibc/crt.h b/tools/include/nolibc/crt.h
> > index 221b7c5346ca..b269294e9664 100644
> > --- a/tools/include/nolibc/crt.h
> > +++ b/tools/include/nolibc/crt.h
> > @@ -13,4 +13,48 @@
> > char **environ __attribute__((weak));
>
> The old code seems to avoid putting "environ" into the global symbol
> namespace. Could this declaration be moved into the function like in
> getenv()?
>

ok, do you mean just move it to stdlib.h like this? I moved _auxv (used
by getauxv()) to stdlib.h too:

tools/nolibc: move environ and _auxv from crt.h to stdlib.h

Move the definitions of environ and _auxv from crt.h to stdlib.h, where
the place who uses those definitions.

- getenv uses environ
- getauxv uses _auxcv

Signed-off-by: Zhangjin Wu <falcon@xxxxxxxxxxx>

diff --git a/tools/include/nolibc/crt.h b/tools/include/nolibc/crt.h
index b269294e9664..d2f84cbe73d0 100644
--- a/tools/include/nolibc/crt.h
+++ b/tools/include/nolibc/crt.h
@@ -10,14 +10,13 @@
#include "compiler.h"
#include "crt_arch.h"

-char **environ __attribute__((weak));
-const unsigned long *_auxv __attribute__((weak));
-
int main(int argc, char *argv[], char **envp);
static void exit(int);

void _start_c(long *sp)
{
+ extern char **environ;
+ extern const unsigned long *_auxv;
int argc, i;
char **argv;
char **envp;
diff --git a/tools/include/nolibc/stdlib.h b/tools/include/nolibc/stdlib.h
index 2f9b4b3c6d26..5eadadc2d0f5 100644
--- a/tools/include/nolibc/stdlib.h
+++ b/tools/include/nolibc/stdlib.h
@@ -14,6 +14,9 @@
#include "string.h"
#include <linux/auxvec.h>

+char **environ __attribute__((weak));
+const unsigned long *_auxv __attribute__((weak));
+
struct nolibc_heap {
size_t len;
char user_p[] __attribute__((__aligned__));

> > const unsigned long *_auxv __attribute__((weak));
> >
> > +int main(int argc, char *argv[], char **envp);
>
> This will lead to conflicting declarations if the users use a different
> signature. I'm not (yet?) sure how to work around this.
>

Ah yes, I forgot this critical case, people may use something like:

int main(void)
int main(int argc, char *argv[])

Just applied the method suggested by you in anothe reply [1]:

diff --git a/tools/include/nolibc/crt.h b/tools/include/nolibc/crt.h
index d2f84cbe73d0..8fe38ef8c5b2 100644
--- a/tools/include/nolibc/crt.h
+++ b/tools/include/nolibc/crt.h
@@ -10,7 +10,7 @@
#include "compiler.h"
#include "crt_arch.h"

-int main(int argc, char *argv[], char **envp);
+typedef int (_nolibc_main_fn)(int, char **, char **);
static void exit(int);

void _start_c(long *sp)
@@ -20,6 +20,7 @@ void _start_c(long *sp)
int argc, i;
char **argv;
char **envp;
+ _nolibc_main_fn _nolibc_main __asm__ ("main");

/*
* sp : argc <-- argument count, required by main()
@@ -53,7 +54,7 @@ void _start_c(long *sp)
_auxv = (void *)(envp + i + 1);

/* go to application */
- exit(main(argc, argv, envp));
+ exit(_nolibc_main(argc, argv, envp));
}

#endif /* _NOLIBC_CRT_H */

It works as expected, thanks very much!

> Also how is the case handled where main() returns "void"?

Do you mean the main() without an explicit "return 0"?

int main(int argc, char *argv[], char **envp)
{
printf("Hello\n");
}

Tested c89, c99, c2x, c23, gnu11, the same result as yours:

std | return value
----------|------------
c89 | 6 /* return value of printf() */
c99 | 0 /* set by compiler with: xorl %eax, %eax */
c2x | 0
gnu11 | 0

> I'm not sure how this is currently handled or if the compiler takes care
> of returning 0 in this case.

void main(void)
{
printf("Hello\n");
}

Tested c89, c99, c2x, c23, gnu11, the same result as yours, the return
value is 6, the return value is the return status of the last called
functon.

Thanks,
Zhangjin
---
[1]: https://lore.kernel.org/lkml/90fdd255-32f4-4caf-90ff-06456b53dac3@xxxxxxxx/