[RFC] tools/nolibc: replace duplicated -ENOSYS return with single -ENOSYS return

From: Zhangjin Wu
Date: Sun Aug 27 2023 - 04:34:03 EST


Hi, Willy

Since the size inflation regression task [1] is finished, I'm starting a new
thread, split from [2], to share and discuss progress on the -ENOSYS return
work.

[1]: https://lore.kernel.org/lkml/ZNtszQeigYuItaKA@xxxxxx/
[2]: https://lore.kernel.org/lkml/20230814172233.225944-1-falcon@xxxxxxxxxxx/#R

This is only for brainstorming; it is far from a solution ;-)

>
> > [...]
> > > >
> > > > /* __systry2() is used to select one of two provided low level syscalls */
> > > > #define __systry2(a, sys_a, sys_b) \
> > > > ((NOLIBC__NR_##a != NOLIBC__NR_NOSYS) ? (sys_a) : (sys_b))
> > >
> > > But this supposes that all of them are manually defined as you did above.
> > > I'd rather implement an ugly is_numeric() macro based on argument
> > > resolution. I've done it once in another project, I don't remember
> > > precisely where it is but I vaguely remember that it used to check
> > > that the string resolution of the argument gave a letter (when it
> > > does not exist) or a digit (when it does). I can look into that later
> > > if needed. But please avoid extra macro definitions as much as possible,
> > > they're a real pain to handle in the code. There's no error when one is
> > > missing or has a typo, it's difficult to follow them and they don't
> > > appear in the debugger.
> > >
> >
> > Yeah, your reply inspired me to look into the IS_ENABLED() from
> > ../include/linux/kconfig.h macro again, there was a __is_defined() there, let's
> > throw away the ugly sysnr.h. I thought of IS_ENABLED() was only for y/n/m
> > before, but it does return 0 when the macro is not defined, it uses the same
> > trick in syscall() to calculate the number of arguments, if the macro is not
> > defined, then, 0 "argument".
> >
>
> The above trick is only for "#define something 1" ;-)
>

Here is a little progress on this: I have found it is easy to implement an
ugly is_numeric()-like macro, as follows:

/* Imported from include/linux/stringify.h */
#define __stringify_1(x...) #x
#define __stringify(x...) __stringify_1(x)

/*
 * Check whether __NR_* is defined by stringizing it
 *
 * - Stringizing silences the compile error about an undefined macro
 * - If defined, the result looks like "3" or "(4000 + 168)", so it does
 *   not begin with '_'
 * - If not defined, the result looks like "__NR_read", so it begins with '_'
 */

#define __is_nr_defined(nr) ___is_nr_defined(__stringify(nr))
#define ___is_nr_defined(str) (str[0] != '_')
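
To make the three cases from the comment concrete, here is a minimal sketch.
The __NR_demo_* names and the demo_*() helpers are hypothetical and exist only
for illustration; the "x..." macro syntax is a GNU extension, so this assumes
gcc or clang.

```c
#include <assert.h>

/* Helpers copied from the mail; "x..." is a GNU extension (gcc, clang) */
#define __stringify_1(x...) #x
#define __stringify(x...) __stringify_1(x)
#define __is_nr_defined(nr) ___is_nr_defined(__stringify(nr))
#define ___is_nr_defined(str) (str[0] != '_')

/* Hypothetical numbers for illustration only, not real syscalls */
#define __NR_demo_plain 3
#define __NR_demo_base 4000
#define __NR_demo_offset (__NR_demo_base + 168)
/* __NR_demo_missing is deliberately left undefined */

/* Stringizes to "3": first character is a digit */
int demo_plain_defined(void)
{
	return __is_nr_defined(__NR_demo_plain);
}

/* Stringizes to "(4000 + 168)": first character is '(' */
int demo_offset_defined(void)
{
	return __is_nr_defined(__NR_demo_offset);
}

/* Stringizes to "__NR_demo_missing": first character is '_' */
int demo_missing_defined(void)
{
	return __is_nr_defined(__NR_demo_missing);
}

/* Small self-check covering the three cases */
void demo_selfcheck(void)
{
	assert(demo_plain_defined());
	assert(demo_offset_defined());
	assert(!demo_missing_defined());
}
```

The undefined name never reaches the compiler as an identifier; it only
appears inside a string literal, which is why no error is raised.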

__is_nr_defined() can check whether __NR_xxx is defined, but the harder part
is getting the value of a defined __NR_* without triggering the error about
an undefined macro for the others.

Of course, the __stringify() trick can do this too, but converting the string
back into a number is expensive (bigger size, worse performance). atoi()
'works' for plain numeric __NR_* values, but not for definitions of the form
(__NR_*_base + offset), as used by ARM and MIPS; those need a simple
expression interpreter, which is even more expensive than atoi().

/* not for ARM and MIPS */

static int atoi(const char *s);
#define __get_nr(name) __nr_atoi(__stringify(__NR_##name))
#define __nr_atoi(str) (str[0] == '_' ? -1L : ___nr_atoi(str))
#define ___nr_atoi(str) (str[0] == '(' ? -1L : atoi(str))
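
A minimal sketch of how these macros behave, assuming the libc atoi() from
<stdlib.h> instead of the nolibc one, and using hypothetical __NR_demo_*
names that are not real syscalls:

```c
#include <stdlib.h>

/* Helpers copied from the mail; "x..." is a GNU extension (gcc, clang) */
#define __stringify_1(x...) #x
#define __stringify(x...) __stringify_1(x)
#define __get_nr(name) __nr_atoi(__stringify(__NR_##name))
#define __nr_atoi(str) (str[0] == '_' ? -1L : ___nr_atoi(str))
#define ___nr_atoi(str) (str[0] == '(' ? -1L : atoi(str))

/* Hypothetical numbers for illustration only, not real syscalls */
#define __NR_demo_plain 3
#define __NR_demo_base 4000
#define __NR_demo_offset (__NR_demo_base + 168)
/* __NR_demo_missing is deliberately left undefined */

/* "3" is converted back to the number 3 by atoi() */
long demo_get_plain(void)
{
	return __get_nr(demo_plain);
}

/* "(4000 + 168)" starts with '(', so it is rejected with -1 */
long demo_get_offset(void)
{
	return __get_nr(demo_offset);
}

/* "__NR_demo_missing" starts with '_', so it is rejected with -1 */
long demo_get_missing(void)
{
	return __get_nr(demo_missing);
}
```

The offset case is the ARM/MIPS problem in miniature: the number is defined,
yet this approach cannot recover it and reports -1 as if it were missing.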

More discussion is welcome, or let's simply throw away this direction ;-)

But it may really help us to drop tons of duplicated code pieces like this:

#ifdef __NR_xxxx
...
#else
return -ENOSYS;
#endif

David, Thomas and Arnd, any inspiration on this, or is this really impossible
(or would it make things worse) at the language level? ;-)

What I'm thinking about is something like the following (as Willy commented
before, __sysdef() itself is not that good, so please ignore it; the core
goal here is a single -ENOSYS return for all of the undefined branches):

#define __sysdef(name, ...) \
(__is_nr_defined(__NR_##name) ? my_syscall(__get_nr(name), ##__VA_ARGS__) : (long)-ENOSYS)
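
To show the single -ENOSYS path end to end, here is a rough sketch. It
assumes gcc or clang (for "x..." and ##__VA_ARGS__) and the libc atoi().
my_syscall() is a hypothetical stub that only echoes the syscall number, and
the __NR_demo_* names and sys_demo_*() wrappers are made up for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Helpers copied from the mail; "x..." is a GNU extension (gcc, clang) */
#define __stringify_1(x...) #x
#define __stringify(x...) __stringify_1(x)
#define __is_nr_defined(nr) ___is_nr_defined(__stringify(nr))
#define ___is_nr_defined(str) (str[0] != '_')
#define __get_nr(name) __nr_atoi(__stringify(__NR_##name))
#define __nr_atoi(str) (str[0] == '_' ? -1L : ___nr_atoi(str))
#define ___nr_atoi(str) (str[0] == '(' ? -1L : atoi(str))

/* Hypothetical stand-in for the real syscall entry: echoes the number */
static long my_syscall(long nr, ...)
{
	return nr;
}

#define __sysdef(name, ...) \
	(__is_nr_defined(__NR_##name) ? my_syscall(__get_nr(name), ##__VA_ARGS__) : (long)-ENOSYS)

/* Hypothetical number for illustration only, not a real syscall */
#define __NR_demo_plain 3
/* __NR_demo_missing is deliberately left undefined */

/* Defined number: takes the my_syscall() branch */
long sys_demo_plain(void)
{
	return __sysdef(demo_plain);
}

/* Undefined number: takes the single -ENOSYS branch, no #ifdef needed */
long sys_demo_missing(int fd)
{
	return __sysdef(demo_missing, fd);
}
```

Both wrappers compile without any #ifdef; the undefined one simply evaluates
to -ENOSYS. The cost discussed above remains: the number goes through atoi(),
and the (base + offset) form would still need extra handling.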

Or, as Arnd suggested in an older email thread, perhaps the whole #ifdef code
piece above (and even the input and return types of sys_*) could be generated
automatically from the .tbl files or the generic unistd.h at the sysroot
installation stage?

BR,
Zhangjin