Re: [PATCH v5] tools/nolibc: fix up size inflate regression

From: Zhangjin Wu
Date: Mon Aug 14 2023 - 06:43:54 EST

Next message: Jan Kara: "Re: [PATCH] ext4: fix slab-use-after-free in ext4_es_insert_extent()"
Previous message: David Laight: "RE: [PATCH] printk: ringbuffer: Fix truncating buffer size min_t cast"
In reply to: Willy Tarreau: "Re: [PATCH v5] tools/nolibc: fix up size inflate regression"
Next in thread: David Laight: "RE: [PATCH v5] tools/nolibc: fix up size inflate regression"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi, Willy

> On Sun, Aug 13, 2023 at 09:26:20PM +0800, Zhangjin Wu wrote:
> [...]
> > With this exception, s390 no long need to provide its own mmap
> > definition, it (seems i386 too, but it uses mmap2 currently) can simply
> > define '__ARCH_WANT_SYS_OLD_MMAP' as the '__ARCH_WANT_SYS_OLD_SELECT' we
> > are using for old_select.
> >
> > The same method applies to the selection of the different backward
> > version of the sys_clone() syscall (from kernel/fork.c):
> (...)
>
> > #ifdef __NR_clone
> > #undef sys_clone
> > #define __sys_clone(...) __sysdef(clone, __VA_ARGS__)
> >
> > static __attribute__((unused))
> > int sys_clone(unsigned long clone_flags, unsigned long newsp,
> > int __attribute__((unused)) stack_size,
> > int parent_tidptr, int child_tidptr, unsigned long tls)
> > {
> > long ret;
> > #ifdef __ARCH_WANT_SYS_CLONE_BACKWARDS
> > ret = __sys_clone(clone_flags, newsp, parent_tidptr, tls, child_tidptr);
> > #elif defined(__ARCH_WANT_SYS_CLONE_BACKWARDS2)
> > ret = __sys_clone(newsp, clone_flags, parent_tidptr, child_tidptr, tls);
> > #elif defined(__ARCH_WANT_SYS_CLONE_BACKWARDS3)
> > ret = __sys_clone(clone_flags, newsp, stack_size, parent_tidptr, child_tidptr, tls);
> > #else
> > ret = __sys_clone(clone_flags, newsp, parent_tidptr, child_tidptr, tls);
> > #endif
> > return ret;
> > }
> > #endif /* __NR_clone */
> >
> > s390 only requires to define '__ARCH_WANT_SYS_CLONE_BACKWARDS2', no need
> > to provide its own sys_fork() version, in the __NR_clone branch of
> > fork(), __ARCH_WANT_SYS_CLONE_BACKWARDS2 can directly select the right
> > version of sys_clone() for s390).
>
> Maybe but with much less #define indirections it would be significantly
> better.
>
> (...)
> > We only have these three exceptions currently, with this normalization,
> > the library routines from sys.h can directly think sys_* macros are
> > generic, if not, let syscall.h take care of the right exceptions.
>
> I see the point. But that doesn't remove the need to write the exported
> function itself. I'm not saying there's nothing to save here, I see your
> point, I'm just wondering if we really gain something in terms of ease
> of declaring new syscalls especially for first-time contributors and if
> we're not losing in maintenance. If at least it's easy enough to implement
> exceptions, maybe it could be worth further investigating.
>

I will delay the whole work about __sysdef(), but work on some more generic
parts (like the exceptions above) at first.

> > > > static __attribute__((unused))
> > > > int dup2(int old, int new)
> > > > {
> > > > int ret = sys_dup3(old, new, 0);
> > > >
> > > > if (ret == -ENOSYS)
> > > > ret = sys_dup2(old, new);
> > > >
> > > > return __sysret(ret);
> > > > }
> > >
> > > But this will add a useless test after all such syscalls, we'd rather
> > > not do that!
> > >
> >
> > Indeed, I found this issue too, when __NR_dup3 not defined, it returns
> > -ENOSYS, than, no size issue, otherwise, the compiler will not be able
> > to learn what the ret of sys_dup3() will be, so, it can not optimize the
> > second call to sys_dup2().
> >
> > So, the '#ifdef' logic must be used like we did in sys_* functions, but
> > it is really not that meaningful (no big gain as you mentioned above) if
> > we only move them from the sys_* functions to the library routines.
> >
> > At last, I found the ternary operation together with the initialization
> > of the not-defined __NR_* as NOLIBC__NR_NOSYS help this a lot, at last,
> > we get something like this:
> >
> > /* __systry2() is used to select one of two provided low level syscalls */
> > #define __systry2(a, sys_a, sys_b) \
> > ((NOLIBC__NR_##a != NOLIBC__NR_NOSYS) ? (sys_a) : (sys_b))
>
> But this supposes that all of them are manually defined as you did above.
> I'd rather implement an ugly is_numeric() macro based on argument
> resolution. I've done it once in another project, I don't remember
> precisely where it is but I vaguely remember that it used to check
> that the string resolution of the argument gave a letter (when it
> does not exist) or a digit (when it does). I can look into that later
> if needed. But please avoid extra macro definitions as much as possible,
> they're a real pain to handle in the code. There's no error when one is
> missing or has a typo, it's difficult to follow them and they don't
> appear in the debugger.
>

Yeah, your reply inspired me to look into the IS_ENABLED() from
../include/linux/kconfig.h macro again, there was a __is_defined() there, let's
throw away the ugly sysnr.h. I thought of IS_ENABLED() was only for y/n/m
before, but it does return 0 when the macro is not defined, it uses the same
trick in syscall() to calculate the number of arguments, if the macro is not
defined, then, 0 "argument".

> > It can eliminate all of the '#ifdef' stuffs, using the chmod example you
> > mentioned above, it becomes something like this:
> >
> > /*
> > * int chmod(const char *path, mode_t mode);
> > */
> >
> > static __attribute__((unused))
> > int chmod(const char *path, mode_t mode)
> > {
> > return __sysret(__systry2(chmod, sys_chmod(path, mode), sys_fchmodat(AT_FDCWD, path, mode, 0)));
> > }
> >
> > Purely clean and clear.
>
> That's a matter of taste and it may explain why we often disagree. For me
> it's horrid. If I'm the one implementing chmod for my platform and it does
> not work, what should I do when facing that, except want to cry ? Think
> that right now we have this:
>
> static __attribute__((unused))
> int sys_chmod(const char *path, mode_t mode)
> {
> #ifdef __NR_fchmodat
> return my_syscall4(__NR_fchmodat, AT_FDCWD, path, mode, 0);
> #elif defined(__NR_chmod)
> return my_syscall2(__NR_chmod, path, mode);
> #else
> return -ENOSYS;
> #endif
> }
>
> Sure it can be called not pretty, but I think it has the merit of being
> totally explicit, and whoever sees chmod() fail can quickly check based
> on the test in what situation they're supposed to be and what to check.
>
> One thing I'm worried about also regarding using my_syscall() without the
> number is that it's easy to miss an argument and have random values taken
> from registers (or the stack) passed as argument. For example above we can
> see that the flags part is 0 in fchmodat(). It's easy to miss themn and
> while the syscall4() will complain, syscall() will silently turn that
> into syscall3(). That's not necessarily a big deal, but we've seen during
> the development that it's easy to make mistakes and they're not trivial
> to spot. So again I'm really wondering about the benefits in such a case.
>
> This is well illustrated in your example below:
>
> > return __sysret(__systry2(newselect, sys_newselect(nfds, rfds, wfds, efds, timeout),
> > sys_pselect6(nfds, rfds, wfds, efds, timeout ? &t : NULL, NULL)));
>
> How many attempts to get it right ? Just skip one NULL and you don't
> see it.

Yeah, seems we have missed the last 0 in ppoll() before and the test may not
report about it either.

[...]
> > > Sure it's not pretty, and I'd rather just go back to SET_ERRNO() to be
> > > honest, because we're there just because of the temptation to remove
> > > lines that were not causing any difficulties :-/
> > >
> > > I think we can do something in-between and deal only with signed returns,
> > > and explicitly place the test for MAX_ERRNO on the two unsigned ones
> > > (brk and mmap). It should look approximately like this:
> > >
> > > #define __sysret(arg) \
> > > ({ \
> > > __typeof__(arg) __sysret_arg = (arg); \
> > > (__sysret_arg < 0) ? ({ /* error ? */ \
> > > SET_ERRNO(-__sysret_arg); /* yes: errno != -ret */ \
> > > ((__typeof__(arg)) -1); /* return -1 */ \
> > > }) : __sysret_arg; /* return original value */ \
> > > })
> > >
> >
> > I like this one very much, a simple test shows, it saves one more byte.
>
> I'm going to do that and revert the 3 affected syscalls.
>

Ok.

> > Only a quesiton, why 'errno != -ret' has a '!'? and we have post-tab in
> > above two lines of __sysret() too, I have changed them to whitespaces.
>
[...]
>
> > But let them align with the others may be better, so, most of the sys_*
> > macros can be simply mapped with a simple line (all of them are
> > generated automatically), without the care of the return types changing.
> >
> > So, Willy, as a summary:
> >
> > - one solution is your new __sysret() + restore the original SET_ERRNO
> > for mmap and brk [1].
> >
> > - another solution is your new __sysret() + my patch [2] to let mmap and brk
> > return 'long' as the other sys_* function does.
>
> No, because it will completely break them when they'll need to access the
> second half of the memory, as I already explained somewhere else in one
> of these numerous discussions.
>

Sorry, will recheck this part later, please ignore it.

[...]
>
> > I will resell my proposed patchset above, at
> > least a RFC patchset, please ignore this currently ;-)
>
> Please allow me to breathe a little bit. Really I mean, I'm already worn
> by having to constantly review breaking changes that either introduce
> bugs or break maintainability, and to have to justify myself for things
> that I thought would be obvious to anyone. Massive changes are extremely
> time consuming to review, and trying to figure subtle breakage in such
> low-level stuff is even harder. I simply can't assign more time to this,
> particularly for the expected gains which for me or often perceived as
> losses of maintainability instead :-/
>

Take a rest, I will delay the whole __sysdef() stuff and continue to work on
the last tinyconfig patchset after v6.5, it is the last one before our early
rv32 work ;-)

Thanks,
Zhangjin

> Thanks,
> Willy

Next message: Jan Kara: "Re: [PATCH] ext4: fix slab-use-after-free in ext4_es_insert_extent()"
Previous message: David Laight: "RE: [PATCH] printk: ringbuffer: Fix truncating buffer size min_t cast"
In reply to: Willy Tarreau: "Re: [PATCH v5] tools/nolibc: fix up size inflate regression"
Next in thread: David Laight: "RE: [PATCH v5] tools/nolibc: fix up size inflate regression"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]