Re: [PATCH v2 1/4] tools/nolibc: sys.h: add __syscall() and __sysret() helpers

From: Zhangjin Wu
Date: Fri Jun 09 2023 - 00:42:53 EST


Hi, Thomas, David, Willy

> Hi David,
>
> On 2023-06-08 14:35:49+0000, David Laight wrote:
> > From: Zhangjin Wu
> > > Sent: 06 June 2023 09:10
> > >
> > > most of the library routines share the same code model, let's add two
> > > helpers to simplify the coding and shrink the code lines too.
> > >
> > ...
> > > +/* Syscall return helper, set errno as -ret when ret < 0 */
> > > +static inline __attribute__((always_inline)) long __sysret(long ret)
> > > +{
> > > + if (ret < 0) {
> > > + SET_ERRNO(-ret);
> > > + ret = -1;
> > > + }
> > > + return ret;
> > > +}
> >
> > If that right?
> > I thought that that only the first few (1024?) negative values
> > got used as errno values.
> >

Thanks David, this question prompted me to look at the syscalls that
return pointers, which we have not touched yet:

	static __attribute__((unused))
	void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset)
	{
		void *ret = sys_mmap(addr, length, prot, flags, fd, offset);

		if ((unsigned long)ret >= -4095UL) {
			SET_ERRNO(-(long)ret);
			ret = MAP_FAILED;
		}
		return ret;
	}

If we convert the return value to 'unsigned long', as is done for the
pointers, this compare may also be compatible with the old 'ret < 0'
compare on 'long':

	/* Syscall return helper, set errno as -ret when ret is in [-4095, -1] */
	static __inline__ __attribute__((unused, always_inline))
	long __sysret(unsigned long ret)
	{
		if (ret >= -4095UL) {
			SET_ERRNO(-(long)ret);
			ret = -1;
		}
		return ret;
	}

Or something like musl does:

	/* Syscall return helper, set errno as -ret when ret is in [-4095, -1] */
	static __inline__ __attribute__((unused, always_inline))
	long __sysret(unsigned long ret)
	{
		if (ret > -4096UL) {
			SET_ERRNO(-ret);
			return -1;
		}
		return ret;
	}

So, it reserves 4095 error values (I'm not sure where this is
documented; perhaps we need a standard description in the coming commit
message), and the other values can be used as pointers or other data.
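
To make the reserved range concrete, here is a minimal userspace sketch of
the same check. It is not the nolibc code: the sketch_* names and
SKETCH_MAX_ERRNO are hypothetical, and the libc errno stands in for
SET_ERRNO(). It only shows that values in [-4095, -1] are treated as
errors while anything else, including a value with the top bit set, is
passed through.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical name: the largest errno value the return range reserves. */
#define SKETCH_MAX_ERRNO 4095UL

/* 1 when the raw return value lies in [-4095, -1] as a signed long. */
static int sketch_is_error(unsigned long ret)
{
	return ret >= -SKETCH_MAX_ERRNO;
}

/* Same shape as the __sysret() variants above, using libc errno. */
static long sketch_sysret(unsigned long ret)
{
	if (sketch_is_error(ret)) {
		errno = (int)-(long)ret;
		return -1;
	}
	return (long)ret;
}

/* Small self-check: -ENOENT is an error, -4096 is plain data. */
static void sketch_selfcheck(void)
{
	errno = 0;
	assert(sketch_sysret((unsigned long)-ENOENT) == -1);
	assert(errno == ENOENT);

	errno = 0;
	assert(sketch_sysret((unsigned long)-4096L) == -4096L);
	assert(errno == 0);
}
```

With the old 'ret < 0' compare, the second case would have been reported
as an error, which is the situation the pointer-returning syscalls can
run into.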

If this is ok for you, we may need to either respin the v3 series [1] or
add this as an additional patchset (which may make it easier to see why
we do this) to support the syscalls that return pointers. I prepared
such a series yesterday; more discussion is welcome.

[1]: https://lore.kernel.org/linux-riscv/cover.1686135913.git.falcon@xxxxxxxxxxx/

> > Do all Linux architectures even use negatives for error?
> > I thought at least some used the carry flag.
> > (It is the historic method of indicating a system call failure.)
>
> I guess you are thinking about the architectures native systemcall ABI.
>
> In nolibc these are abstracted away in the architecture-specific
> assembly wrappers: my_syscall0 to my_syscall6.
> (A good example would be arch-mips.h)

Yes, thanks. mips may be the only arch currently supported by nolibc
that returns ret and errno separately.

The manpage of syscall lists more: alpha, ia64, sparc/32, sparc/64, tile.

https://man7.org/linux/man-pages/man2/syscall.2.html

>
> These normalize the architecture systemcall ABI to negative errornumbers
> which then are returned from the sys_* wrapper functions.
>

For mips, it is:

	#define my_syscall0(num) \
	({ \
		register long _num __asm__ ("v0") = (num); \
		register long _arg4 __asm__ ("a3"); \
		\
		__asm__ volatile ( \
			"addiu $sp, $sp, -32\n" \
			"syscall\n" \
			"addiu $sp, $sp, 32\n" \
			: "=r"(_num), "=r"(_arg4) \
			: "r"(_num) \
			: "memory", "cc", "at", "v1", "hi", "lo", \
			  "t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9" \
		); \
		_arg4 ? -_num : _num; \
	})

I noticed a difference in musl, which does it as follows:

	static inline long __syscall0(long n)
	{
		register long r7 __asm__("$7");
		register long r2 __asm__("$2");
		__asm__ __volatile__ (
			"addu $2,$0,%2 ; syscall"
			: "=&r"(r2), "=r"(r7)
			: "ir"(n), "0"(r2)
			: SYSCALL_CLOBBERLIST, "$8", "$9", "$10");
		return r7 && r2>0 ? -r2 : r2;
	}

It checks "r2>0" so that 'r2' is only negated when it is a positive
number. I wonder whether this check is about big pointers, whose highest
bit is 1; if this guess is right, negating them may be an issue, and
perhaps we should update this together with the revision of __sysret().
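
To compare the two final expressions without any mips hardware, here is a
small portable sketch. The sketch_* function names are hypothetical; 'v0'
and 'a3' are ordinary parameters standing in for the result and error-flag
registers. It only shows where the two expressions differ and says nothing
about whether the kernel ever produces such a combination.

```c
#include <assert.h>

/* nolibc-style: negate the value whenever the error flag is set. */
static long sketch_normalize_plain(long v0, long a3)
{
	return a3 ? -v0 : v0;
}

/* musl-style: negate only when the flag is set and the value is positive. */
static long sketch_normalize_guarded(long v0, long a3)
{
	return (a3 && v0 > 0) ? -v0 : v0;
}

/* The two agree unless the flag is set and the value is not positive. */
static void sketch_normalize_selfcheck(void)
{
	/* Flag set, positive errno: both yield the negative error number. */
	assert(sketch_normalize_plain(22, 1) == -22);
	assert(sketch_normalize_guarded(22, 1) == -22);

	/* Flag set, value already negative: only the plain form flips it. */
	assert(sketch_normalize_plain(-22, 1) == 22);
	assert(sketch_normalize_guarded(-22, 1) == -22);

	/* Flag clear: the value is passed through unchanged by both. */
	assert(sketch_normalize_plain(7, 0) == 7);
	assert(sketch_normalize_guarded(7, 0) == 7);
}
```

So the only difference is the case where the flag is set and the value is
zero or negative, where the plain form would turn a negative value into a
positive one.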

Thanks very much.

Best regards,
Zhangjin

> The sys_* wrapper functions in turn are used by the libc function which
> translate the negative error number to the libc-style
> "return -1 and set errno" mechanism.
> At this point the new __sysret function is used.
>
> Returning negative error numbers in between has the advantage that it
> can be used without having to set up a global/threadlocal errno
> variable.
>
> In hope this helped,
> Thomas