Re: [RFC] execve.2: SYNOPSIS: Document both glibc wrapper and kernel syscalls

From: Alejandro Colomar (man-pages)
Date: Thu Feb 18 2021 - 11:49:44 EST


Hi Michael,

On 2/18/21 1:27 PM, Michael Kerrisk (man-pages) wrote:
Hi Alex,

On 2/14/21 2:39 PM, Alejandro Colomar wrote:
Until now, the manual pages have (usually) documented only either
the glibc (or another library) wrapper for a syscall, or the raw
syscall (this only when there's not a wrapper).

Let's document both prototypes, which many times are slightly
different. This will solve a problem where documenting glibc
wrappers implied shadowing the documentation for the raw syscall.

It will also be much clearer for the reader where the syscall
comes from (kernel? glibc? other?), by adding an explicit comment
at the beginning of the prototypes. This removes the need of
scrolling down to NOTES to see that info.

Signed-off-by: Alejandro Colomar <alx.manpages@xxxxxxxxx>
---

Hi all,

This is a prototype for doing some important changes to the SYNOPSIS
of the man-pages.

The commit message above explains the idea quite well. A few details
that couldn't be shown on this commit are:

For cases where the wrapper is provided by a library other than glibc,
I'd simply change the comment. For example, for move_pages(2),
it would say /* libnuma wrapper function: */.

I think this would make the small notes warning that there's no glibc
wrapper function deprecated (but we could keep them for some time and
decide that later).

While changing this, I'd also make sure that the headers are correct,
and clearly differentiate which headers are needed for the raw syscall
and for the wrapper function.

This change will probably take more than one release of the man-pages
to complete.

Any thoughts?

My first impression is that I'm not keen on this. We'll add extra
text to all Section 2 pages, and in many (most?) cases the info
will be redundant (i.e., the wrapper and the syscall() notation
will express the same info). In other cases, I suspect the info
will be largely irrelevant to the user. To take an example: to
whom will the difference that you document below for execve()
matter, how will it matter, and does it matter enough that we
headline the info in the pages? I'd want cogent answers to
those questions before considering a wide-ranging change.

It will matter to:

1) Users of old systems where the glibc wrapper is not yet present.

2) Users of some unicorn Linux distributions that use a C library other than glibc and may not have wrappers for some syscalls that glibc provides.

3) Library (libc) developers.

Those won't have the glibc wrapper available to them, and will have to use syscall(2). The kernel syscall info would be highly valuable for them. However, all together they are probably not a large number of people.



There are indeed cases where the wrapper API differs in
significant ways from the syscall API (and these differences
are usually captured in the "C library/kernel differences"
subsections, such as for pselect()/pselect6() in select(2)).
But I imagine that that is the case in only a smallish
minority of the pages.

And indeed there are a very few syscalls that have wrappers
provided in another library. But it's a very small percentage
I think, and best documented case by case in specific pages.
The default presumption is that the wrapper is in the C library.

Agree.


There are other cases where I think it may be worthwhile
considering the syscall() notation:

1. Where the system call has no wrapper. In that case, we might
use the syscall() notation in the SYNOPSIS as both
(a) a clear indication that there is no wrapper and
(b) instructions to the reader about how to call the
system call using syscall().

Yes.


2. In cases where there is a "significant" difference between
the wrapper and the system call. In this case, we might
also place the syscall() notation in the SYNOPSIS, or
(perhaps more likely) in the NOTES.

Yes.

I think it would be equally good to have the kernel syscall prototype in "C library/kernel ABI differences" in those cases where there is a glibc wrapper (even if it's quite different). It would be even better, as it would clearly mark the syscall(2) method as a second-class method that should be avoided if possible, and it wouldn't add lines to the SYNOPSIS.

However, we should probably have that subsection for all syscalls, including those where the prototype is very similar to the glibc one, to support those who need to use the kernel syscall, and provide them with the exact types that the kernel expects (except for those unsupported by libraries, of course, which would have the syscall(SYS_xxx) prototype in the SYNOPSIS).

I'll prepare a new RFC with this, with 2 pages: one with wrapper and one without wrapper.

Thanks,

Alex


See also:
<https://lwn.net/Articles/534682/>
<https://www.kernel.org/doc/man-pages/todo.html#migrate_to_kernel_source>



Thanks,

Michael


Thanks,

Alex

---
man2/execve.2 | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/man2/execve.2 b/man2/execve.2
index 639e3b4b9..87ff022ce 100644
--- a/man2/execve.2
+++ b/man2/execve.2
@@ -39,10 +39,18 @@
execve \- execute program
.SH SYNOPSIS
.nf
+/* Glibc wrapper function: */
.B #include <unistd.h>
.PP
-.BI "int execve(const char *" pathname ", char *const " argv [],
-.BI " char *const " envp []);
+.BI "int execve(const char *" pathname ",
+.BI " char *const " argv "[], char *const " envp []);
+.PP
+/* Raw system call: */
+.B #include <sys/syscall.h>
+.B #include <unistd.h>
+.PP
+.BI "int syscall(SYS_execve, const char *" pathname ,
+.BI " const char *const " argv "[], const char *const " envp []);
.fi
.SH DESCRIPTION
.BR execve ()




--
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/