Re: [PATCH 1/2 v2] fdmap(2)

From: Alexey Dobriyan
Date: Thu Sep 28 2017 - 06:10:37 EST


On 9/27/17, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Tue, Sep 26, 2017 at 12:00 PM, Alexey Dobriyan <adobriyan@xxxxxxxxx> wrote:
>> On Mon, Sep 25, 2017 at 09:42:58AM +0200, Michael Kerrisk (man-pages) wrote:
>>> [Not sure why original author is not in CC; added]
>>>
>>> Hello Alexey,
>>>
>>> On 09/24/2017 10:06 PM, Alexey Dobriyan wrote:
>>> > From: Aliaksandr Patseyenak <Aliaksandr_Patseyenak1@xxxxxxxx>
>>> >
>>> > Implement system call for bulk retrieving of opened descriptors
>>> > in binary form.
>>> >
>>> > Some daemons could use it to reliably close file descriptors
>>> > before starting. Currently they close everything up to some number
>>> > which formally is not reliable. Other natural users are lsof(1) and
>>> > CRIU (although lsof does so much in /proc that the effect is
>>> > thoroughly buried).
>>> >
>>> > /proc, the only way to learn anything about file descriptors, may not
>>> > be available. There is unavoidable overhead associated with
>>> > instantiating 3 dentries and 3 inodes and converting integers to
>>> > strings and back.
>>> >
>>> > Benchmark:
>>> >
>>> > N=1<<22 times
>>> > 4 opened descriptors (0, 1, 2, 3)
>>> > opendir+readdir+closedir /proc/self/fd vs fdmap
>>> >
>>> > /proc 8.31 ± 0.37%
>>> > fdmap 0.32 ± 0.72%
>>>
>>> From the text above, I'm still trying to understand: whose problem
>>> does this solve? I mean, we've lived with the daemon-close-all-files
>>> technique forever (and I'm not sure that performance is really an
>>> important issue for the daemon case).
>>
>>> And you say that the effect for lsof(1) will be buried.
>>
>> If only fdmap(2) is added, then the effect will be negligible for lsof
>> because it has to go through /proc anyway.
>>
>> The idea is to start the process. In an ideal world, only binary system
>> calls would exist and shells could emulate /proc/* the same way bash
>> implements /dev/tcp.
>
> Then start the process by doing it for real and making it obviously
> useful. We should not add a pair of vaguely useful, rather weak
> syscalls just to start a process of modernizing /proc.
>
>>
>>> So, who does this new system call
>>> really help? (Note: I'm not saying don't add the syscall, but from
>>> explanation given here, it's not clear why we should.)
>>
>> For fdmap(2), the natural users are lsof(1) and CRIU.
>
> lsof does:
>
> int
> main(argc, argv)
>     int argc;
>     char *argv[];
> {
> ...
>     if ((MaxFd = (int) GET_MAX_FD()) < 53)
>         MaxFd = 53;
>     for (i = 3; i < MaxFd; i++)
>         (void) close(i);
>
> The solution isn't to wrangle fdmap(2) into this code. The solution
> is to remove the code entirely.

What do you think about this code from OpenSSH?

/*
 * Discard other fds that are hanging around. These can cause problem
 * with backgrounded ssh processes started by ControlPersist.
 */
closefrom(STDERR_FILENO + 1);
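
For context, here is a minimal sketch of what a closefrom()-style helper
ends up doing in userspace when there is no native closefrom(). It is
not the OpenSSH code. The name close_from_sketch and the fallback limit
of 256 are assumptions made for illustration. The first path assumes
Linux with /proc mounted; the second is the close-everything-up-to-a-
limit loop that the patch description calls formally unreliable.

```c
#include <dirent.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): close every descriptor
 * numbered lowfd or higher.
 */
static void close_from_sketch(int lowfd)
{
    DIR *dir = opendir("/proc/self/fd");

    if (dir != NULL) {
        /* The directory stream itself holds a descriptor; skip it. */
        int self = dirfd(dir);
        struct dirent *de;

        while ((de = readdir(dir)) != NULL) {
            char *end;
            long fd;

            /* Entry names are decimal strings; convert back to int. */
            errno = 0;
            fd = strtol(de->d_name, &end, 10);
            if (errno != 0 || end == de->d_name || *end != '\0')
                continue;       /* skips "." and ".." */
            if (fd >= lowfd && fd != self)
                close((int)fd);
        }
        closedir(dir);
        return;
    }

    /*
     * /proc is unavailable: fall back to closing up to a limit.
     * Descriptors above the limit, if any, are silently missed.
     */
    long maxfd = sysconf(_SC_OPEN_MAX);
    if (maxfd < 0)
        maxfd = 256;            /* arbitrary guess, an assumption */
    for (long fd = lowfd; fd < maxfd; fd++)
        close((int)fd);
}
```

A caller would use it the same way as above, e.g.
close_from_sketch(STDERR_FILENO + 1). Both branches show the costs being
discussed: the first depends on /proc and string conversion, the second
depends on guessing an upper bound.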