Re: [patch 7/8] fdmap v2 - implement sys_socket2

From: Davide Libenzi
Date: Thu Jun 07 2007 - 11:51:24 EST


On Thu, 7 Jun 2007, Eric Dumazet wrote:

> On Thu, 7 Jun 2007 07:59:47 -0400
> Kyle Moffett <mrmacman_g4@xxxxxxx> wrote:
>
>
> > Likewise there are a massive group of other libraries (especially
> > user-interface and server related ones) that would really like to
> > have support for creating file-descriptors without the top-level
> > application closing them randomly (like several shells seem to).
> >
>
> True, shells are sometimes quite strange.
>
> For example, bash uses file descriptor 255 (FD_CLOEXEC)
>
> When it forks a new process, child gets a file table with 256 slots.
>
> At exec() time, 255 is closed but file table doesnt shrink.
> (shrinking is done at fork() time only)
>
> With fdmap, that means each process started by bash uses at least
> 256 * sizeof(list_head) bytes, ie 4096 bytes on x86_64, even if only three
> file-descriptors are opened (0,1,2)
>
> FD_CLOFORK should help here (BTW : current patch from Davide doesnt take this
> into account and might need a change in fdmap_top_open_fd())

Yes, the CLOFORK flag is there, but it needs to be taken in account in
fdmap_top_open_fd().


> Davide, are you sure we want FIFO for non sequential allocations ?
>
> This tends to use all the fmap slots, and not very cache friendly
> if an app does a lot of [open(),...,close()] things. We already got a
> perf drop because of RCUification of file freeing (FIFO mode instead
> of LIFO given by kmalloc()/kfree())
>
> If the idea behind this FIFO was security (ie not easy for an app to predict
> next glibc file handle), we/glibc might use yet another FD_SECUREMODE flag,
> wich ORed with O_NONSEQFD would ask to fdmap_newfd() to take the tail of
> fmap->slist, not head.

That was the reason, yes. If we agree that the base randomization is
enough, we can use a LIFO.


- Davide


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/