Re: fork: out of memory

Andi Kleen (ak@muc.de)
25 Nov 1997 14:41:46 +0100


Zlatko Calusic <Zlatko.Calusic@CARNet.hr> writes:

> alan@lxorguk.ukuu.org.uk (Alan Cox) writes:
>
> > > Maybe it would be a wise idea to make a few pointers instead of
> > > fd[NR_OPEN]. Every pointer would point to a smaller table of, let's say,
> > > 64 file descriptors and would be allocated as needed. The first such
> > > table would be in files_struct itself.
> >
> > Its very important to be able to do the files check fast. What seems
> > more sane to me is
> >
> > struct files_struct
> > {
> >     int count;
> >     int limit;
> >     fd_set close_on_exec;
> >     fd_set open_fds;
> >     struct file *fd[0];
> > };
> >
> > and to allocate initially on a 64 fd break point. So you malloc
> > one files_struct + 64 * (struct file *). That does however require
> > you write the code atomically and safely handle growing the file table
> > - which is actually quite hard if you want speed.
> >
>
> Well, I'm currently searching throughout the kernel, finding the
> mysterious places where files_struct is directly or indirectly used.
> Definitely lots of places. :)
>
> My idea is to put:
>
> int curr_max_fd; and
> struct file **fds; (instead of *fd[NR_OPEN])
>
> and then allocate the first set of 64 struct file * (as you suggested,
> too).
>
> Later, when I run out of fd's, I would allocate the next (256 - 64)
> pointers. And if even that isn't enough, another (1024 - 256).
>
> These extra allocations would go into get_unused_fd() in fs/open.c.
> I didn't find any other place (but I just started working on it, it will
> take some time).
>
> Some preliminary searching showed me that I will have to modify at
> least 50 files!!! No big things, but absolutely necessary if you ask
> me (unless I have problems in logic :)).

Please convert them all to use file_from_fd() (from linux/sched.h). I used
to have patches for this, but they're completely out of date now.
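
As a rough illustration, what I mean is an inline accessor along these
lines (just a sketch -- the real helper and its arguments may look
different, and curr_max_fd is the field from your proposal, not something
that exists today):

/* Sketch only: hide the layout of the fd array behind one accessor,
 * so the array can later be grown/reallocated in a single place.
 */
static inline struct file *file_from_fd(struct files_struct *files,
                                        unsigned int fd)
{
    if (fd >= files->curr_max_fd)   /* today the bound is NR_OPEN */
        return NULL;
    return files->fd[fd];
}

Callers would then use file_from_fd(current->files, fd) instead of
poking current->files->fd[fd] directly.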

While you do this, please fix fs/select.c too. In 2.1 it allocates one
page per wait. This means if you have 30 processes waiting in select()
you waste 30*4K=120K (or 30*8K on alpha and sparc64) of unswappable
memory that is mostly not used. 2.0 didn't allocate any memory for this,
because it put the temporary fd_sets on the kernel stack. I used to
have a patch for 2.1.16 that fixed this by using alloca() when you
pass less than 256 fds to select, but Linus didn't like it because of
the small overhead it added. Perhaps it should be revisited.
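
The idea was roughly the following (a hypothetical sketch written as
plain user-space C for illustration; the real fs/select.c juggles six
fd_sets -- in/out/ex, request and result -- and would use the kernel
stack and a kernel page instead of alloca()/malloc()):

/* Hypothetical sketch of the alloca() idea -- not the real fs/select.c. */
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

#define SELECT_STACK_FDS 256    /* illustrative threshold */
#define BITS_PER_LONG    (8 * sizeof(long))
#define FDS_LONGS(n)     (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

static int do_select_sketch(int nfds)
{
    size_t words = 6 * FDS_LONGS(nfds);   /* six temporary bitmaps */
    unsigned long *bits;
    int on_heap = nfds > SELECT_STACK_FDS;

    if (on_heap) {
        bits = malloc(words * sizeof(long));   /* a whole page in 2.1 */
        if (!bits)
            return -1;                         /* -ENOMEM in the kernel */
    } else {
        bits = alloca(words * sizeof(long));   /* lives on the stack */
    }
    memset(bits, 0, words * sizeof(long));

    /* ... the real poll loop fills and scans the bitmaps here ... */

    if (on_heap)
        free(bits);
    return 0;
}

This way the common case (small nfds) never touches the allocator at
all, and only a process that really passes a huge fd count pays for the
extra memory.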

Another point:

2.1 include/linux/fs.h defines both NR_OPEN and NR_FILE as 1024. The rlimits
default to 1024 too. open() reserves a few fds for root, but
main(){for(;;)open("/dev/null",O_RDONLY);} still works very well as a
denial-of-service attack. This should really be fixed before 2.2, either by
lowering the default ulimits or in some other clever way.
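
One user-space way to get the effect of a lower default is for login or
the shell startup to shrink the limit before handing control to the user;
purely illustrative (the value 256 is arbitrary, anything below NR_FILE
helps):

/* Illustrative only: lower the per-process fd limit below NR_FILE. */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl))
        return 1;
    if (rl.rlim_max > 256)
        rl.rlim_max = 256;          /* example value, below NR_FILE */
    if (rl.rlim_cur > rl.rlim_max)
        rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rl))
        return 1;
    printf("fd limit is now %lu\n", (unsigned long)rl.rlim_cur);
    return 0;
}

With that, a single runaway process can no longer eat the whole global
file table by itself, although several cooperating processes still can.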

Regarding your other issue: if you look at earlier versions of Mark Hemment's
slab patch, he added a files_struct that is allocated in 64-fd pieces. I don't
know why his code never appeared in the mainstream kernel, but perhaps
some of his work can be reused here. All of this becomes very tricky
when you aim at multithreaded SMP code with kernel threads sharing the
file table.
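
For illustration, growing the array in 64-fd pieces boils down to
something like this (names such as expand_fd_array() and max_fds are
made up, and a real version needs locking around the switch from the
old array to the new one whenever the files_struct is shared):

/* Purely illustrative sketch -- invented names, no locking shown. */
#include <linux/fs.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/errno.h>

#define FDS_STEP 64

struct files_struct_sketch {
    int count;
    int max_fds;            /* slots currently allocated */
    struct file **fd;       /* was struct file *fd[NR_OPEN] */
};

static int expand_fd_array(struct files_struct_sketch *files, int nr)
{
    int new_max = files->max_fds;
    struct file **new_fd;

    while (new_max <= nr)
        new_max += FDS_STEP;        /* 64, 128, 192, ... */

    new_fd = kmalloc(new_max * sizeof(struct file *), GFP_KERNEL);
    if (!new_fd)
        return -ENOMEM;

    memcpy(new_fd, files->fd, files->max_fds * sizeof(struct file *));
    memset(new_fd + files->max_fds, 0,
           (new_max - files->max_fds) * sizeof(struct file *));

    /* the real code must fence off other users of files->fd here */
    kfree(files->fd);
    files->fd = new_fd;
    files->max_fds = new_max;
    return 0;
}

get_unused_fd() would then call this whenever the descriptor it wants to
hand out is >= files->max_fds.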

-Andi