Re: fork: out of memory

Stephen C. Tweedie (sct@dcs.ed.ac.uk)
Sun, 30 Nov 1997 23:42:32 GMT


Hi,

On Sun, 30 Nov 1997 15:54:17 -0500, "Theodore Y. Ts'o" <tytso@MIT.EDU>
said:

> Is that really the case that a page will always be mapped into the
> same virtual address? Consider a data file which is mmaped in by
> various different processes; there is no guarantee that they will
> all be mapped into the same place.

By a "data" page I mean a page allocated as a private data page owned by
a single process, as generated by brk() or through COW access to a
MAP_PRIVATE memory-mapped vma. If we allow such regions to be mmap()ed
from one vma to another (e.g. through mmap()ing of /proc/nn/mem), then
certainly they can be multiply mapped, but otherwise they will always
occur at the same address, since the only way they can be shared is
through fork() (which does not change the vm address of the page).
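
As a small user-space illustration of the fork() point (a sketch only,
not kernel code): the function below allocates private heap memory,
forks, and lets the child read and then write through the inherited
pointer. The function name is made up for this example, and it assumes
a POSIX system where malloc() hands back ordinary private memory.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the child could read the parent's value through the same
 * virtual address and the child's later write stayed private to the
 * child, 0 if either check failed, and -1 on an allocation/fork error.
 */
int private_page_survives_fork(void)
{
    int *data = malloc(sizeof *data);   /* private data owned by this process */
    int status;
    int ok;
    pid_t pid;

    if (data == NULL)
        return -1;
    *data = 1;

    pid = fork();
    if (pid < 0) {
        free(data);
        return -1;
    }
    if (pid == 0) {
        /* Child: the inherited pointer value is still valid here. */
        int seen = *data;
        *data = 2;                      /* goes to the child's own copy */
        _exit(seen == 1 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0) {
        free(data);
        return -1;
    }
    /* Parent: the child's write must not be visible at this address. */
    ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 && *data == 1;
    free(data);
    return ok;
}
```

Calling private_page_survives_fork() from main() and checking for a
return value of 1 exercises both halves: same address in both
processes, separate contents after the write.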

> For the case of text and data pages, though, it would be useful if we
> could add the virtual address location into the struct page for
> this common case. For the case of arbitrary mmap's, it would seem to me
> that we could make life much easier by having a circularly linked list
> of a structure which described all of the processes and virtual memory
> locations that a particular page was mapped in.

We've already got something close to that for mmap()s --- the struct
page points to the inode it maps, and each inode maintains a circular
list of all mmap()s. Extending that to finer granularity by keeping
separate vma lists for each page would increase the memory requirements
enormously in many cases, so it's not necessarily going to be a clear
win.
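
A rough sketch of that lookup, using invented sk_* types rather than
the real kernel structures: the page records its inode and file offset,
the inode holds a circular ring of mappings, and each mapping that
covers the offset yields one virtual address. All field and function
names here are assumptions made for illustration.

```c
#include <stddef.h>

/* One mapping of a file; hypothetical stand-in for a vma. */
struct sk_vma {
    unsigned long start, end;    /* virtual range [start, end) */
    unsigned long file_offset;   /* file offset mapped at 'start' */
    struct sk_vma *next_share;   /* circular ring of mappings of one inode */
};

struct sk_inode {
    struct sk_vma *mmap_ring;    /* any element of the ring, NULL if unmapped */
};

struct sk_page {
    struct sk_inode *inode;      /* file this page caches */
    unsigned long offset;        /* byte offset of the page within the file */
};

/*
 * Walk the inode's ring once and report every virtual address at which
 * 'page' could be mapped.  Stores at most 'max' addresses in 'out' and
 * returns the total number of covering mappings found.
 */
size_t sk_find_mappers(const struct sk_page *page,
                       unsigned long *out, size_t max)
{
    const struct sk_vma *first, *vma;
    size_t found = 0;

    if (page->inode == NULL || page->inode->mmap_ring == NULL)
        return 0;

    first = vma = page->inode->mmap_ring;
    do {
        unsigned long len = vma->end - vma->start;

        if (page->offset >= vma->file_offset &&
            page->offset - vma->file_offset < len) {
            if (found < max)
                out[found] = vma->start + (page->offset - vma->file_offset);
            found++;
        }
        vma = vma->next_share;
    } while (vma != first);

    return found;
}
```

The cost of the walk grows with the number of mappings of the file, not
of the page, which is the trade-off against per-page lists: the ring is
one list per inode, whereas per-page lists would need an entry for
every (page, mapping) pair.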

What we don't have is anything approaching this for private pages. One
cheap way to help here might be to keep a record of the calling pid
whenever a private page is created, and for every process, maintain a
list of all children which have not yet execve()d (and which therefore
may still share data pages with the parent). It would probably not be
too hard to maintain these records of "related" processes, and if we
also have the va in the struct page, then it becomes much cheaper to
determine exactly who is mapping any private page (especially for the
common case where there is only one mapper). Actually, thinking about
this, we'd want to record related mm contexts, not related processes, so
that we would only have one context to deal with in the case of threads.
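
One way the bookkeeping might look, as a toy user-space model with
invented sk_* names (nothing here is an existing kernel interface):
each private page remembers the mm context that created it and its
virtual address, each context lists the children that have not yet
exec'd, and the search only needs to test that one address in the
creator and its descendants. The fixed-size pte array stands in for
real page tables, and removal of a context on execve() or exit is
left out.

```c
#include <stddef.h>

#define SK_MAX_PTES 8

struct sk_priv_page;

/* Toy page-table entry: which page sits at which virtual address. */
struct sk_pte {
    unsigned long va;
    struct sk_priv_page *page;
};

/* Hypothetical mm context with a list of not-yet-exec'd children. */
struct sk_mm {
    struct sk_mm *first_child;
    struct sk_mm *next_sibling;
    struct sk_pte pte[SK_MAX_PTES];
    size_t nr_ptes;
};

/* Private page: records its creator and the address it was created at. */
struct sk_priv_page {
    struct sk_mm *creator;
    unsigned long va;
};

/* fork(): the child inherits every mapping at the same address. */
void sk_fork(struct sk_mm *parent, struct sk_mm *child)
{
    size_t i;

    for (i = 0; i < parent->nr_ptes; i++)
        child->pte[i] = parent->pte[i];
    child->nr_ptes = parent->nr_ptes;
    child->first_child = NULL;
    child->next_sibling = parent->first_child;
    parent->first_child = child;
}

/* Does this context still map 'page' at the recorded address? */
static int sk_maps(const struct sk_mm *mm, const struct sk_priv_page *page)
{
    size_t i;

    for (i = 0; i < mm->nr_ptes; i++)
        if (mm->pte[i].va == page->va && mm->pte[i].page == page)
            return 1;
    return 0;
}

/* Count mappers of 'page' among 'mm' and its related descendants. */
size_t sk_count_mappers(const struct sk_mm *mm,
                        const struct sk_priv_page *page)
{
    const struct sk_mm *child;
    size_t n = (size_t)sk_maps(mm, page);

    for (child = mm->first_child; child != NULL; child = child->next_sibling)
        n += sk_count_mappers(child, page);
    return n;
}
```

Starting the search with sk_count_mappers(page->creator, page) touches
only related contexts; in the common single-mapper case the creator has
no listed children and the answer comes from one lookup.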

Cheers,
Stephen.