Re: Overcommitable memory??

From: Richard B. Johnson (
Date: Thu Mar 16 2000 - 18:21:37 EST

On Thu, 16 Mar 2000, James Sutherland wrote:

> On Wed, 15 Mar 2000, Paul Jakma wrote:
> > On Mon, 13 Mar 2000, Michael Bacarella wrote:
> >
> >
> > > The way I see it, apps that have successfully allocated memory in the past
> > > wouldn't start dying since there's no malloc() to fail, wheras new apps
> > > that want to bring down the system will start getting failed
> > > malloc()/mmap()'s
> >
> > no. Because a good app may have malloc()'ed memory an hour ago, and only
> > now try to write to it. Now the kernel had overcomitted on that
> > malloc(), and an hour ago things looked ok. But now when the kernel
> > tries to fulfill it's promise it finds it has no memory.
> That doesn't happen. malloc() ALLOCATES the memory to the process. It is
> *NOT* overcommitted. It may be backed by swapspace rather than physical
> memory, but that block of memory *IS* available to the process.

Not at all! It just sets an index pointing into its presumed heap-space
that might not even exist yet because nobody has tried to access it.
When it runs out of heap-space, it makes a system call to set a new break

The break-address is the virtual address above which the kernel will
seg-fault your process if an attempt is made to access it. Any access
below the break address, will force a page-fault if the page has not
actually been added to the process pages.

Basically you can do { for(;;) malloc(MAX_WHATEVER); } and, as long
as you don't touch those pages, you are home free.

The "normal" way for a user-program to allocate memory is through
malloc() and friends calloc(), etc. This does not actually allocate
anything! It just gets "permission to access" some memory in your
virtual address space. If you attempted to access the same pointer
value without getting the kernel's permission, the kernel will send
your process a fatal signal.

Once you have gotten the kernel's permission, obtained by adding a
virtual page, marked "not present" to your tasks page-tables, control
will return to your process without even finding some physical memory
for that page. This saves a lot of time and a lot of memory.

If your task were to attempt to access that memory, on a page-by-page
basis, the kernel will fault in new free pages so you end up with
"real RAM pages" at virtual addresses. When RAM gets scarce, the kernel
will try to find free pages by writing some pages of sleeping tasks
to the swap-file or device. The kernel keeps track of where it has
put the contents of the page it just freed. It also marks the PTE for
that task to "page-not-present" so if the sleeping task wakes up
and needs that page, its page-fault will result in the kernel stealing
a new page from somebody else, writing the previous contents to it from
swap, then attaching the new page to the processes PTE before giving
control back to that process.

This is all Virtual Memory 101. It's kind of how it really has to be
done. Different systems have different strokes, but the result is all
the same.

The problem with many Unix's, including Linux, is that the page-stealing
is done by stealing pages from sleeping tasks. The idea is that many are
daemons that may never wake up. Sooner or later, somebody will implement
a page-stealing algorithm that steals pages from the task that keeps
demanding more and more virtual memory. This will allow a memory hog
to continue without disrupting the entire system.

The memory-hog will get slower and slower because its pages keep getting
faulted in and out of disk storage. This will allow well designed programs
to run well and poorly designed programs to run poorly.

Dick Johnson

Penguin : Linux version 2.3.41 on an i686 machine (800.63 BogoMips).

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
Please read the FAQ at

This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:19 EST