Re: Overcommitable memory??

From: Marco Colombo (marco@esi.it)
Date: Wed Mar 22 2000 - 05:26:10 EST


On Tue, 21 Mar 2000, James Sutherland wrote:

> >We are not short of memory. We're short of swap. paging-out is the problem,
> >not paging-in.
>
> Where did you get that idea? This situation is "out of RAM+swap". No
> RAM free, no swap free, nothing can be allocated. Nothing to do with
> paging, either - you can shuffle the memory around fine, the problem
> is just that there isn't enough to go round.

We're saying the same thing: being out of swap implies being out of RAM.
If you're not out of RAM, the swap area is *empty*. On a Linux system
the swap space is almost never empty (unless it's 0-sized, of course)
because the kerner thinks it's a better idea to page-out some unused pages
to make room for more buffers/cache.

But paging is not "shuffle the memory around": you go either RAM -> disk
or disk -> RAM. In OOS situation, if a free page-frame is needed (because
of a page fault wants to do a "disk -> RAM" operation), AND none is
available (so it's OOM, after all), you first need to do a "RAM -> disk",
which is impossible because swap is full. The problem occurs during
page-out.

> >> IMO, it's much better to get a signal which means "we're getting short
> >> of memory, folks", which can be handled in ONE place, rather than
> >> returning 0 as a pointer - which many apps then try to dereference,
> >> ending up segfaulting themselves anyway.
> >
> >"many apps" do not check malloc() return values? I don't call them apps,
> >I call them first time students exercises. B-)
> >They're are not able to check malloc() return value but they can setup a
> >signal handler that triggers runtime a GC() clever enough not to page
> >fault itself? I don't follow you.
>
> If they don't bother handling errors of either sort, they just get
> terminated with a signal anyway. If they are careful enough to check
> malloc() every time, they would handle SIGBUS as well.

So it doesn't make real difference. Everybody is used to write it's own
safe malloc() wrapper. The problem is: you accessed some page of yours,
and faulted. The kernel needs to page-in your data. To do that it needs
a free page-frame (RAM). There's none. So it looks for an used page-frame
that can be paged-out. It selects one. When it tries to page it out,
it fails, 'cause swap is full. It kills a signal to you process.
How can you be sure that your signal handler it there, *in RAM*, and
won't page fault itself? Expecially if it runs some kind of garbage
collector (which is likely to touch most of your data pages)?
The whole problem arose not at malloc() time, but later when you're
accessing your address space. A page that has been previously allocated
to you (in RAM) when you wrote it, and later (you not being notified)
paged-out to swap.

>
>
> James.
>

.TM.

-- 
      ____/  ____/   /
     /      /       /			Marco Colombo
    ___/  ___  /   /		      Technical Manager
   /          /   /			 ESI s.r.l.
 _____/ _____/  _/		       Colombo@ESI.it

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:35 EST