Re: Avoiding OOM on overcommit...?

From: Marco Colombo (marco@esi.it)
Date: Mon Mar 27 2000 - 10:34:39 EST


On Mon, 27 Mar 2000, Jesse Pollard wrote:

> Marco Colombo <marco@esi.it>
> > On Sun, 26 Mar 2000, Linda Walsh wrote:
> >
> > > David Whysong wrote:
> > > > If you run out of a resource, the system should not crash. The kernel just
> > > > has to free up the resource. A convenient way of doing that is to kill a
> > > > user process.
> > > ---
> > > Then you have violated the integrity of the user-process space.
> > > Tell me, which processes are killed when the system runs out file descriptors?
> > > How about processes? Disk space? Why are you treating memory differently?
> >
> > Because overcommitting swap space is possible and it works (higher
> > system throughput, on average case).
> >
> > Please explain how you'd implement 'overcommitting' of PIDs, or open files.
>
> Silly:
> pids - allocate one page to the pid table - tell kernel there
> are an infinite number of pids, but only allocate one page -
> why allocate more, they don't all get used....
> files - don't allocate a file table, do just like above with pids...
>
> BTW, pids are overcommited. Originally they were equivalent to "slot". The
> above discription for pids also applies to slots.

You misunderstand what names and objects are (at least what i mean...).
PID "numbers" are (in a way) overcommitted. PID 30000 is valid does not
mean there may be 30000 processes on the system.
Valid PIDs are never 'overcommitted': pid = fork() always returns
a pid that is valid, and points to an "object" (process) which has all
necessary resources allocated to. You never get a pid, with the kernel
NOT creating the object until you access the pid (COW = clone on write?).

With 31bit PIDs, the are 32768 *names* for processes. No relation with
NPROC (or whatever you call maximum number of system processes).

A virtual address is just a name. brk() gives you more names. No more,
no less. Objects (page-frames, disk blocks) are allocated *later*,
when you refer to them, depending on how you're referring to them. On
a COW page, a read access won't allocate any object (page-frame).

> > And BTW, the system is not 'overcommitting' anything. The kernel is just
> > extending VA space of a process. It's giving you more "names". It's not
> > giving you any object. VA space is not "memory".
>
> To the user porcess it is memory. They are pages of real storage locations.
> NOT the same as "names".

Addresses *are* names. By definition. Virtual addresses point to "pages",
which in turn are logical objects. Some real "storage" (a page-frame or
a disk block) may or may not be mapped to a page.

> > Offsets in a file are just "names". That's why you can write one block
> > at 1GB offset, without allocating 1GB of disk space. Is this overcommiting
> > of FS space? Just because the system lets you create a file which is
> > 1GB in *size* (but not in disk usage), it doesn't mean you have 1GB of
> > free disk space. And just because the system lets you malloc(1GB) of
> > address space, and even lets you successfully write at 1GB-1 offset,
> > it doesn't mean there's 1GB of free RAM or swap. Why are you treating
> > memory differently from disk space?
>
> It isn't being treated differently. A sparce file is still limited by
> the users disk quota (which measures what is actually used...) When the
> user exceeds quota the user is aborted. Just because one user reached quota
> limit, is no reason to abort all users of the disk, or even to abort a
> random user....

We're not discussing quotas here. You can have quotas with or without
swap overcommiting. And you can have quotas on 'names' or on 'objects'.
Not all namespaces are infinite. Given a limit of 3GB as address space,
you may be unable to access 8GB of RAM. You'll run out of 'names' before
running out of 'objects'.

> If there are no quotas on disk then the last process that attempts a write
> will get aborted - the random process... This is the same as for memory without
> quotas.
>
> Disk quotas serve the same purpose that memory quotas do. They provide
> for equitibly sharing access while providing security to the space assigned
> can be used by the user/process that it was assigned to.
>
> -------------------------------------------------------------------------
> Jesse I Pollard, II
> Email: pollard@navo.hpc.mil
>
> Any opinions expressed are solely my own.
>

.TM.

-- 
      ____/  ____/   /
     /      /       /			Marco Colombo
    ___/  ___  /   /		      Technical Manager
   /          /   /			 ESI s.r.l.
 _____/ _____/  _/		       Colombo@ESI.it

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Mar 31 2000 - 21:00:19 EST