Re: Avoiding OOM on overcommit...?

From: Marco Colombo (marco@esi.it)
Date: Wed Mar 22 2000 - 07:07:25 EST


On Tue, 21 Mar 2000, Jesse Pollard wrote:

> Marco Colombo <marco@esi.it>:
> > On Mon, 20 Mar 2000, Horst von Brand wrote:
> >
> > > Jesse Pollard <pollard@tomcat.admin.navo.hpc.mil> said:
> > >
> > > [...]
> > >
> > > > Without overcommit:
> > > > You can support 100 simultaneous connections, with full saturation of
> > > > each server, with no failures.
> > > >
> > > > Result: happy customers, happy management, maybe even a raise.
> > >
> > > Management nagging about supporting more pages, worried by waste of several
> > > Gb of disk that has never, ever been touched. System is sluggish, needs
> > > constant tweaking of "resource allocation quotas" as applications crash,
> > > even there are resources available. Seriously consider firing inept
> > > sysadmin.
> >
> > Then, the new sysadm turns overcommiting on, and suddenly you realize
> > that you can support 300 simultaneous connections, with REAL full saturation
> > of each server, with no failure. 100% agreed.
>
> And the system crashes just after a flood of DoS connections hit, corrupting the
> accounting database, and 3 days of downtime rebuilding the accounting....
> And hoping nobody stole the credit card file while the system could not perform
> auditing because it got aborted...

Explain what overcommiting has to do with avoiding DoS. Under flood,
your (non overcommiting) system won't serve any legitimate request
(e.g. won't spawn other servers) because of OOS (just like mine).
Unless you're actively protecting against floods, the DoS succeeds.
And if you implement some kind of defence against DoS, I can do also.
The only difference is that you're serving 100 requests, me 300.

If your database leaves behind corrupted data, upgrade to a real
transaction-based system. You should do it anyway.

If someone stole your credit card file, accounting won't help much.
Hire some security expert.

Don't mix things up. Overcommitting systems do not crash (because of
overcommitting). Random crashes are bad, sure. Random crashes on critical
system with no backup, no real database engine, no security help save
the World from clueless sysadmins. This has nothing to do with
overcommitting.

AFAIK, system crashes (with Rik's OMM killer) because of init being
killed are impossible.

> > Either the system is under control, or it's not. A perfectly tuned system
> > never goes OOM or OOS. A badly configured system misperforms no matter of
> > overcommitting being on or off.
>
> YUP. absolutely. Although "perfectly tuned" assumes "perfect users" too.

Againg you're mixing things up. You're right: perfectly tuned ==>
perfect users. Do you run your critical accounting system on the same
Unix box you use to give free Unix accounts to random users?
You shouldn't. And even if you do, do you *really* think that an
overcommitting kernel is your *major* problem? No matter how many different
kind of quotas you set up, I'll never call that a "perfectly tuned" system.

Run your accounting database on a dedicated system. One user only. You.
The more perfect user in the World. Never disagrees with sysadm. B-)
"Connections" are not "users", BTW. If you run a server which has no
means to control its own behaviour, blame the programmer.
And, on the server example, i'd even turn swap off. The system should
not page at all. If it *starts* paging, you're already in trouble.
On high performance systems, memory accesses should be just that.
You don't want to pay the overload of a page-fault. Your server should be
carefully tuned to avoid paging at all. So no OMM.

>
> -------------------------------------------------------------------------
> Jesse I Pollard, II
> Email: pollard@navo.hpc.mil
>
> Any opinions expressed are solely my own.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/
>

.TM.

-- 
      ____/  ____/   /
     /      /       /			Marco Colombo
    ___/  ___  /   /		      Technical Manager
   /          /   /			 ESI s.r.l.
 _____/ _____/  _/		       Colombo@ESI.it

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:36 EST