Re: How much of a mess does OpenVZ make? ;) Was: What can OpenVZ do?

From: Ingo Molnar
Date: Sat Mar 14 2009 - 04:27:09 EST



* Alexey Dobriyan <adobriyan@xxxxxxxxx> wrote:

> On Fri, Mar 13, 2009 at 02:01:50PM -0700, Linus Torvalds wrote:
> >
> >
> > On Fri, 13 Mar 2009, Alexey Dobriyan wrote:
> > > >
> > > > Let's face it, we're not going to _ever_ checkpoint any
> > > > kind of general case process. Just TCP makes that
> > > > fundamentally impossible in the general case, and there
> > > > are lots and lots of other cases too (just something as
> > > > totally _trivial_ as all the files in the filesystem
> > > > that don't get rolled back).
> > >
> > > What do you mean here? Unlinked files?
> >
> > Or modified files, or anything else. "External state" is a
> > pretty damn wide net. It's not just TCP sequence numbers and
> > another machine.
>
> I think (I think) you're seriously underestimating what's
> doable with kernel C/R and what's already done.
>
> I was told (haven't seen it myself) that Oracle installations
> and Counter Strike servers were moved between boxes just fine.
>
> They were run in specially prepared environment of course, but
> still.

That's the kind of stuff i'd like to see happen.

Right now the main 'enterprise' approach to
migration/consolidation of server contexts is based on hardware
virtualization - but that pushes runtime overhead to the native
kernel and slows down the guest context as well - massively so.

Before we've blinked twice it will be a 'required' enterprise
feature and enterprise people will measure/benchmark Linux
server performance in guest context primarily and we'll have a
deep performance pit to dig ourselves out of.

We can ignore that trend as uninteresting (it is uninteresting
in a number of ways because it is partly driven by stupidity),
or we can do something about it while still advancing the
kernel.

With containers+checkpointing the code is a lot scarier (we
basically do system call virtualization), the environment
interactions are a lot wider and thus they are a lot more
difficult to handle - but it's all a lot faster as well, and
conceptually so. All the runtime overhead is pushed to the
checkpointing step (with some minimal amount of data structure
isolation overhead).

I see three conceptual levels of virtualization:

- hardware based virtualization, for 'unaware OSs'

- system call based virtualization, for 'unaware software'

- no virtualization kernel help is needed _at all_ to
  checkpoint 'aware' software. We have libraries to checkpoint
  'aware' user-space just fine - and had them for a decade.
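
To make the third level concrete, here is a minimal sketch of what
'aware' software can look like. It is only an illustration: the
Worker class, its fields and the run_demo() helper are all
hypothetical, and pickle stands in for whatever checkpointing
library a real application would use. The application saves its
own in-memory state to a file and later resumes from it, with no
kernel help beyond ordinary file I/O.

```python
import os
import pickle
import tempfile


class Worker:
    """Hypothetical 'aware' application: it knows how to save and restore itself."""

    def __init__(self):
        self.processed = 0
        self.total = 0

    def step(self, value):
        self.processed += 1
        self.total += value

    def checkpoint(self, path):
        # Write to a temporary file in the same directory, then rename it
        # into place, so a crash mid-write never leaves a truncated checkpoint.
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "wb") as f:
            pickle.dump({"processed": self.processed, "total": self.total}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)

    @classmethod
    def restore(cls, path):
        with open(path, "rb") as f:
            state = pickle.load(f)
        worker = cls()
        worker.processed = state["processed"]
        worker.total = state["total"]
        return worker


def run_demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "worker.ckpt")
        worker = Worker()
        for value in (1, 2, 3):
            worker.step(value)
        worker.checkpoint(path)
        # Simulate a restart or a move to another box: drop the live
        # object and continue from the checkpoint file alone.
        del worker
        resumed = Worker.restore(path)
        resumed.step(4)
        return resumed.processed, resumed.total
```

Note that this captures only the state the application chooses to
save. Open sockets and files modified since the checkpoint are not
rolled back, which is the "external state" problem raised earlier
in the thread.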

Ingo