Re: [RFC] Unify KVM kernel-space and user-space code into a single project

From: Avi Kivity
Date: Mon Mar 22 2010 - 16:42:26 EST


On 03/22/2010 10:29 PM, Ingo Molnar wrote:
> * Avi Kivity<avi@xxxxxxxxxx> wrote:
>
>>> I think you didn't understand my point. I am talking about 'perf kvm top'
>>> hanging if Qemu hangs.
>> Use non-blocking I/O, report that guest as dead. No point in profiling it,
>> it isn't making any progress.
> Erm, at what point do I decide that a guest is 'dead' versus 'just lagged due
> to lots of IO'?

qemu shouldn't block due to I/O (it does now, but there is work to fix it). Of course it could be swapping or other things.

Pick a timeout, everything we do has timeouts these days. It's the price we pay for protection: if you put something where a failure can't hurt you, you have to be prepared for failure, and you might have false alarms.
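To make the "non-blocking I/O plus a timeout" idea concrete, here is a minimal sketch in Python. It is an illustration only, not how perf or qemu actually do it: the function names (`read_with_timeout`, `poll_guests`), the idea of one socket per guest, and the 0.5 second default are all assumptions made for the example.

```python
import select
import socket


def read_with_timeout(sock, timeout_s, max_bytes=4096):
    """Return bytes read from sock, or None if nothing arrives in timeout_s.

    Hypothetical helper: the socket stands in for whatever channel a
    profiler would use to ask a guest's userspace for information.
    """
    sock.setblocking(False)
    # select() waits at most timeout_s, so a hung peer cannot hang the caller.
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return None
    return sock.recv(max_bytes)


def poll_guests(guests, timeout_s=0.5):
    """Map each guest name to its reply, or to "dead" if it did not answer.

    guests is a dict of name -> connected socket (an assumption of this
    sketch). A guest that times out or has closed its end is reported as
    "dead" and the loop moves on to the next one.
    """
    status = {}
    for name, sock in guests.items():
        data = read_with_timeout(sock, timeout_s)
        status[name] = "dead" if not data else data.decode()
    return status
```

A slow-but-alive guest can be reported as dead here; that is the false alarm mentioned above, and the cost is only that one guest goes unprofiled until the next poll.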

Is it so horrible for 'perf kvm top'? No user data loss will happen, surely?

On the other hand, if it's in the kernel and it fails, you will lose service or perhaps data.

> Also, do you realize that you increase complexity (the use of non-blocking
> IO), just to protect against something that wouldn't happen if the right
> solution was used in the first place?

It's a tradeoff. Increasing the kernel code size vs. increasing userspace size.

>>> With a proper in-kernel enumeration the kernel would always guarantee the
>>> functionality, even if the vcpu does not make progress (i.e. it's "hung").
>>>
>>> With this implemented in Qemu we lose that kind of robustness guarantee.
>> If qemu has a bug in the resource enumeration code, you can't profile one
>> guest. If the kernel has a bug in the resource enumeration code, the system
>> either panics or needs to be rebooted later.
> This is really simple code, not rocket science. If there's a bug in it we'll
> fix it. On the other hand a 500KLOC+ piece of Qemu code has lots of places to
> hang, so that is a large cross section.


The kernel has tons of very simple code (and some very complex code as well), and tons of -stable updates as well.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
