Re: [PATCH] Enhance perf to collect KVM guest os statistics from host side

From: Anthony Liguori
Date: Tue Mar 16 2010 - 19:04:28 EST


On 03/16/2010 01:28 PM, Ingo Molnar wrote:
* Anthony Liguori<aliguori@xxxxxxxxxxxxxxxxxx> wrote:

On 03/16/2010 12:52 PM, Ingo Molnar wrote:
* Anthony Liguori<aliguori@xxxxxxxxxxxxxxxxxx> wrote:

On 03/16/2010 10:52 AM, Ingo Molnar wrote:
You are quite mistaken: KVM isnt really a 'random unprivileged application' in
this context, it is clearly an extension of system/kernel services.

( Which can be seen from the simple fact that what started the discussion was
'how do we get /proc/kallsyms from the guest'. I.e. an extension of the
existing host-space /proc/kallsyms was desired. )
Random tools (like perf) should not be able to do what you describe. It's a
security nightmare.
A security nightmare exactly how? Mind to go into details as i dont understand
your point.
Assume you're using SELinux to implement mandatory access control.
How do you label this file system?

Generally speaking, we don't know the difference between /proc/kallsyms vs.
/dev/mem if we do generic passthrough. While it might be safe to have a
relaxed label of kallsyms (since it's read only), it's clearly not safe to
do that for /dev/mem, /etc/shadow, or any file containing sensitive
information.
What's your _point_? Please outline a threat model, a vector of attack,
_anything_ that substantiates your "it's a security nightmare" claim.

You suggested "to have a (read only) mount of all guest filesystems".

As I described earlier, not all of the information within the guest filesystem has the same level of sensitivity. If you expose a generic interface like this, it becomes very difficult to delegate privileges.

Delegating privileges is important because, in a higher-security environment, you may want to prevent a management tool from accessing the VM's disk directly but still allow it to do basic operations (in particular, to view performance statistics).

Rather, we ought to expose a higher level interface that we have more
confidence in with respect to understanding the ramifications of exposing
that guest data.
Exactly, we want something that has a flexible namespace and works well with
Linux tools in general. Preferably that namespace should be human readable,
and it should be hierarchic, and it should have a well-known permission model.

This concept exists in Linux and is generally called a 'filesystem'.

If you want to use a synthetic filesystem as the management interface for qemu, that's one thing. But you suggested exposing the guest filesystem in its entirety, and that's what I disagreed with.

If a user cannot read the image file then the user has no access to its
contents via other namespaces either. That is, of course, a basic security
aspect.

( That is perfectly true with a non-SELinux Unix permission model as well, and
is true in the SELinux case as well. )

I don't think that's reasonable at all. The guest may encrypt its disk image. It still ought to be possible to run perf against that guest, no?

Erm. Please explain to me, what exactly is 'not that simple' in a MAC
environment?

Also, i'd like to note that the 'restrictive SELinux setups' usecases are
pretty secondary.

To demonstrate that, i'd like every KVM developer on this list who reads this
mail and who has their home development system where they produce their
patches set up in a restrictive MAC environment, in that you cannot even read
the images you are using, to chime in with a "I'm doing that" reply.

My home system doesn't run SELinux but I work daily with systems that are using SELinux.

I want to be able to run tools like perf on these systems because ultimately, I need to debug these systems on a daily basis.

But that's missing the point. We want to have an interface that works for both cases so that we're not maintaining two separate interfaces.

We've rat-holed a bit, though. You want:

1) to run perf kvm list and be able to enumerate KVM guests

2) for this to Just Work with qemu guests launched from the command line

You could achieve (1) by tying perf to libvirt but that won't work for (2). There are a few practical problems with (2).

qemu does not require the user to associate any uniquely identifying information with a VM. We've also optimized the command line use case so that if all you want to do is run a disk image, you just execute "qemu foo.img". To satisfy your use case, we would either have to force a user to always specify unique information, which would be less convenient for our users, or we would have to let the name be an optional parameter.

As it turns out, we already support "qemu -name Fedora foo.img". What we don't do today, but I've been suggesting we should, is automatically create a QMP management socket in a well known location based on the -name parameter when it's specified. That would let a tool like perf Just Work provided that a user specified -name.
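
To make the idea concrete, here is a minimal sketch in Python of how a tool might enumerate guests if qemu created one management socket per -name in a well-known directory. Everything specific here is an assumption for illustration: the directory is passed in by the caller because no location has been agreed on, the "<name>.sock" naming scheme is made up, and socket_path() and list_guests() are hypothetical helpers, not part of qemu or perf.

```python
import os
import stat


def socket_path(name, base_dir):
    """Map a guest name (as given to -name) to a socket path.

    Hypothetical naming scheme: <base_dir>/<name>.sock
    """
    # Refuse names that would escape base_dir or be ambiguous.
    if not name or name in (".", "..") or "/" in name:
        raise ValueError("unusable guest name: %r" % (name,))
    return os.path.join(base_dir, name + ".sock")


def list_guests(base_dir):
    """Return the names of guests that have a socket in base_dir."""
    names = []
    for entry in sorted(os.listdir(base_dir)):
        if not entry.endswith(".sock"):
            continue
        path = os.path.join(base_dir, entry)
        # Only count real UNIX sockets; skip stray regular files.
        if stat.S_ISSOCK(os.stat(path).st_mode):
            names.append(entry[:-len(".sock")])
    return names
```

With something like this, "perf kvm list" would reduce to calling list_guests() on the agreed directory, and access to each guest could be delegated through ordinary permissions or labels on its socket file rather than on the disk image. A guest started without -name would simply not appear.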

No one uses -name today though and I'm sure you don't either.

The only way to really address this is to change the interaction. Instead of running perf externally to qemu, we should support a perf command in the qemu monitor that can then tie directly to the perf tooling. That gives us the best possible user experience.

We can't do that though unless perf is a library or is in some way more programmatic.

Regards,

Anthony Liguori
