Re: [RFC] Unify KVM kernel-space and user-space code into a single project

From: Avi Kivity
Date: Mon Mar 22 2010 - 02:36:39 EST


On 03/21/2010 11:20 PM, Ingo Molnar wrote:
* Avi Kivity <avi@xxxxxxxxxx> wrote:

Well, for what it's worth, I rarely ever use anything else. My virtual
disks are raw so I can loop mount them easily, and I can also switch my
guest kernels from outside... without ever needing to mount those disks.
Curious, what do you use them for?

btw, if you build your kernel outside the guest, then you already have
access to all its symbols, without needing anything further.
There are two errors with your argument:

1) you are assuming that it's only about kernel symbols

Look at this 'perf report' output:

# Samples: 7127509216
#
# Overhead  Command  Shared Object      Symbol
# ........  .......  .................  ......
#
    19.14%  git      git                [.] lookup_object
    15.16%  perf     git                [.] lookup_object
     4.74%  perf     libz.so.1.2.3      [.] inflate
     4.52%  git      libz.so.1.2.3      [.] inflate
     4.21%  perf     libz.so.1.2.3      [.] inflate_table
     3.94%  git      libz.so.1.2.3      [.] inflate_table
     3.29%  git      git                [.] find_pack_entry_one
     3.24%  git      libz.so.1.2.3      [.] inflate_fast
     2.96%  perf     libz.so.1.2.3      [.] inflate_fast
     2.96%  git      git                [.] decode_tree_entry
     2.80%  perf     libc-2.11.90.so    [.] __strlen_sse42
     2.56%  git      libc-2.11.90.so    [.] __strlen_sse42
     1.98%  perf     libc-2.11.90.so    [.] __GI_memcpy
     1.71%  perf     git                [.] decode_tree_entry
     1.53%  git      libc-2.11.90.so    [.] __GI_memcpy
     1.48%  git      git                [.] lookup_blob
     1.30%  git      git                [.] process_tree
     1.30%  perf     git                [.] process_tree
     0.90%  perf     git                [.] tree_entry
     0.82%  perf     git                [.] lookup_blob
     0.78%  git      [kernel.kallsyms]  [k] kstat_irqs_cpu

Kernel symbols are only a small portion of the symbols (a single line in this
case).

To get to those other symbols we have to read the ELF symbols of those
binaries in the guest filesystem, in the post-processing/reporting phase. This
is both complex to do and relatively slow, so we don't want to (and cannot) do
this at sample time from IRQ context or NMI context ...
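
A minimal sketch of the post-processing lookup described above, assuming the
symbols of one binary have already been read into (start address, size, name)
tuples. The table contents and the resolve() helper are hypothetical, chosen
only for illustration; this is not perf's actual implementation.

```python
import bisect

# Hypothetical symbol table for one binary: (start_address, size, name).
# Real tools would fill this from the binary's ELF symbol table.
SYMBOLS = sorted([
    (0x1000, 0x40, "lookup_object"),
    (0x1040, 0x80, "find_pack_entry_one"),
    (0x10c0, 0x30, "decode_tree_entry"),
])
STARTS = [start for start, _size, _name in SYMBOLS]


def resolve(addr):
    """Map a sampled address to a symbol name, or None if nothing covers it."""
    # Find the last symbol whose start address is <= addr.
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None
    start, size, name = SYMBOLS[i]
    # The address must also fall inside that symbol's extent.
    return name if addr < start + size else None
```

The sample itself only needs to record a raw address; the name lookup can run
later, outside IRQ or NMI context, once the symbol table is available.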

Okay. So a symbol server is necessary. Still, I don't think -kernel is a good reason for including the symbol server in the kernel itself. If someone uses it extensively together with perf, _and_ they can't put the symbol server in the guest for some reason, let them patch mkinitrd to include it.

Also, many aspects of reporting are interactive so it's done lazily or
on-demand. So we need ready access to the guest filesystem - for those guests
which decide to integrate with the host for this.
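
The lazy, on-demand behaviour mentioned above could be sketched as a small
cache that reads a binary's symbols only the first time a report touches it.
The LazySymbolCache class and its loader argument are hypothetical names used
for illustration, not part of any existing tool.

```python
class LazySymbolCache:
    """Load the symbol table of a binary only the first time it is needed."""

    def __init__(self, loader):
        # loader is a hypothetical callable: path -> symbol table.
        self._loader = loader
        self._tables = {}
        self.loads = 0  # number of times the loader actually ran

    def table_for(self, path):
        # Read the binary's symbols on first use, then reuse the cached copy.
        if path not in self._tables:
            self._tables[path] = self._loader(path)
            self.loads += 1
        return self._tables[path]
```

With such a cache, binaries that never show up in the part of the report the
user looks at are never read at all.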

2) the 'SystemTap mistake'

You are assuming that the symbols of the kernel when it got built got saved
properly and are discoverable easily. In reality those symbols can be erased
by a make clean, can be modified by a new build, can be misplaced and can
generally be hard to find because each distro puts them in a different
installation path.

My 10+ years of experience with kernel instrumentation solutions is that
kernel-driven, self-sufficient, robust, trustable, well-enumerated sources of
information work far better in practice.

What about line number information? And the source? Into the kernel with them as well?


The thing is, in this thread I'm forced to repeat the same basic facts again
and again. Could you _PLEASE_, pretty please, when it comes to instrumentation
details, at least _read the mails_ of the guys who actually ... write and
maintain Linux instrumentation code? This is getting ridiculous really.

I've read every one of your emails. If I misunderstood or overlooked something, I apologize. The thread is very long and at times antagonistic so it's hard to keep all the details straight.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
