Re: Use CPUID to communicate with the hypervisor.

From: H. Peter Anvin
Date: Fri Sep 26 2008 - 21:36:49 EST


Jeremy Fitzhardinge wrote:

> I'm sympathetic to the idea, but it seems a bit under-defined.
>
> Are you leaving a gap between 0x40000000 and -10 for what? Future
> extension? Avoiding existing hypervisor-specific leaves?
>
> I think there's a move towards doing a scan for a signature, such as
> checking every 16 leaves after 0x40000000 for "a while" looking for
> interesting signatures, so that a hypervisor can support multiple ABIs
> at once. Given this, it would be better to define a "Generic Hypervisor
> ABI" signature, and put all the related leaves together.


That's kind of iffy, although at least it does have a modicum of control.

There is already a de facto standard for doing this: on a (currently) 64K boundary, add a leaf with a vendor ID and a limit; the presence is detectable by the limit in EAX having the proper upper bits.

Then have each vendor pick a range that they maintain. Intel uses 0x0000xxxx (although they claim control of the entire numberspace), AMD uses 0x8000xxxx, VIA uses 0xC000xxxx, Transmeta used 0x8086xxxx, and 0x4000xxxx is being reserved for "virtualization". There are tools which use this as a way to try to dump all of CPUID without knowing details.
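The "proper upper bits" test above can be sketched in C roughly as follows. This is only an illustration: the function and macro names are made up here, and the function takes the EAX value as an argument rather than executing CPUID itself, so it stays portable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names, not taken from any existing header. */
#define CPUID_RANGE_MASK 0xffff0000u   /* upper 16 bits select the 64K range */
#define CPUID_BASE_HV    0x40000000u   /* range reserved for "virtualization" */

/*
 * Return true if `eax`, as read from CPUID leaf `base`, looks like a
 * limit ("maximum leaf") that lies inside the 64K range starting at `base`.
 */
static bool cpuid_range_present(uint32_t base, uint32_t eax)
{
    assert((base & ~CPUID_RANGE_MASK) == 0);   /* base must be 64K-aligned */
    return (eax & CPUID_RANGE_MASK) == base;
}
```

A generic dumping tool would call this once per candidate base (0x00000000, 0x40000000, 0x80000000, ...) and walk the leaves up to the returned limit only when the check passes.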

See the problem here? This is in effect an unmanaged space. This means that without the vendor ID it is going to be meaningless, unless at least the major players in the virtualization industry could agree with how to use it, and that would still leave other users out in the cold.

Now, that would still require a vendor numberspace registry. The obvious one is to use the numbers issued by PCI-SIG, which would require 16 bits -- that would presumably mean numbers of the form 0x40SSSSxx with SSSS being the vendor ID; this would require scanning on a 256-leaf granularity for a generic tool.
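For concreteness, the 0x40SSSSxx numbering could be computed like this. It is a sketch only: the helper names are hypothetical, and nothing here is an agreed-upon ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: first leaf of a vendor's block under 0x40SSSSxx. */
static uint32_t hv_vendor_base_leaf(uint16_t pci_vendor_id)
{
    return 0x40000000u | ((uint32_t)pci_vendor_id << 8);
}

/* Hypothetical helper: recover the SSSS field from a leaf in 0x40xxxxxx. */
static uint16_t hv_leaf_vendor_id(uint32_t leaf)
{
    assert((leaf >> 24) == 0x40);   /* only meaningful inside 0x40xxxxxx */
    return (uint16_t)((leaf >> 8) & 0xffffu);
}
```

With the PCI vendor ID 0x8086, for example, the block would start at leaf 0x40808600 and leave 256 leaves (xx = 0x00..0xff) for that vendor.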

Overall, though, *any* generic solution requires buy-in from all significant players in the space, *AND* a way to distinguish noncompliant implementations. Designing a functional solution is the easy part of that[*]. Getting sufficient buy-in is the hard part.

> And then, rather than having a simple "maximum leaf", it would be better
> to have cap bits for each specific feature. For example, how would the
> "RESERVED" registers in "Timing information" ever get used? How would
> you know that they were no longer reserved, but now meaningful?

Typically you'd define them to be zero unless usable, and define them so that a meaningful value would be nonzero.

> That said, I'm a bit worried about the whole idea of having these kinds
> of timing parameters. It does assume that they're constant for the
> whole life of the VM. What if they change due to power management or
> migration?

Presumably you'd have to have some way to notify the VM, via an interrupt of some sort.

-hpa

[*] Consider the following totally half-baked example:

CPUID leaf 0x40000000
    ECX-EDX-EBX     Vendor name
    EAX             Max CPUID level supported

    Motivation: existing practice

CPUID leaf 0x40000001...
    EAX     leaf number     Pointer
    ECX     DID:VID         PCI-style
    EDX     0xcc06ab0b      Magic number
    EBX     0x7ab3857a      Magic number

This would use the PCI vendor ID and an arbitrary "device ID"
to point to a leaf number, which would then contain information
starting with an identification/count leaf. The DID:VID would
signal who defined the specification, not necessarily who wrote
the hypervisor. This is similar to how Intel uses AMD-defined
CPUID levels, for example.
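
A guest-side decoder for such a pointer leaf might look like the sketch below. The struct and function names are invented for illustration, and the split of ECX into DID (upper 16 bits) and VID (lower 16 bits) is an assumption based on how PCI configuration space lays out the two IDs; the example above only says "PCI-style".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Magic numbers from the half-baked example above. */
#define HV_PTR_MAGIC_EDX 0xcc06ab0bu
#define HV_PTR_MAGIC_EBX 0x7ab3857au

/* Hypothetical decoded form of one pointer leaf. */
struct hv_abi_pointer {
    uint32_t leaf;       /* EAX: leaf number where the ABI's own leaves start */
    uint16_t device_id;  /* assumed: ECX bits 31..16 */
    uint16_t vendor_id;  /* assumed: ECX bits 15..0 (PCI vendor ID) */
};

/*
 * Decode the registers read from a leaf at 0x40000001 or later.
 * Returns false if the magic numbers do not match, i.e. the leaf is
 * not a pointer leaf in this scheme.
 */
static bool hv_decode_pointer(uint32_t eax, uint32_t ebx, uint32_t ecx,
                              uint32_t edx, struct hv_abi_pointer *out)
{
    assert(out != NULL);
    if (edx != HV_PTR_MAGIC_EDX || ebx != HV_PTR_MAGIC_EBX)
        return false;
    out->leaf = eax;
    out->device_id = (uint16_t)(ecx >> 16);
    out->vendor_id = (uint16_t)(ecx & 0xffffu);
    return true;
}
```

The magic-number check is what would let a guest tell a compliant hypervisor from one that happens to use the same leaves for something else.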

-hpa