Re: [PATCH v9 09/17] x86/split_lock: Handle #AC exception for split lock

From: Paolo Bonzini
Date: Wed Oct 16 2019 - 10:08:21 EST


On 16/10/19 15:51, Xiaoyao Li wrote:
> On 10/16/2019 7:58 PM, Paolo Bonzini wrote:
>> On 16/10/19 13:49, Thomas Gleixner wrote:
>>> On Wed, 16 Oct 2019, Paolo Bonzini wrote:
>>>> Yes it does. But Sean's proposal, as I understand it, leads to the
>>>> guest receiving #AC when it wasn't expecting one. So for an old guest,
>>>> as soon as the guest kernel happens to do a split lock, it gets an
>>>> unexpected #AC and crashes and burns. And then, after much googling and
>>>> gnashing of teeth, people proceed to disable split lock detection.
>>>
>>> I don't think that this was what he suggested/intended.
>>
>> Xiaoyao's reply suggests that he also understood it like that.
>
> Actually, what I replied is a little different from what you stated
> above that guest won't receive #AC when it wasn't expecting one but the
> userspace receives this #AC.

Okay---but userspace has no choice but to crash the guest, which is okay
for debugging but, most likely, undesirable behavior in production.

>>> With your proposal you render #AC useless even on hosts which have SMT
>>> disabled, which is just wrong. There are enough good reasons to disable
>>> SMT.
>>
>> My lazy "solution" only applies to SMT enabled. When SMT is either not
>> supported, or disabled as in "nosmt=force", we can virtualize it like
>> the posted patches have done so far.
>
> Do we really need to divide it into two cases of SMT enabled and SMT
> disabled?

Yes, absolutely. With SMT disabled MSR_TEST_CTRL behaves sanely; with
SMT enabled it doesn't.

>> Yes, that's a valid alternative. But if SMT is possible, I think the
>> only sane possibilities are global disable and SIGBUS. SIGBUS (or
>> better, a new KVM_RUN exit code) can be acceptable for debugging
>> guests too.
>
> If SIGBUS, why need to globally disable?

SIGBUS (actually a new KVM_EXIT_INTERNAL_ERROR result from KVM_RUN is
better, but that's the idea) is for when you're debugging guests.
Global disable (or alternatively, disable SMT) is for production use.

> When there is an #AC due to split-lock in guest, KVM only has below two
> choices:
> 1) inject back into guest.
>    - If kvm advertise this feature to guest, and guest kernel is latest,
> and guest kernel must enable it too. It's the happy case that guest can
> handler it on its own purpose.
>    - Any other cases, guest get an unexpected #AC and crash.
> 2) report to userspace (I think the same like a SIGBUS)
>
> So for simplicity, we can do what Paolo suggested that don't advertise
> this feature and report #AC to userspace when an #AC due to split-lock
> in guest *but* we never disable the host's split-lock detection due to
> guest's split-lock.

This is one possibility, but it must be opt-in. Either you make split
lock detection opt-in in the host (and then a userspace exit is okay),
or you make split lock detection opt-in for KVM (and then #AC causes a
global disable of split-lock detection on the host).

Breaking all old guests with the default options is not a valid choice.
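
To make the cases concrete, here is a rough sketch of the decision as I
described it above. It is only an illustration: the enum, the function
and the "exit_opted_in" flag are names made up for this mail, not
anything from the posted patches or from KVM, and the SMT-off case is
just labeled rather than modeled.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; not from the posted patches. */
enum sld_action {
	SLD_VIRTUALIZE,        /* SMT off: virtualize as the posted patches do */
	SLD_EXIT_TO_USERSPACE, /* report the #AC via a KVM_RUN exit */
	SLD_DISABLE_ON_HOST,   /* globally disable split-lock detection */
};

/*
 * What to do when a guest hits a split-lock #AC.
 *
 * smt_possible:  SMT is supported and not disabled (e.g. no "nosmt=force").
 * exit_opted_in: assumed knob meaning that somebody explicitly asked for
 *                the userspace exit behavior; false is the default.
 */
static enum sld_action sld_guest_ac_action(bool smt_possible,
					   bool exit_opted_in)
{
	if (!smt_possible)
		return SLD_VIRTUALIZE;

	/* With the default options an old guest must not be killed. */
	return exit_opted_in ? SLD_EXIT_TO_USERSPACE : SLD_DISABLE_ON_HOST;
}
```

The point is the last line: with SMT possible and nobody having opted
in, the only outcome that keeps old guests running is the global
disable.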

Paolo