Re: [PATCH v2 1/2] KVM: arm64: Add handler for MOPS exceptions

From: Kristina Martsenko
Date: Mon Oct 02 2023 - 10:06:52 EST


On 29/09/2023 10:23, Marc Zyngier wrote:
> On Wed, 27 Sep 2023 09:28:20 +0100,
> Oliver Upton <oliver.upton@xxxxxxxxx> wrote:
>>
>> On Mon, Sep 25, 2023 at 04:16:06PM +0100, Kristina Martsenko wrote:
>>
>> [...]
>>
>>>> What is the rationale for advancing the state machine? Shouldn't we
>>>> instead return to the guest and immediately get the SS exception,
>>>> which in turn gets reported to userspace? Is it because we rollback
>>>> the PC to a previous instruction?
>>>
>>> Yes, because we rollback the PC to the prologue instruction. We advance the
>>> state machine so that the SS exception is taken immediately upon returning to
>>> the guest at the prologue instruction. If we didn't advance it then we would
>>> return to the guest, execute the prologue instruction, and then take the SS
>>> exception on the middle instruction. Which would be surprising as userspace
>>> would see the middle and epilogue instructions executed multiple times but not
>>> the prologue.
>>
>> I agree with Kristina that taking the SS exception on the prologue is
>> likely the best course of action. Especially since it matches the
>> behavior of single-stepping an EL0 MOPS sequence with an intervening CPU
>> migration.
>>
>> This behavior might throw an EL1 that single-steps itself for a loop,
>> but I think it is impossible for a hypervisor to hide the consequences
>> of vCPU migration with MOPS in the first place.
>>
>> Marc, I'm guessing you were most concerned about the former case where
>> the VMM was debugging the guest. Is there something you're concerned
>> about I missed?
>
> My concern is not only the VMM, but any userspace that perform
> single-stepping. Imagine the debugger tracks PC by itself, and simply
> increments it by 4 on a non-branch, non-fault instruction.
>
> Move the vcpu or the userspace around, rewind PC, and now the debugger
> is out of whack with what is executing. While I agree that there is
> not much a hypervisor can do about that, I'm a bit worried that we are
> going to break existing SW with this.
>
> Now the obvious solution is "don't do that"...

If the debugger can handle the PC changing on branching or faulting
instructions, then why can't it handle it on MOPS instructions? Wouldn't
such a debugger need to be updated any time the architecture adds new
branching or faulting instructions? What's different here?

Confused,
Kristina