Re: [Question - ARM CCA] vCPU Hotplug Support in ARM Realm world might require ARM spec change?

From: Suzuki K Poulose
Date: Tue Jul 25 2023 - 07:20:09 EST


Hi Salil

On 25/07/2023 01:05, Salil Mehta wrote:
Hi Suzuki,
Sorry for the late reply; I was in and out last week for some medical tests.


From: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
Sent: Monday, July 24, 2023 5:27 PM

Hi Salil

On 19/07/2023 10:28, Suzuki K Poulose wrote:
Hi Salil

Thanks for raising this.

On 19/07/2023 03:35, Salil Mehta wrote:
[Reposting it here from Linaro Open Discussion List for more eyes to
look at]

Hello,
I have recently started to dabble with ARM CCA and to check whether our
recent changes to support vCPU Hotplug on ARM64 can work in the realm
world. I have realized that in the RMM specification [1] the PSCI_CPU_ON
command (B5.3.3) does not handle the PSCI_DENIED return code (B5.4.2)
from the host. This might be required to support the vCPU Hotplug feature
in the realm world in future. vCPU Hotplug is an important feature to
support kata-containers in the realm world, as it reduces the VM boot time
and facilitates dynamic adjustment of vCPUs (which I think should be
true even in the realm world, as the current implementation only makes use
of PSCI_ON/OFF to realize the Hotplug-like effect?)


As per our recent changes [2], [3] related to support vCPU Hotplug on
ARM64, we handle the guest exits due to SMC/HVC Hypercall in the
user-space i.e. VMM/Qemu. In realm world, REC Exits to host due to
PSCI_CPU_ON should undergo similar policy checks and I think,

1. Host should *refuse* to online the target vCPUs which are NOT plugged
2. This means the target REC should be denied by the host. Can the host call
    RMI_PSCI_COMPLETE in such a case?
3. The *return* value (B5.3.3.1.3 Output values) should be PSCI_DENIED

The Realm exit with EXIT_PSCI already provides the parameters passed
onto the PSCI request. This happens for all PSCI calls except
(PSCI_VERSION and PSCI_FEATURES). The hyp could forward these exits to
the VMM and could invoke the RMI_PSCI_COMPLETE only when the VMM blesses
the request (wherever applicable).
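
A minimal sketch of that forwarding decision, in C, purely for illustration. The function IDs and return codes are the values defined by the PSCI specification; the `vcpu_slot` structure and all helper names are hypothetical and do not come from KVM, QEMU or the RMM. The exit rule encoded in `psci_call_exits_to_host()` simply mirrors the statement above, and the status computed at the end is what the host would *want* to report for PSCI_CPU_ON.

```c
#include <stdbool.h>
#include <stdint.h>

/* PSCI function IDs as defined by the PSCI specification. */
#define PSCI_VERSION    0x84000000u
#define PSCI_CPU_ON_32  0x84000003u
#define PSCI_CPU_ON_64  0xC4000003u
#define PSCI_FEATURES   0x8400000Au

/* PSCI return codes as defined by the PSCI specification. */
#define PSCI_SUCCESS    0
#define PSCI_DENIED     (-3)

/* Assumption taken from the text above: every PSCI call except
 * PSCI_VERSION and PSCI_FEATURES results in an exit to the host. */
static bool psci_call_exits_to_host(uint32_t fid)
{
    return fid != PSCI_VERSION && fid != PSCI_FEATURES;
}

/* Hypothetical VMM-side bookkeeping: one slot per possible vCPU. */
struct vcpu_slot {
    bool plugged;
};

/* Hypothetical VMM policy: only plugged vCPUs may be brought online. */
static bool vmm_blesses_cpu_on(const struct vcpu_slot *slots,
                               unsigned int nr_slots, unsigned int target)
{
    return target < nr_slots && slots[target].plugged;
}

/* Status the host would like to hand back for a PSCI_CPU_ON request. */
static int host_psci_cpu_on_status(const struct vcpu_slot *slots,
                                   unsigned int nr_slots, unsigned int target)
{
    return vmm_blesses_cpu_on(slots, nr_slots, target) ? PSCI_SUCCESS
                                                       : PSCI_DENIED;
}
```
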

However, the RMM spec currently doesn't allow denying the request.
i.e., without RMI_PSCI_COMPLETE, the REC cannot be scheduled back in.
We will address this in the RMM spec and get back to you.

This is now resolved in RMMv1.0-eac3 spec, available here [0].

This allows the host to DENY a PSCI_CPU_ON request. The RMM ensures that
the response doesn't violate the security guarantees by checking the
state of the target REC.
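
To make the idea concrete, here is a rough C sketch of how a monitor might combine the host's answer with the state of the target REC. This is an illustrative rule only, not the definition from the spec and not taken from any RMM implementation: the `rec` structure, the function names and the exact set of accepted status values are all assumptions. Only the PSCI return code values come from the PSCI specification.

```c
#include <stdbool.h>

/* PSCI return codes as defined by the PSCI specification. */
#define PSCI_SUCCESS     0
#define PSCI_DENIED      (-3)
#define PSCI_ALREADY_ON  (-4)

enum rec_runnable { NOT_RUNNABLE, RUNNABLE };

/* Hypothetical, heavily simplified view of a REC. */
struct rec {
    enum rec_runnable runnable;
};

/* Assumed rule: for PSCI_CPU_ON the host may only answer
 * PSCI_SUCCESS or PSCI_DENIED. */
static bool cpu_on_status_permitted(int host_status)
{
    return host_status == PSCI_SUCCESS || host_status == PSCI_DENIED;
}

/* Returns false if the host's status is rejected outright. Otherwise
 * stores in *realm_result the value the calling REC would observe. */
static bool complete_cpu_on(struct rec *target, int host_status,
                            int *realm_result)
{
    if (!cpu_on_status_permitted(host_status))
        return false;

    if (target->runnable == RUNNABLE) {
        /* The target's state takes precedence over the host's answer. */
        *realm_result = PSCI_ALREADY_ON;
        return true;
    }

    if (host_status == PSCI_DENIED) {
        /* Denied: the target stays not runnable. */
        *realm_result = PSCI_DENIED;
        return true;
    }

    target->runnable = RUNNABLE;
    *realm_result = PSCI_SUCCESS;
    return true;
}
```

In this sketch a denial never changes the target REC, so the host cannot use it to alter Realm state; it can only decline to bring the vCPU online.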

[0] https://developer.arm.com/documentation/den0137/latest/


Many thanks for taking this up proactively and getting it done so
efficiently. Really appreciate this!

I acknowledge the following new changes in the newly released RMM
Specification [3] (Page-2) (Release Information 1.0-eac3 20-07-2023):

1. Addition of B2.19 PsciReturnCodePermitted function [3] (Page-126)
2. Addition of 'status' in B3.3.7.2 Failure conditions of the
B3.3.7 RMI_PSCI_COMPLETE command [3] (Page-160)


Some Further Suggestions:
1. It would be really helpful if PSCI_DENIED can be accommodated somewhere
in the flow diagram (D1.4.1 PSCI_CPU_ON flow) [3] (Page-297) as well.

Good point, yes, will get that added.

2. You would also need changes to handle the PSCI_DENIED return value
in the patch below [2] from the ARM CCA series [1]


Of course. Please note that the series [1] is based on RMMv1.0-beta0 and
we are in the process of rebasing our changes to v1.0-eac3, which
includes a lot of other changes. The updated series will have all the
required changes.

Kind regards
Suzuki



@James, Any further thoughts on this?


References:
[1] [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM
[2] [RFC PATCH 19/28] KVM: arm64: Validate register access for a Realm VM
https://lore.kernel.org/lkml/20230127112248.136810-1-suzuki.poulose@xxxxxxx/T/#m6c10b9a27c4a724967c1800facacaa9443b38b4c
[3] ARM Realm Management Monitor specification(DEN0137 1.0-eac3)
https://developer.arm.com/documentation/den0137/latest/

Thanks
Salil.


4. Failure condition (B5.3.3.2) should be amended with
    runnable pre: target_rec.flags.runnable == NOT_RUNNABLE (?)
             post: result == PSCI_DENIED (?)
5. Change would also be required in the flow (D1.4 PSCI flows) depicting
    PSCI_CPU_ON flow (D1.4.1)

I do understand that ARM CCA support is in its infancy and that
discussing vCPU Hotplug in the realm world may seem far-fetched
right now. But specification changes require a lot of time, and if
this change is really required then it should be further discussed
within ARM.

Many thanks!


Best regards
Salil


References:

[1] https://developer.arm.com/documentation/den0137/latest/
[2] https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v1-port11052023.dev-1
[3] https://git.gitlab.arm.com/linux-arm/linux-jm.git virtual_cpu_hotplug/rfc/v2