RE: [Question - ARM CCA] vCPU Hotplug Support in ARM Realm world might require ARM spec change?

From: Salil Mehta
Date: Thu Jul 27 2023 - 10:24:13 EST


Hi Suzuki,

> From: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> Sent: Tuesday, July 25, 2023 12:20 PM
>
> Hi Salil
>
> On 25/07/2023 01:05, Salil Mehta wrote:
> > Hi Suzuki,
> > Sorry for the late reply; I was in and out last week for some medical tests.
> >
> >> From: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> >> Sent: Monday, July 24, 2023 5:27 PM
> >>
> >> Hi Salil
> >>
> >> On 19/07/2023 10:28, Suzuki K Poulose wrote:
> >>> Hi Salil
> >>>
> >>> Thanks for raising this.
> >>>
> >>> On 19/07/2023 03:35, Salil Mehta wrote:
> >>>> [Reposting it here from Linaro Open Discussion List for more eyes to look at]
> >>>>
> >>>> Hello,
> >>>> I have recently started to dabble with ARM CCA stuff and check if our
> >>>> recent changes to support vCPU Hotplug in ARM64 can work in the realm
> >>>> world. I have realized that in the RMM specification[1] PSCI_CPU_ON
> >>>> command (B5.3.3) does not handle the PSCI_DENIED return code (B5.4.2)
> >>>> from the host. This might be required to support vCPU Hotplug feature
> >>>> in the realm world in future. vCPU Hotplug is an important feature to
> >>>> support kata-containers in realm world as it reduces the VM boot time
> >>>> and facilitates dynamic adjustment of vCPUs (which I think should be
> >>>> true even with Realm world as current implementation only makes use
> >>>> of the PSCI_ON/OFF to realize a Hotplug-like effect?)
> >>>>
> >>>>
> >>>> As per our recent changes [2], [3] related to support vCPU Hotplug on
> >>>> ARM64, we handle the guest exits due to SMC/HVC Hypercall in the
> >>>> user-space i.e. VMM/Qemu. In realm world, REC Exits to host due to
> >>>> PSCI_CPU_ON should undergo similar policy checks and I think,
> >>>>
> >>>> 1. Host should *refuse* to online target vCPUs which are NOT plugged
> >>>> 2. This means target REC should be denied by host. Can host call
> >>>>     RMI_PSCI_COMPLETE in such a case?
> >>>> 3. The *return* value (B5.3.3.1.3 Output values) should be PSCI_DENIED
> >>>
> >>> The Realm exit with EXIT_PSCI already provides the parameters passed
> >>> onto the PSCI request. This happens for all PSCI calls except
> >>> (PSCI_VERSION and PSCI_FEATURES). The hyp could forward these exits to
> >>> the VMM and could invoke the RMI_PSCI_COMPLETE only when the VMM blesses
> >>> the request (wherever applicable).
> >>>
> >>> However, the RMM spec currently doesn't allow denying the request.
> >>> i.e., without RMI_PSCI_COMPLETE, the REC cannot be scheduled back in.
> >>> We will address this in the RMM spec and get back to you.
> >>
> >> This is now resolved in RMMv1.0-eac3 spec, available here [0].
> >>
> >> This allows the host to DENY a PSCI_CPU_ON request. The RMM ensures that
> >> the response doesn't violate the security guarantees by checking the
> >> state of the target REC.
> >>
> >> [0] https://developer.arm.com/documentation/den0137/latest/
> >
> >
> > Many thanks for taking this up proactively and getting it done so
> > efficiently. Really appreciate this!
> >
> > I acknowledge the new changes below, which are part of the newly released
> > RMM Specification [3] (Page-2) (Release Information 1.0-eac3 20-07-2023):
> >
> > 1. Addition of B2.19 PsciReturnCodePermitted function [3] (Page-126)
> > 2. Addition of 'status' in B3.3.7.2 Failure conditions of the
> > B3.3.7 RMI_PSCI_COMPLETE command [3] (Page-160)
> >
> >
> > Some Further Suggestions:
> > 1. It would be really helpful if PSCI_DENIED can be accommodated somewhere
> > in the flow diagram (D1.4.1 PSCI_CPU_ON flow) [3] (Page-297) as well.
>
> Good point, yes, will get that added.


Great. Thanks!


> > 2. You would also need changes to handle the PSCI_DENIED return value
> > in the patch below [2] from the ARM CCA series [1]
> >
>
> Of course. Please note that the series [1] is based on RMMv1.0-beta0 and
> we are in the process of rebasing our changes to v1.0-eac3, which
> includes a lot of other changes. The updated series will have all the
> required changes.


Ok. When are you planning to post this new series with v1.0-eac3 changes?


Thanks
Salil.

> Kind regards
> Suzuki
>
> > @James, Any further thoughts on this?
> >
> >
> > References:
> > [1] [RFC PATCH 00/28] arm64: Support for Arm CCA in KVM
> > [2] [RFC PATCH 19/28] KVM: arm64: Validate register access for a Realm VM
> > https://lore.kernel.org/lkml/20230127112248.136810-1-suzuki.poulose@xxxxxxx/T/#m6c10b9a27c4a724967c1800facacaa9443b38b4c
> > [3] ARM Realm Management Monitor specification(DEN0137 1.0-eac3)
> > https://developer.arm.com/documentation/den0137/latest/
> >
> >
> > Thanks
> > Salil.
> >
> >
> >>>> 4. Failure condition (B5.3.3.2) should be amended with
> >>>>     runnable pre: target_rec.flags.runnable == NOT_RUNNABLE (?)
> >>>>              post: result == PSCI_DENIED (?)
> >>>> 5. Change would also be required in the flow (D1.4 PSCI flows) depicting
> >>>>     PSCI_CPU_ON flow (D1.4.1)
> >>>>
> >>>> I do understand that ARM CCA support is in its infancy and that
> >>>> discussing vCPU Hotplug in the realm world may seem a far-fetched
> >>>> idea right now. But specification changes require a lot of time and if
> >>>> this change is really required then it should be further discussed
> >>>> within ARM.
> >>>>
> >>>> Many thanks!
> >>>>
> >>>>
> >>>> Best regards
> >>>> Salil
> >>>>
> >>>> References:
> >>>>
> >>>> [1] https://developer.arm.com/documentation/den0137/latest/
> >>>> [2] https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v1-port11052023.dev-1
> >>>> [3] https://git.gitlab.arm.com/linux-arm/linux-jm.git virtual_cpu_hotplug/rfc/v2
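
For illustration only, the host-side policy check proposed in points 1 to 3 of the original posting could be sketched roughly as below. This is a minimal sketch, not code from the QEMU or KVM series referenced in this thread: the class HostVcpuPolicy and its method cpu_on_status are hypothetical names, and the only values taken from a specification are the PSCI_SUCCESS (0) and PSCI_DENIED (-3) return codes. It assumes the VMM tracks which vCPUs are currently plugged and picks the status it would hand back when completing a PSCI_CPU_ON exit.

```python
from dataclasses import dataclass, field

# PSCI return codes as defined by the Arm PSCI specification.
PSCI_SUCCESS = 0
PSCI_DENIED = -3


@dataclass
class HostVcpuPolicy:
    """Hypothetical VMM-side record of which vCPUs are currently plugged."""

    plugged: set = field(default_factory=set)

    def cpu_on_status(self, target_vcpu: int) -> int:
        """Return the status the host would report for a PSCI_CPU_ON request.

        A plugged target is allowed to come online; an unplugged target is
        refused with PSCI_DENIED, as suggested in the discussion above.
        """
        if target_vcpu in self.plugged:
            return PSCI_SUCCESS
        return PSCI_DENIED
```

With vCPUs 0 and 1 plugged, HostVcpuPolicy(plugged={0, 1}).cpu_on_status(1) yields PSCI_SUCCESS, while a request targeting vCPU 2 yields PSCI_DENIED. How that status is actually conveyed to the RMM (for example through RMI_PSCI_COMPLETE) is governed by the RMM specification and is not modelled here.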