Re: [PATCH v1 1/2] s390x: KVM: accept STSI for CPU topology information

From: Cornelia Huck
Date: Thu Jul 15 2021 - 05:31:08 EST


On Thu, Jul 15 2021, David Hildenbrand <david@xxxxxxxxxx> wrote:

> On 14.07.21 17:25, Pierre Morel wrote:
>> STSI(15.1.x) gives information on the CPU configuration topology.
>> Let's accept the interception of STSI with the function code 15 and
>> let the userland part of the hypervisor handle it.
>>
>> Signed-off-by: Pierre Morel <pmorel@xxxxxxxxxxxxx>
>> ---
>> arch/s390/kvm/priv.c | 11 ++++++++++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>> index 9928f785c677..4ab5f8b7780e 100644
>> --- a/arch/s390/kvm/priv.c
>> +++ b/arch/s390/kvm/priv.c
>> @@ -856,7 +856,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>> if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>> return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>
>> - if (fc > 3) {
>> + if (fc > 3 && fc != 15) {
>> kvm_s390_set_psw_cc(vcpu, 3);
>> return 0;
>> }
>> @@ -893,6 +893,15 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>> goto out_no_data;
>> handle_stsi_3_2_2(vcpu, (void *) mem);
>> break;
>> + case 15:
>> + if (sel1 != 1 || sel2 < 2 || sel2 > 6)
>> + goto out_no_data;
>> + if (vcpu->kvm->arch.user_stsi) {
>> + insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);
>> + return -EREMOTE;

This bypasses the trace event further down.

>> + }
>> + kvm_s390_set_psw_cc(vcpu, 3);
>> + return 0;
>> }
>> if (kvm_s390_pv_cpu_is_protected(vcpu)) {
>> memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
>>
>
> 1. Setting GPRS to 0
>
> I was wondering why we have the "vcpu->run->s.regs.gprs[0] = 0;"
> for existing fc 1,2,3 in case we set cc=0.
>
> Looking at the doc, all I find is:
>
> "CC 0: Requested configuration-level number placed in
> general register 0 or requested SYSIB informa-
> tion stored"
>
> But I don't find where it states that we are supposed to set
> general register 0 to 0. Wouldn't we also have to do it for
> fc=15 or for none?
>
> If fc 1,2,3 and 15 are to be handled equally, I suggest the following:
>
> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
> index 9928f785c677..6eb86fa58b0b 100644
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -893,17 +893,23 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
> goto out_no_data;
> handle_stsi_3_2_2(vcpu, (void *) mem);
> break;
> + case 15:
> + if (sel1 != 1 || sel2 < 2 || sel2 > 6)
> + goto out_no_data;
> + break;
> }
> - if (kvm_s390_pv_cpu_is_protected(vcpu)) {
> - memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
> - PAGE_SIZE);
> - rc = 0;
> - } else {
> - rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
> - }
> - if (rc) {
> - rc = kvm_s390_inject_prog_cond(vcpu, rc);
> - goto out;
> + if (mem) {
> + if (kvm_s390_pv_cpu_is_protected(vcpu)) {
> + memcpy((void *)sida_origin(vcpu->arch.sie_block),
> + (void *)mem, PAGE_SIZE);
> + } else {
> + rc = write_guest(vcpu, operand2, ar, (void *)mem,
> + PAGE_SIZE);
> + if (rc) {
> + rc = kvm_s390_inject_prog_cond(vcpu, rc);
> + goto out;
> + }
> + }
> }
> if (vcpu->kvm->arch.user_stsi) {
> insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);

Something like that sounds good, the code is getting a bit convoluted.

>
>
> 2. maximum-MNest facility
>
> "
> 1. If the maximum-MNest facility is installed and
> selector 2 exceeds the nonzero model-depen-
> dent maximum-selector-2 value."
>
> 2. If the maximum-MNest facility is not installed and
> selector 2 is not specified as two.
> "
>
> We will we be handling the presence/absence of the maximum-MNest facility
> (for our guest?) in QEMU, corect?
>
> I do wonder if we should just let any fc=15 go to user space let the whole
> sel1 / sel2 checking be handled there. I don't think it's a fast path after all.
> But no strong opinion.

If that makes handling easier, I think it would be a good idea.

>
> How do we identify availability of maximum-MNest facility?
>
>
> 3. User space awareness
>
> How can user space identify that we actually forward these intercepts?
> How can it enable them? The old KVM_CAP_S390_USER_STSI capability
> is not sufficient.

Why do you think that it is not sufficient? USER_STSI basically says
"you may get an exit that tells you about a buffer to fill in some more
data for a stsi command, and we also tell you which call". If userspace
does not know what to add for a certain call, it is free to just do
nothing, and if it does not get some calls it would support, that should
not be a problem, either?

>
> I do wonder if we want KVM_CAP_S390_USER_STSI_15 or sth like that to change
> the behavior once enabled by user space.
>
>
> 4. Without vcpu->kvm->arch.user_stsi, we indicate cc=0 to our guest,
> also for fc 1,2,3. Is that actually what we want? (or do we simply not care
> because the guest is not supposed to use stsi?)

If returning an empty buffer is ok, it should not be a problem, I
guess. (I have not looked yet at the actual definitions.)