Re: [PATCH 23/32] KVM: s390: pci: handle refresh of PCI translations

From: Niklas Schnelle
Date: Fri Dec 17 2021 - 04:41:22 EST


On Thu, 2021-12-16 at 09:51 -0500, Matthew Rosato wrote:
> On 12/16/21 9:39 AM, Niklas Schnelle wrote:
> > On Tue, 2021-12-14 at 12:54 -0500, Matthew Rosato wrote:
> > > On 12/14/21 11:59 AM, Pierre Morel wrote:
> > > > On 12/7/21 21:57, Matthew Rosato wrote:
> > > > > Add a routine that will perform a shadow operation between a guest
> > > > > and host IOAT. A subsequent patch will invoke this in response to
> > > > > an 04 RPCIT instruction intercept.
> > > > >
> > > > > Signed-off-by: Matthew Rosato <mjrosato@xxxxxxxxxxxxx>
> > > > > ---
> > > > > arch/s390/include/asm/kvm_pci.h | 1 +
> > > > > arch/s390/include/asm/pci_dma.h | 1 +
> > > > > arch/s390/kvm/pci.c | 191 ++++++++++++++++++++++++++++++++
> > > > > arch/s390/kvm/pci.h | 4 +-
> > > > > 4 files changed, 196 insertions(+), 1 deletion(-)
> > > > >
> > ---8<---
> > > > > +
> > > > > +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
> > > > > + unsigned long start, unsigned long size)
> > > > > +{
> > > > > + struct zpci_dev *zdev;
> > > > > + u32 fh;
> > > > > + int rc;
> > > > > +
> > > > > + /* If the device has a SHM bit on, let userspace take care of
> > > > > this */
> > > > > + fh = req >> 32;
> > > > > + if ((fh & aift.mdd) != 0)
> > > > > + return -EOPNOTSUPP;
> > > >
> > > > I think you should make this check in the caller.
> > >
> > > OK
> > >
> > > > > +
> > > > > + /* Make sure this is a valid device associated with this guest */
> > > > > + zdev = get_zdev_by_fh(fh);
> > > > > + if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm)
> > > > > + return -EINVAL;
> > > > > +
> > > > > + /* Only proceed if the device is using the assist */
> > > > > + if (zdev->kzdev->ioat.head[0] == 0)
> > > > > + return -EOPNOTSUPP;
> > > >
> > > > Using the assist means using interpretation over using interception and
> > > > legacy vfio-pci. right?
> > >
> > > Right - more specifically that the IOAT assist feature was never set via
> > > the vfio feature ioctl, so we can't handle the RPCIT for this device and
> > > so throw to userspace.
> > >
> > > The way the QEMU series is being implemented, a device using
> > > interpretation will always have the IOAT feature set on.
> > >
> > > > > +
> > > > > + rc = dma_table_shadow(vcpu, zdev, start, size);
> > > > > + if (rc > 0)
> > > > > + rc = zpci_refresh_trans((u64) zdev->fh << 32, start, size);
> > > >
> > > > Here you lose the status reported by the hardware.
> > > > You should directly use __rpcit(fn, addr, range, &status);
> > >
> > > OK, I can have a look at doing this.
> > >
> > > @Niklas thoughts on how you would want this exported. Renamed to
> > > zpci_rpcit or so?
> >
> > Hmm with using __rpcit() directly we would lose the error reporting in
> > s390dbf and this ist still kind of a RPCIT in the host. How about we
> > add the status as an out parameter to zpci_refresh_trans()? But yes if
>
> Another advantage of doing this would be that we then also keep the cc2
> retry logic in zpci_refresh_trans(), which would be nice.

Yeah thought about that too. If we don't have that I believe the guest
would retry but that means doing two full intercepts and going through
all the other logic too. Since these retries are afaik extremely rare
it shouldn't matter much but on the other hand I would expect them to
only happen when the system is overloaded and then doing all this extra
work surely isn't helpful.

>
> However we do still lose the returned CC value from the instruction.
> But I think we can infer a CC1 from a nonzero status and a CC3 from a
> zero status so maybe this is OK too.

I agree.