Re: [PATCH v10 04/16] s390/zcrypt: driver callback to indicate resource in use

From: Tony Krowiak
Date: Thu Sep 17 2020 - 11:04:01 EST




On 9/17/20 8:14 AM, Cornelia Huck wrote:
On Tue, 15 Sep 2020 15:32:35 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

On 9/14/20 11:29 AM, Cornelia Huck wrote:
On Fri, 21 Aug 2020 15:56:04 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:
Introduces a new driver callback to prevent a root user from unbinding
an AP queue from its device driver if the queue is in use. The intent of
this callback is to provide a driver with the means to prevent a root user
from inadvertently taking a queue away from a matrix mdev and giving it to
the host while it is assigned to the matrix mdev. The callback will
be invoked whenever a change to the AP bus's sysfs apmask or aqmask
attributes would result in one or more AP queues being removed from its
driver. If the callback responds in the affirmative for any driver
queried, the change to the apmask or aqmask will be rejected with a device
in use error.

For this patch, only non-default drivers will be queried. Currently,
there is only one non-default driver, the vfio_ap device driver. The
vfio_ap device driver facilitates pass-through of an AP queue to a
guest. The idea here is that a guest may be administered by a different
sysadmin than the host and we don't want AP resources to unexpectedly
disappear from a guest's AP configuration (i.e., adapters, domains and
control domains assigned to the matrix mdev). This will enforce the proper
procedure for removing AP resources intended for guest usage which is to
first unassign them from the matrix mdev, then unbind them from the
vfio_ap device driver.

Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
Reported-by: kernel test robot <lkp@xxxxxxxxx>
This looks a bit odd...
I've removed all of those. These kernel test robot errors were flagged
in the last series. The review comments from the robot suggested
the reported-by, but I assume that was for patches intended to
fix those errors, so I am removing these as per Christian's comments.
Yes, I think the Reported-by: mostly makes sense if you include a patch
to fix something on top.

---
drivers/s390/crypto/ap_bus.c | 148 ++++++++++++++++++++++++++++++++---
drivers/s390/crypto/ap_bus.h | 4 +
2 files changed, 142 insertions(+), 10 deletions(-)
(...)
@@ -1107,12 +1118,70 @@ static ssize_t apmask_show(struct bus_type *bus, char *buf)
return rc;
}
+static int __verify_card_reservations(struct device_driver *drv, void *data)
+{
+ int rc = 0;
+ struct ap_driver *ap_drv = to_ap_drv(drv);
+ unsigned long *newapm = (unsigned long *)data;
+
+ /*
+ * No need to verify whether the driver is using the queues if it is the
+ * default driver.
+ */
+ if (ap_drv->flags & AP_DRIVER_FLAG_DEFAULT)
+ return 0;
+
+ /* The non-default driver's module must be loaded */
+ if (!try_module_get(drv->owner))
+ return 0;
+
+ if (ap_drv->in_use)
+ if (ap_drv->in_use(newapm, ap_perms.aqm))
+ rc = -EADDRINUSE;
ISTR that Christian suggested -EBUSY in a past revision of this series?
I think that would be more appropriate.
I went back and looked and sure enough, he did recommend that.
You have a great memory! I didn't respond to that comment, so I
must have missed it at the time.

I personally prefer EADDRINUSE because I think it is more indicative
of the reason an AP resource can not be assigned back to the host
drivers is because it is in use by a guest or, at the very least, reserved
for use by a guest (i.e., assigned to an mdev). To say it is busy implies
that the device is busy performing encryption services which may or
may not be true at a given moment. Even if so, that is not the reason
for refusing to allow reassignment of the device.
I have a different understanding of these error codes: EADDRINUSE is
something used in the networking context when an actual address is
already used elsewhere. EBUSY is more of a generic error that indicates
that a certain resource is not free to perform the requested operation;
it does not necessarily mean that the resource is currently actively
doing something. Kind of when you get EBUSY when trying to eject
something another program holds a reference on: that other program
might not actually be doing anything, but it potentially could.

I'll go ahead and change it to -EBUSY.