Re: [PATCH v10 04/16] s390/zcrypt: driver callback to indicate resource in use

From: Tony Krowiak
Date: Tue Sep 15 2020 - 15:36:39 EST




On 9/14/20 11:29 AM, Cornelia Huck wrote:
On Fri, 21 Aug 2020 15:56:04 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

Introduces a new driver callback to prevent a root user from unbinding
an AP queue from its device driver if the queue is in use. The intent of
this callback is to provide a driver with the means to prevent a root user
from inadvertently taking a queue away from a matrix mdev and giving it to
the host while it is assigned to the matrix mdev. The callback will
be invoked whenever a change to the AP bus's sysfs apmask or aqmask
attributes would result in one or more AP queues being removed from its
driver. If the callback responds in the affirmative for any driver
queried, the change to the apmask or aqmask will be rejected with a device
in use error.

For this patch, only non-default drivers will be queried. Currently,
there is only one non-default driver, the vfio_ap device driver. The
vfio_ap device driver facilitates pass-through of an AP queue to a
guest. The idea here is that a guest may be administered by a different
sysadmin than the host and we don't want AP resources to unexpectedly
disappear from a guest's AP configuration (i.e., adapters, domains and
control domains assigned to the matrix mdev). This will enforce the proper
procedure for removing AP resources intended for guest usage which is to
first unassign them from the matrix mdev, then unbind them from the
vfio_ap device driver.

Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
Reported-by: kernel test robot <lkp@xxxxxxxxx>
This looks a bit odd...

I've removed all of those. These kernel test robot errors were flagged
in the last series. The review comments from the robot suggested
the reported-by, but I assume that was for patches intended to
fix those errors, so I am removing these as per Christian's comments.


---
drivers/s390/crypto/ap_bus.c | 148 ++++++++++++++++++++++++++++++++---
drivers/s390/crypto/ap_bus.h | 4 +
2 files changed, 142 insertions(+), 10 deletions(-)

(...)

@@ -1107,12 +1118,70 @@ static ssize_t apmask_show(struct bus_type *bus, char *buf)
return rc;
}
+static int __verify_card_reservations(struct device_driver *drv, void *data)
+{
+ int rc = 0;
+ struct ap_driver *ap_drv = to_ap_drv(drv);
+ unsigned long *newapm = (unsigned long *)data;
+
+ /*
+ * No need to verify whether the driver is using the queues if it is the
+ * default driver.
+ */
+ if (ap_drv->flags & AP_DRIVER_FLAG_DEFAULT)
+ return 0;
+
+ /* The non-default driver's module must be loaded */
+ if (!try_module_get(drv->owner))
+ return 0;
+
+ if (ap_drv->in_use)
+ if (ap_drv->in_use(newapm, ap_perms.aqm))
+ rc = -EADDRINUSE;
ISTR that Christian suggested -EBUSY in a past revision of this series?
I think that would be more appropriate.

I went back and looked and sure enough, he did recommend that.
You have a great memory! I didn't respond to that comment, so I
must have missed it at the time.

I personally prefer EADDRINUSE because I think it is more indicative
of the reason an AP resource can not be assigned back to the host
drivers is because it is in use by a guest or, at the very least, reserved
for use by a guest (i.e., assigned to an mdev). To say it is busy implies
that the device is busy performing encryption services which may or
may not be true at a given moment. Even if so, that is not the reason
for refusing to allow reassignment of the device.


Also, I know we have discussed this before, but it is very hard to
figure out the offending device(s) if the sysfs manipulation failed. Can
we at least drop something into the syslog? That would be far from
perfect, but it gives an admin at least a chance to figure out why they
got an error. Some more structured way that would be usable from tools
can still be added later.

I see you found the patch that logged this:)


+
+ module_put(drv->owner);
+
+ return rc;
+}