[RFC PATCH] nvme: allow specific passthrough IOs without CAP_SYS_ADMIN

From: Logan Gunthorpe
Date: Fri Oct 01 2021 - 19:40:33 EST


The passthrough IOCTL interface allows for prototyping new non-standard
NVMe features in userspace. However, all passthrough commands require
full CAP_SYS_ADMIN over and above file access to the device. This means
applications must run as root when running proofs of concept, which is
often undesirable.

Instead, relax that requirement for vendor-specific commands as well
as the identify and get_log_page admin commands (which both have
vendor-specific components). Identify and get_log_page only query
information from the controller, so users with this privilege shouldn't
be able to cause any negative side effects. For vendor-specific
commands, it is the vendor's responsibility to avoid dangerous side
effects.

Users that want to send any of these passthrough commands will still
require access to the NVMe char device or namespace. Typically, the
char device is only accessible by root anyway and namespaces are
accessible by root and the disk group. Administrators are free to
add udev rules to adjust these permissions for specific devices they
want to allow.

Signed-off-by: Logan Gunthorpe <logang@xxxxxxxxxxxx>
---

Hi,

Wondering what people might think of loosening these restrictions a
touch with this RFC patch. Open to other options if people have them.
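
To make the proposed policy easier to discuss, here is a minimal
userspace model of the check added by this patch. It is only a sketch:
the function name user_cmd_allowed(), the is_io_cmd flag (standing in
for a non-NULL ns) and the has_sys_admin flag (standing in for
capable(CAP_SYS_ADMIN)) are made up for illustration. The opcode values
mirror the constants in include/linux/nvme.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opcode values as defined in include/linux/nvme.h. */
enum {
	ADMIN_GET_LOG_PAGE = 0x02, /* nvme_admin_get_log_page */
	ADMIN_IDENTIFY     = 0x06, /* nvme_admin_identify */
	IO_VENDOR_START    = 0x80, /* nvme_cmd_vendor_start (added here) */
	ADMIN_VENDOR_START = 0xC0, /* nvme_admin_vendor_start */
};

/*
 * Hypothetical model of nvme_user_cmd_allowed().
 *
 * is_io_cmd:     true for I/O commands (ns != NULL in the kernel),
 *                false for admin commands.
 * has_sys_admin: stand-in for capable(CAP_SYS_ADMIN).
 */
bool user_cmd_allowed(bool is_io_cmd, uint8_t opcode, bool has_sys_admin)
{
	if (is_io_cmd) {
		/* Vendor-specific I/O commands need no capability. */
		if (opcode >= IO_VENDOR_START)
			return true;
	} else {
		/* Vendor-specific admin commands need no capability. */
		if (opcode >= ADMIN_VENDOR_START)
			return true;
		/* Neither do the two query-only admin commands. */
		if (opcode == ADMIN_IDENTIFY || opcode == ADMIN_GET_LOG_PAGE)
			return true;
	}

	/* Everything else keeps the existing requirement. */
	return has_sys_admin;
}
```

Note that the vendor range starts at a different opcode for each
command set: admin opcode 0x80 is Format NVM, so it still requires
CAP_SYS_ADMIN, while I/O opcode 0x80 is already vendor specific.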

This will also become more generally useful with Joshi's io_uring work
which enables asynchronous passthrough IOs.
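
For context, this is roughly what an unprivileged caller of the
existing synchronous interface might look like. It is a sketch, not a
tested tool: the helper names build_identify_ctrl() and
send_identify_ctrl() are invented for this example, and the device path
is whatever node the administrator has granted access to. The struct
and ioctl number come from the uapi header <linux/nvme_ioctl.h>.

```c
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/nvme_ioctl.h>

#define IDENTIFY_OPCODE   0x06 /* nvme_admin_identify */
#define IDENTIFY_CNS_CTRL 0x01 /* CNS value for Identify Controller */
#define IDENTIFY_DATA_LEN 4096 /* size of the identify data structure */

/* Fill in an Identify Controller admin command targeting buf. */
void build_identify_ctrl(struct nvme_admin_cmd *cmd, void *buf)
{
	memset(cmd, 0, sizeof(*cmd)); /* flags must be zero */
	cmd->opcode = IDENTIFY_OPCODE;
	cmd->addr = (uint64_t)(uintptr_t)buf;
	cmd->data_len = IDENTIFY_DATA_LEN;
	cmd->cdw10 = IDENTIFY_CNS_CTRL;
}

/*
 * Open the device node at path and submit Identify Controller.
 * buf must point to at least IDENTIFY_DATA_LEN bytes.
 * Returns -1 with errno set on failure, 0 on success, or a positive
 * NVMe status code if the controller rejected the command.
 */
int send_identify_ctrl(const char *path, void *buf)
{
	struct nvme_admin_cmd cmd;
	int fd, ret;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	build_identify_ctrl(&cmd, buf);
	ret = ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd);
	close(fd);
	return ret;
}
```

Without this patch, the ioctl fails with EACCES for a caller lacking
CAP_SYS_ADMIN even when open() succeeds. With it, the same call would
be expected to reach the controller.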

Thanks,

Logan

drivers/nvme/host/ioctl.c | 26 ++++++++++++++++++++++----
include/linux/nvme.h | 1 +
2 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index 22314962842d..3411269194e1 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -187,6 +187,24 @@ static bool nvme_validate_passthru_nsid(struct nvme_ctrl *ctrl,
return true;
}

+static bool nvme_user_cmd_allowed(struct nvme_ns *ns, int opcode)
+{
+ if (ns) {
+ if (opcode >= nvme_cmd_vendor_start)
+ return true;
+ } else {
+ if (opcode >= nvme_admin_vendor_start)
+ return true;
+ switch (opcode) {
+ case nvme_admin_identify:
+ case nvme_admin_get_log_page:
+ return true;
+ }
+ }
+
+ return capable(CAP_SYS_ADMIN);
+}
+
static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
struct nvme_passthru_cmd __user *ucmd)
{
@@ -196,10 +214,10 @@ static int nvme_user_cmd(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
u64 result;
int status;

- if (!capable(CAP_SYS_ADMIN))
- return -EACCES;
if (copy_from_user(&cmd, ucmd, sizeof(cmd)))
return -EFAULT;
+ if (!nvme_user_cmd_allowed(ns, cmd.opcode))
+ return -EACCES;
if (cmd.flags)
return -EINVAL;
if (!nvme_validate_passthru_nsid(ctrl, ns, cmd.nsid))
@@ -242,10 +260,10 @@ static int nvme_user_cmd64(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
unsigned timeout = 0;
int status;

- if (!capable(CAP_SYS_ADMIN))
- return -EACCES;
if (copy_from_user(&cmd, ucmd, sizeof(cmd)))
return -EFAULT;
+ if (!nvme_user_cmd_allowed(ns, cmd.opcode))
+ return -EACCES;
if (cmd.flags)
return -EINVAL;
if (!nvme_validate_passthru_nsid(ctrl, ns, cmd.nsid))
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index b7c4c4130b65..8d36dcf6d2a4 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -692,6 +692,7 @@ enum nvme_opcode {
nvme_cmd_zone_mgmt_send = 0x79,
nvme_cmd_zone_mgmt_recv = 0x7a,
nvme_cmd_zone_append = 0x7d,
+ nvme_cmd_vendor_start = 0x80,
};

#define nvme_opcode_name(opcode) { opcode, #opcode }

base-commit: 5816b3e6577eaa676ceb00a848f0fd65fe2adc29
--
2.30.2