Re: [PATCH v2 3/5] nvme-fabrics: introduce ref counting for nvmf_ctrl_options

From: Hannes Reinecke
Date: Mon Mar 11 2024 - 15:29:08 EST

Next message: Philipp Hortmann: "Re: [PATCH] staging: p80211conv: Rename local foo to decrypt_check"
Previous message: Peter Xu: "Re: BUG selftests/mm]"
In reply to: Daniel Wagner: "Re: [PATCH v2 3/5] nvme-fabrics: introduce ref counting for nvmf_ctrl_options"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 3/11/24 18:36, Daniel Wagner wrote:

On Thu, Mar 07, 2024 at 12:27:43PM +0200, Sagi Grimberg wrote:

Why do we need a refcount for an object that has the same exact lifetime
as the ctrl itself? It just feels like unneeded complication.

My claim the UAF is also possible with the current code is not correct.
Or at least not easy to reproduce. I've re-tested a lot and I couldn't
reproduce it.

Though, the UAF is very simple to reproduce with the sync connect patch
applied (nvme-fc: wait for initial connect attempt to finish) together
with Hannes' patch (nvme: authentication error are always
non-retryable):

In this case, the initial connect fails and the resources are removed,
while we are waiting in

+ if (!opts->connect_async) {
+ enum nvme_ctrl_state state;
+
+ wait_for_completion(&ctrl->connect_completion);
+ state = nvme_ctrl_state(&ctrl->ctrl);
+ nvme_fc_ctrl_put(ctrl);
+
+ if (state != NVME_CTRL_LIVE) {
+ /* Cleanup is handled by the connect state machine */
+ return ERR_PTR(-EIO);
+ }
+ }

This opens up the race window. While we are waiting here for the
completion, the ctrl entry in sysfs is still reachable. Unfortunately,
we also fire an uevent which starts another instance of nvme-cli. And
the new instance of nvme-cli iterates over sysfs and reads the already
freed options object.

Curiously enough, I had been digging into better error reporting for nvme-fabrics. And the one thing I came up with is to make the controller
_options_ as a private pointer to seq_file.
With that we can allocate and initialize the options during open(),
and then have write() do the parsing and calling create_ctrl() as usual.
But read() would then always have access to the option structure, and
we can use this structure to pass any errors. EG parsing errors could
be reported by an 'err_mask' field and so on.

That would allow us to report errors back to nvme-cli, and,
incidentally, also require reference counting.
Two stones with a bird and all that.

Patch is in testing, and I'll be posting it once I get confirmation.

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich

Next message: Philipp Hortmann: "Re: [PATCH] staging: p80211conv: Rename local foo to decrypt_check"
Previous message: Peter Xu: "Re: BUG selftests/mm]"
In reply to: Daniel Wagner: "Re: [PATCH v2 3/5] nvme-fabrics: introduce ref counting for nvmf_ctrl_options"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]