RE: [Patch v3 2/3] Drivers: hv: add Azure Blob driver

From: Long Li
Date: Fri Jul 16 2021 - 15:29:43 EST




> -----Original Message-----
> From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Sent: Friday, July 16, 2021 10:28 AM
> To: Michael Kelley <mikelley@xxxxxxxxxxxxx>
> Cc: longli@xxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> hyperv@xxxxxxxxxxxxxxx; Long Li <longli@xxxxxxxxxxxxx>; Jonathan Corbet
> <corbet@xxxxxxx>; KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang
> <haiyangz@xxxxxxxxxxxxx>; Stephen Hemminger
> <sthemmin@xxxxxxxxxxxxx>; Wei Liu <wei.liu@xxxxxxxxxx>; Dexuan Cui
> <decui@xxxxxxxxxxxxx>; Bjorn Andersson <bjorn.andersson@xxxxxxxxxx>;
> Hans de Goede <hdegoede@xxxxxxxxxx>; Williams, Dan J
> <dan.j.williams@xxxxxxxxx>; Maximilian Luz <luzmaximilian@xxxxxxxxx>;
> Mike Rapoport <rppt@xxxxxxxxxx>; Ben Widawsky
> <ben.widawsky@xxxxxxxxx>; Jiri Slaby <jirislaby@xxxxxxxxxx>; Andra
> Paraschiv <andraprs@xxxxxxxxxx>; Siddharth Gupta
> <sidgup@xxxxxxxxxxxxxx>; Hannes Reinecke <hare@xxxxxxx>; linux-
> doc@xxxxxxxxxxxxxxx
> Subject: Re: [Patch v3 2/3] Drivers: hv: add Azure Blob driver
>
> On Fri, Jul 16, 2021 at 03:56:14PM +0000, Michael Kelley wrote:
> > > +static int az_blob_remove(struct hv_device *dev) {
> > > + az_blob_dev.removing = true;
> > > + /*
> > > + * RCU lock guarantees that any future calls to az_blob_fop_open()
> > > + * can not use device resources while the inode reference of
> > > + * /dev/azure_blob is being held for that open, and device file is
> > > + * being removed from /dev.
> > > + */
> > > + synchronize_rcu();
> >
> > I don't think this works as you have described. While it will ensure
> > that any in-progress RCU read-side critical sections have completed
> > (i.e., in az_blob_fop_open() ), it does not prevent new read-side
> > critical sections from being started. So as soon as synchronize_rcu()
> > is finished, another thread could find and open the device, and be
> > executing in az_blob_fop_open().
> >
> > But it's not clear to me that this (and the rcu_read_locks in
> > az_blob_fop_open) are really needed anyway. If
> > az_blob_remove_device() is called while one or more threads have it open,
> I think that's OK. Or is there a scenario that I'm missing?
>
> This should not be different from any other tiny character device, why the
> mess with RCU at all?
>
> > > + az_blob_info("removing device\n");
> > > + az_blob_remove_device();
> > > +
> > > + /*
> > > + * At this point, it's not possible to open more files.
> > > + * Wait for all the opened files to be released.
> > > + */
> > > + wait_event(az_blob_dev.file_wait,
> > > +list_empty(&az_blob_dev.file_list));
> >
> > As mentioned in my most recent comments on the previous version of the
> > code, I'm concerned about waiting for all open files to be released in
> > the case of a VMbus rescind. We definitely have to wait for all VSP
> > operations to complete, but that's different from waiting for the
> > files to be closed. The former depends on Hyper-V being well-behaved
> > and will presumably happen quickly in the case of a rescind. But the
> > latter depends entirely on user space code that is out of the kernel's
> > control. The user space process could hang around for hours or days
> > with the file still open, which would block this function from completing,
> and hence block the global work queue.
> >
> > Thinking about this, the core issue may be that having a single static
> > instance of az_blob_device is problematic if we allow the device to be
> > removed (either explicitly with an unbind, or implicitly with a VMbus
> > rescind) and then re-added. Perhaps what needs to happen is that the
> > removed device is allowed to continue to exist as long as user space
> > processes have an open file handle, but they can't perform any
> > operations because the "removing" flag is set (and stays set).
> > If the device is re-added, then a new instance of az_blob_device
> > should be created, and whether or not the old instance is still
> > hanging around is irrelevant.
>
> You should never have a single static copy of the device, that was going to be
> my first review comment once this all actually got to a place that made sense
> to review (which it is not even there yet.) When you do that, then you have
> these crazy race issues you speak of. Use the misc api correctly and you will
> not have any of these problems, why people try to make it harder is beyond
> me...
>
> thanks,
>
> greg k-h

I will address all the comments and send the driver for broader review including
linux-fsdevel and linux-block.

Thanks,
Long