Re: [PATCH 3/3] xen-blkfront: dynamic configuration of per-vbd resources

From: Roger Pau Monné
Date: Fri Jul 22 2016 - 05:34:18 EST


On Fri, Jul 22, 2016 at 04:17:48PM +0800, Bob Liu wrote:
>
> On 07/22/2016 03:45 PM, Roger Pau Monné wrote:
> > On Thu, Jul 21, 2016 at 06:08:05PM +0800, Bob Liu wrote:
> >>
> >> On 07/21/2016 04:57 PM, Roger Pau Monné wrote:
> ..[snip]..
> >>>> +
> >>>> +static ssize_t dynamic_reconfig_device(struct blkfront_info *info, ssize_t count)
> >>>> +{
> >>>> + unsigned int i;
> >>>> + int err = -EBUSY;
> >>>> +
> >>>> + /*
> >>>> + * Make sure no migration in parallel, device lock is actually a
> >>>> + * mutex.
> >>>> + */
> >>>> + if (!device_trylock(&info->xbdev->dev)) {
> >>>> + pr_err("Fail to acquire dev:%s lock, may be in migration.\n",
> >>>> + dev_name(&info->xbdev->dev));
> >>>> + return err;
> >>>> + }
> >>>> +
> >>>> + /*
> >>>> + * Prevent new requests and guarantee no uncompleted reqs.
> >>>> + */
> >>>> + blk_mq_freeze_queue(info->rq);
> >>>> + if (part_in_flight(&info->gd->part0))
> >>>> + goto out;
> >>>> +
> >>>> + /*
> >>>> + * Front Backend
> >>>> + * Switch to XenbusStateClosed
> >>>> + * frontend_changed():
> >>>> + * case XenbusStateClosed:
> >>>> + * xen_blkif_disconnect()
> >>>> + * Switch to XenbusStateClosed
> >>>> + * blkfront_resume():
> >>>> + * frontend_changed():
> >>>> + * reconnect
> >>>> + * Wait until XenbusStateConnected
> >>>> + */
> >>>> + info->reconfiguring = true;
> >>>> + xenbus_switch_state(info->xbdev, XenbusStateClosed);
> >>>> +
> >>>> + /* Poll every 100ms, 1 minute timeout. */
> >>>> + for (i = 0; i < 600; i++) {
> >>>> + /*
> >>>> + * Wait backend enter XenbusStateClosed, blkback_changed()
> >>>> + * will clear reconfiguring.
> >>>> + */
> >>>> + if (!info->reconfiguring)
> >>>> + goto resume;
> >>>> + schedule_timeout_interruptible(msecs_to_jiffies(100));
> >>>> + }
> >>>
> >>> Instead of having this wait, could you just set info->reconfiguring = 1, set
> >>> the frontend state to XenbusStateClosed and mimic exactly what a resume from
> >>> suspension does? blkback_changed would have to set the frontend state to
> >>> InitWait when it detects that the backend has switched to Closed, and call
> >>> blkfront_resume.
> >>
> >>
> >> I think that won't work.
> >> In the real "resume" case, the power management system will trigger all ->resume() path.
> >> But there is no place for dynamic configuration.
> >
> > Hello,
> >
> > I think it should be possible to set info->reconfiguring and wait for the
> > backend to switch to state Closed, at that point we should call blkif_resume
> > (from blkback_changed) and the backend will follow the reconection.
> >
>
> Okay, I get your point. Yes, that's an option.
>
> But this will make 'dynamic configuration' to be async, I'm worry about the end-user will get panic.
> E.g
> A end-user "echo <new value> > /sys/devices/vbd-xxx/max_indirect_segs",
> but then the device will be Closed and disappeared, the user have to wait for a random time so that the device can resume.

That should not happen, AFAICT on migration the device never dissapears.
alloc_disk and friends should not be called on resume from migration (see
the switch in blkfront_connect, you should take the BLKIF_STATE_SUSPENDED
path for the reconfiguration).

Roger.