Re: Hang in 9p/virtio

From: Michael S. Tsirkin
Date: Tue Aug 02 2016 - 12:58:51 EST


On Tue, Aug 02, 2016 at 06:35:02PM +0200, Cornelia Huck wrote:
> On Tue, 2 Aug 2016 15:35:34 +0200
> Vegard Nossum <vegard.nossum@xxxxxxxxxx> wrote:
>
> > On 08/02/2016 11:13 AM, Vegard Nossum wrote:
> > > On 08/02/2016 11:03 AM, Cornelia Huck wrote:
> > >> On Sat, 30 Jul 2016 23:42:18 +0200
> > >> Vegard Nossum <vegard.nossum@xxxxxxxxxx> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> With fault injection triggering an allocation failure for the
> > >>> alloc_indirect() call in virtqueue_add() I'm seeing a hang in
> > >>> p9_virtio_zc_request() -- it seems to be waiting here indefinitely
> > >>> (i.e. at least 120 seconds):
> > >>>
> > > [...]
> > >
> > >> What happens is that the code falls back to direct virtio addressing
> > >> (after indirect addressing failed) - and this should work.
> > >>
> > >> I'm more inclined to suspect a qemu instead of a kernel bug, as your
> > >> qemu version is quite old and there have been fixes in the virtio
> > >> buffer handling and virtio-9p in the meantime. (I'm suspecting
> > >> "virtio-9p: fix any_layout".)
> > >>
> > >> Could you retry with a more recent qemu (at least version 2.4)?
> > >
> > > I think maybe the version number in the stack trace is a bit misleading,
> > > this is the full/actual version:
> > >
> > > $ kvm --version
> > > QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.1), Copyright
> > > (c) 2003-2008 Fabrice Bellard
> > >
> > > I'll still try to get qemu from git and see if it makes a difference.
> > > Thanks,
> >
> > I still seem to get it:
> >
> > $ qemu-system-x86_64 --version
> > QEMU emulator version 2.6.91 (v2.7.0-rc1-2-gcc0100f-dirty), Copyright
> > (c) 2003-2008 Fabrice Bellard
>
> :(
>
> Sorry, no good immediate idea.
>
> One thing would be to check whether you get notified by qemu after the
> request was queued (i.e., whether vring_interrupt() ever gets called
> with 9p's req_done() after the alloc failure was injected). This would
> help to suggest whether to continue debugging here or in qemu.
>
> I still think the root of this error is some failure of the virtio 9p
> code to deal with non-indirect buffers, either in the driver or in qemu.

It might be interesting to just disable indirect buffers on the qemu
command line by specifying indirect_desc=off for the virtio device.
That way the driver always uses direct descriptors, without going
through the error path.
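
For illustration, a rough sketch of what that could look like for a
9p share. The fsdev id, share path and mount tag below are made-up
placeholders, not taken from the setup in this thread, and the snippet
only assembles and prints the arguments rather than launching qemu:

```shell
# Placeholder values (assumptions); adjust to the actual guest setup.
SHARE_DIR=/tmp/9p-share
FSDEV_ID=fsdev0
MOUNT_TAG=hostshare

# indirect_desc=off is set on the virtio-9p device, so the guest
# driver never negotiates indirect descriptors in the first place.
QEMU_9P_ARGS="-fsdev local,id=${FSDEV_ID},path=${SHARE_DIR},security_model=none"
QEMU_9P_ARGS="${QEMU_9P_ARGS} -device virtio-9p-pci,fsdev=${FSDEV_ID},mount_tag=${MOUNT_TAG},indirect_desc=off"

# Print the resulting options; append them to the usual qemu invocation.
echo "qemu-system-x86_64 ${QEMU_9P_ARGS}"
```

If the hang still reproduces this way without any fault injection, that
would point at the direct-descriptor handling rather than at the
fallback itself.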

--
MST