Re: [V9fs-developer] 9pfs hangs since 4.7

From: Al Viro
Date: Sat Jan 07 2017 - 12:19:28 EST


On Sat, Jan 07, 2017 at 04:10:45PM +0100, Greg Kurz wrote:
> > virtqueue_push(), but pdu freeing is delayed until v9fs_flush() gets woken
> > up. In the meantime, another request arrives in the slot freed by
> > that virtqueue_push() and we are out of pdus.
> >
>
> Indeed. Even if this doesn't seem to be the problem here, I guess this should
> be fixed.

FWIW, there's something that looks like an off-by-one in
v9fs_device_realize_common():
    /* initialize pdu allocator */
    QLIST_INIT(&s->free_list);
    QLIST_INIT(&s->active_list);
    for (i = 0; i < (MAX_REQ - 1); i++) {
        QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
        s->pdus[i].s = s;
        s->pdus[i].idx = i;
    }

That has been there since the original merge of 9p support into qemu - the code
has moved around a bit, but it has never inserted s->pdus[MAX_REQ - 1] into the
free list. So your scenario with failing pdu_alloc() is still possible.
In that log the total number of pending requests reached 128 for the
first time right when the requests stopped being handled, and even
though it dropped below that shortly after, the extra requests put
into the queue were not processed at all...
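
To make the arithmetic concrete, here is a minimal standalone sketch of such a
pool. It is not the qemu code: struct toy_pdu, struct toy_state and the toy_*
helpers are hypothetical stand-ins, a hand-rolled singly linked list replaces
the QLIST macros, and MAX_REQ is assumed to be 128 to match the figure in the
log. It only shows that a loop bounded by MAX_REQ - 1 leaves one pdu
unreachable, so the allocator runs dry after 127 requests rather than 128.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REQ 128   /* assumption: matches the 128 pending requests in the log */

/* Hypothetical, stripped-down stand-in for a pdu. */
struct toy_pdu {
    struct toy_pdu *next;
    int idx;
};

/* Hypothetical stand-in for the device state that owns the pdu pool. */
struct toy_state {
    struct toy_pdu pdus[MAX_REQ];
    struct toy_pdu *free_list;
};

/* Put pdus[0 .. limit-1] on the free list; returns how many were inserted. */
static int toy_init_free_list(struct toy_state *s, int limit)
{
    int i;

    assert(limit >= 0 && limit <= MAX_REQ);
    s->free_list = NULL;
    for (i = 0; i < limit; i++) {
        s->pdus[i].idx = i;
        s->pdus[i].next = s->free_list;   /* insert at head */
        s->free_list = &s->pdus[i];
    }
    return i;
}

/* Take one pdu off the free list, or return NULL if the pool is empty. */
static struct toy_pdu *toy_pdu_alloc(struct toy_state *s)
{
    struct toy_pdu *p = s->free_list;

    if (p)
        s->free_list = p->next;
    return p;
}

/* Count how many allocations succeed before the pool runs dry. */
static int toy_drain(struct toy_state *s)
{
    int n = 0;

    while (toy_pdu_alloc(s))
        n++;
    return n;
}
```

Initializing with toy_init_free_list(&s, MAX_REQ - 1) and then calling
toy_drain(&s) yields 127 successful allocations; with MAX_REQ as the bound it
yields 128.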

I'm not familiar enough with qemu guts to tell if that's a plausible scenario,
though... shouldn't subsequent queue insertions (after enough slots had been
released) simply trigger virtio_queue_notify_vq() again? It *is* a bug
(if we get a burst filling a previously empty queue all at once, there won't
be any slots becoming freed), but that's obviously not the case here -
slots were getting freed, after all.