Re: [PATCH v6 12/13] xen/pvcalls: implement release command

From: Stefano Stabellini
Date: Thu Oct 26 2017 - 14:21:43 EST


On Thu, 26 Oct 2017, Boris Ostrovsky wrote:
> On 10/25/2017 07:00 PM, Stefano Stabellini wrote:
> > On Wed, 25 Oct 2017, Boris Ostrovsky wrote:
> >> On 10/24/2017 01:33 PM, Stefano Stabellini wrote:
> >>> Send PVCALLS_RELEASE to the backend and wait for a reply. Take both
> >>> in_mutex and out_mutex to avoid concurrent accesses. Then, free the
> >>> socket.
> >>>
> >>> For passive sockets, check whether we have already pre-allocated an
> >>> active socket for the purpose of being accepted. If so, free that as
> >>> well.
> >>>
> >>> Signed-off-by: Stefano Stabellini <stefano@xxxxxxxxxxx>
> >>> CC: boris.ostrovsky@xxxxxxxxxx
> >>> CC: jgross@xxxxxxxx
> >>> ---
> >>> drivers/xen/pvcalls-front.c | 100 ++++++++++++++++++++++++++++++++++++++++++++
> >>> drivers/xen/pvcalls-front.h | 1 +
> >>> 2 files changed, 101 insertions(+)
> >>>
> >>> diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
> >>> index 4a413ff..7abc039 100644
> >>> --- a/drivers/xen/pvcalls-front.c
> >>> +++ b/drivers/xen/pvcalls-front.c
> >>> @@ -199,6 +199,23 @@ static irqreturn_t pvcalls_front_event_handler(int irq, void *dev_id)
> >>> static void pvcalls_front_free_map(struct pvcalls_bedata *bedata,
> >>> struct sock_mapping *map, bool locked)
> >>> {
> >>> + int i;
> >>> +
> >>> + unbind_from_irqhandler(map->active.irq, map);
> >>> +
> >>> + if (!locked)
> >>> + spin_lock(&bedata->socket_lock);
> >>> + if (!list_empty(&map->list))
> >>> + list_del_init(&map->list);
> >>> + if (!locked)
> >>> + spin_unlock(&bedata->socket_lock);
> >>> +
> >>> + for (i = 0; i < (1 << PVCALLS_RING_ORDER); i++)
> >>> + gnttab_end_foreign_access(map->active.ring->ref[i], 0, 0);
> >>> + gnttab_end_foreign_access(map->active.ref, 0, 0);
> >>> + free_page((unsigned long)map->active.ring);
> >>> +
> >>> + kfree(map);
> >>> }
> >>>
> >>> static irqreturn_t pvcalls_front_conn_handler(int irq, void *sock_map)
> >>> @@ -966,6 +983,89 @@ unsigned int pvcalls_front_poll(struct file *file, struct socket *sock,
> >>> return ret;
> >>> }
> >>>
> >>> +int pvcalls_front_release(struct socket *sock)
> >>> +{
> >>> + struct pvcalls_bedata *bedata;
> >>> + struct sock_mapping *map;
> >>> + int req_id, notify, ret;
> >>> + struct xen_pvcalls_request *req;
> >>> +
> >> ..
> >>
> >>> +
> >>> + if (map->active_socket) {
> >>> + /*
> >>> + * Set in_error and wake up inflight_conn_req to force
> >>> + * recvmsg waiters to exit.
> >>> + */
> >>> + map->active.ring->in_error = -EBADF;
> >>> + wake_up_interruptible(&map->active.inflight_conn_req);
> >>> +
> >>> + /*
> >>> + * Wait until there are no more waiters on the mutexes.
> >>> + * We know that no new waiters can be added because sk_send_head
> >>> + * is set to NULL -- we only need to wait for the existing
> >>> + * waiters to return.
> >>> + */
> >>> + while (!mutex_trylock(&map->active.in_mutex) ||
> >>> + !mutex_trylock(&map->active.out_mutex))
> >>> + cpu_relax();
> >>> +
> >>> + pvcalls_front_free_map(bedata, map, false);
> >>> + } else {
> >>> + spin_lock(&bedata->socket_lock);
> >>> + if (READ_ONCE(map->passive.inflight_req_id) !=
> >>> + PVCALLS_INVALID_ID) {
> >>> + pvcalls_front_free_map(bedata,
> >>> + map->passive.accept_map, true);
> >>> + }
> >>> + list_del(&map->list);
> >>> + kfree(map);
> >>> + spin_unlock(&bedata->socket_lock);
> >> We have different locking rules in pvcalls_front_free_map() for each of
> >> those clauses in that in the first case we are doing grant table
> >> operations and free_page() without the lock and in the second case we
> >> are holding it. Is it possible to restructure this so that we prune the
> >> lists under the lock (possibly in this routine) and call
> >> pvcalls_front_free_map() lock-less?
> > Yes, it is possible. However, pvcalls_front_free_map is called from a
> > couple of other places (pvcalls_front_accept and pvcalls_front_remove)
> > and we would have to add the code to remove the map from the list there
> > as well. I am not sure it is worth it.
> >
> > I don't have a strong opinion on this. Let me know which way you prefer.
>
>
> I didn't realize this is called from multiple places.
>
> How about
>
> spin_lock(&bedata->socket_lock);
> if (READ_ONCE(map->passive.inflight_req_id) !=
> PVCALLS_INVALID_ID) {
> if (!list_empty(&map->passive.accept_map->list))
> list_del(&map->passive.accept_map->list);
> }
> list_del(&map->list);
> spin_unlock(&bedata->socket_lock);
>
> pvcalls_front_free_map(bedata, map->passive.accept_map, true);
> kfree(map);
>
>
> (I may have messed up list pointers here)
>
> This would be slightly inefficient in that you drop the lock and then grab it again (only to find that the list is empty, presumably) but it makes pvcalls_front_free_map()'s behavior more consistent wrt locking. (Or maybe you will need to pass a bool to pvcalls_front_free_map() to indicate whether the map needs to be removed from the list for other call sites, in which case it will replace the 'locked' argument)

Reading the code again, I realized that we don't even need to call
pvcalls_front_free_map on accept_map with the lock held because with
"sock->sk->sk_send_head = NULL;" at the beginning of the function there
are no risks of concurrent manipulation of accept_map.

We can safely do:

spin_lock(&bedata->socket_lock);
list_del(&map->list);
spin_unlock(&bedata->socket_lock);

if (READ_ONCE(map->passive.inflight_req_id) !=
PVCALLS_INVALID_ID) {
pvcalls_front_free_map(bedata, map->passive.accept_map);
}
kfree(map);

and get rid of the ugly bool lock parameter of pvcalls_front_free_map.