Re: [PATCH] net/p9/trans_fd.c: fix double list_del()

From: Dominique Martinet
Date: Mon Jul 23 2018 - 08:57:21 EST


Tomas Bortoli wrote on Mon, Jul 23, 2018:
> A double list_del(&req->req_list) is possible in p9_fd_cancel() as
> shown by Syzbot. To prevent it we have to ensure that we have the
> client->lock when deleting the list. Furthermore, we have to update
> the status of the request before releasing the lock, to prevent the
> race.

Nice, so no need to change the list_del to list_del_init!
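
(For anyone reading along: list_del_init would only have hidden the
double delete, since it leaves the entry pointing at itself and a second
call becomes a no-op. Below is a minimal userspace sketch of that
behaviour. The node_* helpers and demo_double_del_init are hypothetical
stand-ins written for illustration, not the kernel's list.h.)

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct node {
	struct node *next, *prev;
};

/* An initialised node (or empty list head) points at itself. */
static void node_init(struct node *n)
{
	n->next = n;
	n->prev = n;
}

/* Insert n right after head. */
static void node_add(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static int node_is_linked(const struct node *n)
{
	return n->next != n;
}

/* Same idea as list_del_init: unlink, then re-point the node at itself. */
static void node_del_init(struct node *n)
{
	n->next->prev = n->prev;
	n->prev->next = n->next;
	node_init(n);
}

/* Returns 1 if the list is still consistent after deleting 'a' twice. */
static int demo_double_del_init(void)
{
	struct node head, a, b;

	node_init(&head);
	node_add(&a, &head);
	node_add(&b, &head);

	node_del_init(&a);
	assert(!node_is_linked(&a));
	node_del_init(&a);	/* second delete only touches 'a' itself */

	return head.next == &b && head.prev == &b &&
	       b.next == &head && b.prev == &head &&
	       !node_is_linked(&a);
}
```

The second delete is harmless here, which is exactly why it would paper
over the race instead of fixing it; taking the lock addresses the cause.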

I still have a nitpick on the last moved unlock, but it's mostly
aesthetic - the change looks much better to me now.

(Since that will require a v2 I'll be evil and go further than Yiwen
about the commit message: let it breathe a bit! :) I think a line break
before "furthermore" for example will make it easier to read)

>
> Signed-off-by: Tomas Bortoli <tomasbortoli@xxxxxxxxx>
> Reported-by: syzbot+735d926e9d1317c3310c@xxxxxxxxxxxxxxxxxxxxxxxxx
> ---
> net/9p/trans_fd.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> index a64b01c56e30..370c6c69a05c 100644
> --- a/net/9p/trans_fd.c
> +++ b/net/9p/trans_fd.c
> @@ -199,15 +199,14 @@ static void p9_mux_poll_stop(struct p9_conn *m)
>  static void p9_conn_cancel(struct p9_conn *m, int err)
>  {
>  	struct p9_req_t *req, *rtmp;
> -	unsigned long flags;
>  	LIST_HEAD(cancel_list);
>
>  	p9_debug(P9_DEBUG_ERROR, "mux %p err %d\n", m, err);
>
> -	spin_lock_irqsave(&m->client->lock, flags);
> +	spin_lock(&m->client->lock);
>
>  	if (m->err) {
> -		spin_unlock_irqrestore(&m->client->lock, flags);
> +		spin_unlock(&m->client->lock);
>  		return;
>  	}
>
> @@ -219,7 +218,6 @@ static void p9_conn_cancel(struct p9_conn *m, int err)
>  	list_for_each_entry_safe(req, rtmp, &m->unsent_req_list, req_list) {
>  		list_move(&req->req_list, &cancel_list);
>  	}
> -	spin_unlock_irqrestore(&m->client->lock, flags);
>
>  	list_for_each_entry_safe(req, rtmp, &cancel_list, req_list) {
>  		p9_debug(P9_DEBUG_ERROR, "call back req %p\n", req);
> @@ -228,6 +226,7 @@ static void p9_conn_cancel(struct p9_conn *m, int err)
>  		req->t_err = err;
>  		p9_client_cb(m->client, req, REQ_STATUS_ERROR);
>  	}
> +	spin_unlock(&m->client->lock);
>  }
>
>  static __poll_t
> @@ -370,12 +369,12 @@ static void p9_read_work(struct work_struct *work)
>  		if (m->req->status != REQ_STATUS_ERROR)
>  			status = REQ_STATUS_RCVD;
>  		list_del(&m->req->req_list);
> -		spin_unlock(&m->client->lock);
>  		p9_client_cb(m->client, m->req, status);
>  		m->rc.sdata = NULL;
>  		m->rc.offset = 0;
>  		m->rc.capacity = 0;
>  		m->req = NULL;
> +		spin_unlock(&m->client->lock);

It took me a while to understand why you extended this lock despite
having just read the commit message. I'd suggest:
- moving the spin_unlock to right after p9_client_cb (after all, that's
what we want; the m->rc and m->req don't need to be protected)
- adding a comment before p9_client_cb saying something like 'updates
req->status', or trying to explain why it needs to be locked here when
other transports don't need such a lock (they're not dependent on
req->status like this)
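
To make the two suggestions concrete, here is a rough userspace model of
the ordering I have in mind. It is only a sketch under assumptions: the
model_* names, the fake lock and the simplified status enum are all made
up for illustration and are not the real trans_fd.c code or 9p client
API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the REQ_STATUS_* values. */
enum model_status { MODEL_SENT, MODEL_RCVD, MODEL_ERROR };

/* Fake lock that only records whether it is held (no real concurrency). */
struct model_lock {
	int held;
};

struct model_req {
	enum model_status status;
	int lock_held_at_update;	/* recorded by the callback */
};

struct model_conn {
	struct model_lock lock;		/* plays the role of client->lock */
	struct model_req *req;
	size_t rc_offset;
	size_t rc_capacity;
};

static void model_lock_acquire(struct model_lock *l)
{
	assert(!l->held);
	l->held = 1;
}

static void model_lock_release(struct model_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Stand-in for p9_client_cb(): the part that matters is the status update. */
static void model_client_cb(struct model_conn *m, struct model_req *req,
			    enum model_status status)
{
	req->lock_held_at_update = m->lock.held;
	req->status = status;
}

/* Models the tail of the read worker with the unlock moved up. */
static void model_read_done(struct model_conn *m)
{
	enum model_status status = MODEL_ERROR;
	struct model_req *req = m->req;

	model_lock_acquire(&m->lock);
	if (req->status != MODEL_ERROR)
		status = MODEL_RCVD;
	/* the list_del() of the request would sit here, under the lock */

	/* updates req->status, which the cancel path checks under this lock */
	model_client_cb(m, req, status);
	model_lock_release(&m->lock);

	/* connection-local read state: reset outside the lock */
	m->rc_offset = 0;
	m->rc_capacity = 0;
	m->req = NULL;
}

/* Returns 1 if the status was updated under the lock and state was reset. */
static int demo_read_done(void)
{
	struct model_req req = { MODEL_SENT, 0 };
	struct model_conn m = { { 0 }, &req, 7, 7 };

	model_read_done(&m);
	return req.status == MODEL_RCVD && req.lock_held_at_update == 1 &&
	       m.lock.held == 0 && m.req == NULL && m.rc_offset == 0;
}
```

The point is just that the status update is the only step that has to
stay inside the critical section; the resets after it can stay outside.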

--
Dominique