Re: Re: [PATCH v2] staging/android/ion : fix a race condition in the ion driver

From: EunTaik Lee
Date: Tue Feb 23 2016 - 06:20:06 EST



> From: Laura Abbott [mailto:labbott@xxxxxxxxxx]
> Sent: Saturday, February 20, 2016 5:09 AM
> To: eun.taik.lee@xxxxxxxxxxx; gregkh@xxxxxxxxxxxxxxxxxxx; arve@xxxxxxxxxxx;
> riandrews@xxxxxxxxxxx; sumit.semwal@xxxxxxxxxx; dan.carpenter@xxxxxxxxxx;
> Rohit Kumar <rohit.kr@xxxxxxxxxxx>; sriram@xxxxxxxxxxxxx; shawn.lin@rock-
> chips.com; devel@xxxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> euntaik@xxxxxxxxx
> Subject: Re: [PATCH v2] staging/android/ion : fix a race condition in the
> ion driver
>
> On 02/19/2016 04:03 AM, EunTaik Lee wrote:
> > There is a use-after-free problem in the ion driver.
> > This is caused by a race condition in the ion_ioctl() function.
> >
> > A handle has a ref count of 1 and two tasks on different cpus call
> > ION_IOC_FREE simultaneously.
> >
> > cpu 0                                   cpu 1
> > -------------------------------------------------------
> > ion_handle_get_by_id()
> > (ref == 2)
> >                             ion_handle_get_by_id()
> >                             (ref == 3)
> >
> > ion_free()
> > (ref == 2)
> >
> > ion_handle_put()
> > (ref == 1)
> >
> >                             ion_free()
> >                             (ref == 0 so ion_handle_destroy() is
> >                             called and the handle is freed.)
> >
> >                             ion_handle_put() is called and it
> >                             decreases the slub's next free pointer
> >
> > The problem is detected as an unaligned access in the spin lock
> > functions, since they use the load-exclusive instruction. In some
> > cases it corrupts the slub's free pointer, which causes a misaligned
> > access to the next free pointer (kmalloc returns a pointer like
> > ffffc0745b4580aa). That in turn causes lots of other hard-to-debug
> > problems.
> >
> > This symptom occurs because the first member of the ion_handle
> > structure is the reference count, and the ion driver decrements the
> > reference count after the handle has been freed.
> >
> > To fix this problem, the client->lock mutex is extended to protect all
> > the code that uses the handle.
> >
> > Signed-off-by: Eun Taik Lee <eun.taik.lee@xxxxxxxxxxx>
> > ---
> > changes in v2 :
> > 1. add problem description in the comment
> > 2. fix un-matching mutex_lock/unlock pair in ion_share_dma_buf()
> >
> > drivers/staging/android/ion/ion.c | 102
> ++++++++++++++++++++++++++++++--------
> > 1 file changed, 82 insertions(+), 20 deletions(-)
> >
> > diff --git a/drivers/staging/android/ion/ion.c
> > b/drivers/staging/android/ion/ion.c
> > index e237e9f..c6fbe48 100644
> > --- a/drivers/staging/android/ion/ion.c
> > +++ b/drivers/staging/android/ion/ion.c
> > @@ -385,13 +385,22 @@ static void ion_handle_get(struct ion_handle
> *handle)
> > kref_get(&handle->ref);
> > }
> >
> > +static int ion_handle_put_nolock(struct ion_handle *handle) {
> > + int ret;
> > +
> > + ret = kref_put(&handle->ref, ion_handle_destroy);
> > +
> > + return ret;
> > +}
> > +
>
> the
>
> > static int ion_handle_put(struct ion_handle *handle)
> > {
> > struct ion_client *client = handle->client;
> > int ret;
> >
> > mutex_lock(&client->lock);
> > - ret = kref_put(&handle->ref, ion_handle_destroy);
> > + ret = ion_handle_put_nolock(handle);
> > mutex_unlock(&client->lock);
> >
> > return ret;
> > @@ -415,20 +424,30 @@ static struct ion_handle *ion_handle_lookup(struct
> ion_client *client,
> > return ERR_PTR(-EINVAL);
> > }
> >
> > -static struct ion_handle *ion_handle_get_by_id(struct ion_client
> *client,
> > - int id)
> > +static struct ion_handle *ion_handle_get_by_id_nolock(struct ion_client
> *client,
> > + int id)
> > {
> > struct ion_handle *handle;
> >
> > - mutex_lock(&client->lock);
> > handle = idr_find(&client->idr, id);
> > if (handle)
> > ion_handle_get(handle);
> > - mutex_unlock(&client->lock);
> >
> > return handle ? handle : ERR_PTR(-EINVAL);
> > }
> >
> > +struct ion_handle *ion_handle_get_by_id(struct ion_client *client,
> > + int id)
> > +{
> > + struct ion_handle *handle;
> > +
> > + mutex_lock(&client->lock);
> > + handle = ion_handle_get_by_id_nolock(client, id);
> > + mutex_unlock(&client->lock);
> > +
> > + return handle;
> > +}
> > +
> > static bool ion_handle_validate(struct ion_client *client,
> > struct ion_handle *handle)
> > {
> > @@ -530,7 +549,8 @@ struct ion_handle *ion_alloc(struct ion_client
> *client, size_t len,
> > }
> > EXPORT_SYMBOL(ion_alloc);
> >
> > -void ion_free(struct ion_client *client, struct ion_handle *handle)
> > +static void ion_free_nolock(struct ion_client *client,
> > + struct ion_handle *handle)
> > {
> > bool valid_handle;
> >
> > @@ -538,15 +558,24 @@ void ion_free(struct ion_client *client, struct
> > ion_handle *handle)
> >
> > mutex_lock(&client->lock);
> > valid_handle = ion_handle_validate(client, handle);
> > -
> > if (!valid_handle) {
> > WARN(1, "%s: invalid handle passed to free.\n", __func__);
> > mutex_unlock(&client->lock);
> > return;
> > }
> > + ion_handle_put_nolock(handle);
> > +}
> > +
> > +void ion_free(struct ion_client *client, struct ion_handle *handle) {
> > + BUG_ON(client != handle->client);
> > +
> > + mutex_lock(&client->lock);
> > + ion_free_nolock(client, handle);
> > mutex_unlock(&client->lock);
> > ion_handle_put(handle);
> > }
> > +
> > EXPORT_SYMBOL(ion_free);
> >
>
> This still doesn't look right. ion_handle_put is being called twice on
> ion_free, once in ion_free_nolock and once again right after. Please
> double check this
>
Yes, the extra ion_handle_put() call after ion_free_nolock() in ion_free() shouldn't have been there.
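
For illustration only, here is a minimal userspace sketch of the shape the free path is meant to have: lookup, free and put all happen inside one critical section, and the creation reference is dropped exactly once. This is not the driver code or a later revision of the patch. The toy_* names, the one-entry "idr" and the pthread mutex are all assumptions standing in for ion_handle, client->idr and client->lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ion_handle. */
struct toy_handle {
	int ref;        /* models handle->ref (a kref in the driver) */
	int destroyed;  /* counts how often the destroy step ran */
};

/* Hypothetical stand-in for struct ion_client. */
struct toy_client {
	pthread_mutex_t lock;     /* models client->lock */
	struct toy_handle *slot;  /* a one-entry "idr" */
};

/* Caller must hold c->lock. Dropping the last ref unpublishes the handle. */
static void toy_handle_put_nolock(struct toy_client *c, struct toy_handle *h)
{
	if (--h->ref == 0) {
		c->slot = NULL;   /* models removal from the idr in destroy */
		h->destroyed++;
	}
}

/* Caller must hold c->lock. Returns NULL if the handle is gone. */
static struct toy_handle *toy_handle_get_nolock(struct toy_client *c)
{
	struct toy_handle *h = c->slot;

	if (h)
		h->ref++;
	return h;
}

/* Models the ION_IOC_FREE path with the lock held across all steps. */
static int toy_ioc_free(struct toy_client *c)
{
	struct toy_handle *h;

	pthread_mutex_lock(&c->lock);
	h = toy_handle_get_nolock(c);
	if (!h) {
		pthread_mutex_unlock(&c->lock);
		return -1;
	}
	toy_handle_put_nolock(c, h);  /* the free: drops the creation ref, once */
	toy_handle_put_nolock(c, h);  /* drops the ref taken by the lookup */
	pthread_mutex_unlock(&c->lock);
	return 0;
}
```

With a handle that starts at ref == 1, the first toy_ioc_free() call destroys it and a second call fails the lookup instead of touching the freed handle.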
> > int ion_phys(struct ion_client *client, struct ion_handle *handle,
> > @@ -830,6 +859,7 @@ void ion_client_destroy(struct ion_client *client)
> > struct rb_node *n;
> >
> > pr_debug("%s: %d\n", __func__, __LINE__);
> > + mutex_lock(&client->lock);
> > while ((n = rb_first(&client->handles))) {
> > struct ion_handle *handle = rb_entry(n, struct ion_handle,
> > node);
> > @@ -837,6 +867,7 @@ void ion_client_destroy(struct ion_client *client)
> > }
> >
> > idr_destroy(&client->idr);
> > + mutex_unlock(&client->lock);
> >
>
> The mutex_lock here isn't necessary. This is the client destroy and
> handles are local to a client so there is nothing to protect here. If
> ion_client_destroy is being called on the same client at the same time we
> have bigger issues.
>
>
> > down_write(&dev->lock);
> > if (client->task)
> > @@ -1100,7 +1131,7 @@ static struct dma_buf_ops dma_buf_ops = {
> > .kunmap = ion_dma_buf_kunmap,
> > };
> >
> > -struct dma_buf *ion_share_dma_buf(struct ion_client *client,
> > +static struct dma_buf *ion_share_dma_buf_nolock(struct ion_client
> > +*client,
> > struct ion_handle *handle)
> > {
> > DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
> > @@ -1108,7 +1139,6 @@ struct dma_buf *ion_share_dma_buf(struct
> ion_client *client,
> > struct dma_buf *dmabuf;
> > bool valid_handle;
> >
> > - mutex_lock(&client->lock);
> > valid_handle = ion_handle_validate(client, handle);
> > if (!valid_handle) {
> > WARN(1, "%s: invalid handle passed to share.\n", __func__);
> @@
> > -1117,7 +1147,6 @@ struct dma_buf *ion_share_dma_buf(struct ion_client
> *client,
> > }
> > buffer = handle->buffer;
> > ion_buffer_get(buffer);
> > - mutex_unlock(&client->lock);
> >
> > exp_info.ops = &dma_buf_ops;
> > exp_info.size = buffer->size;
> > @@ -1132,14 +1161,26 @@ struct dma_buf *ion_share_dma_buf(struct
> > ion_client *client,
> >
> > return dmabuf;
> > }
> > +
> > +struct dma_buf *ion_share_dma_buf(struct ion_client *client,
> > + struct ion_handle *handle)
> > +{
> > + struct dma_buf *dmabuf;
> > +
> > + mutex_lock(&client->lock);
> > + dmabuf = ion_share_dma_buf_nolock(client, handle);
> > + mutex_unlock(&client->lock);
> > + return dmabuf;
> > +}
> > EXPORT_SYMBOL(ion_share_dma_buf);
> >
> > -int ion_share_dma_buf_fd(struct ion_client *client, struct ion_handle
> > *handle)
> > +static int ion_share_dma_buf_fd_nolock(struct ion_client *client,
> > + struct ion_handle *handle)
> > {
> > struct dma_buf *dmabuf;
> > int fd;
> >
> > - dmabuf = ion_share_dma_buf(client, handle);
> > + dmabuf = ion_share_dma_buf_nolock(client, handle);
> > if (IS_ERR(dmabuf))
> > return PTR_ERR(dmabuf);
> >
> > @@ -1149,6 +1190,17 @@ int ion_share_dma_buf_fd(struct ion_client
> > *client, struct ion_handle *handle)
> >
> > return fd;
> > }
> > +
> > +int ion_share_dma_buf_fd(struct ion_client *client, struct ion_handle
> > +*handle) {
> > + int fd;
> > +
> > + mutex_lock(&client->lock);
> > + fd = ion_share_dma_buf_fd_nolock(client, handle);
> > + mutex_unlock(&client->lock);
> > +
> > + return fd;
> > +}
> > EXPORT_SYMBOL(ion_share_dma_buf_fd);
> >
> > struct ion_handle *ion_import_dma_buf(struct ion_client *client, int
> > fd) @@ -1281,11 +1333,16 @@ static long ion_ioctl(struct file *filp,
> unsigned int cmd, unsigned long arg)
> > {
> > struct ion_handle *handle;
> >
> > - handle = ion_handle_get_by_id(client, data.handle.handle);
> > - if (IS_ERR(handle))
> > + mutex_lock(&client->lock);
> > + handle = ion_handle_get_by_id_nolock(client,
> > + data.handle.handle);
> > + if (IS_ERR(handle)) {
> > + mutex_unlock(&client->lock);
> > return PTR_ERR(handle);
> > - ion_free(client, handle);
> > - ion_handle_put(handle);
> > + }
> > + ion_free_nolock(client, handle);
> > + ion_handle_put_nolock(handle);
> > + mutex_unlock(&client->lock);
> > break;
> > }
> > case ION_IOC_SHARE:
> > @@ -1293,11 +1350,16 @@ static long ion_ioctl(struct file *filp,
> unsigned int cmd, unsigned long arg)
> > {
> > struct ion_handle *handle;
> >
> > - handle = ion_handle_get_by_id(client, data.handle.handle);
> > - if (IS_ERR(handle))
> > + mutex_lock(&client->lock);
> > + handle = ion_handle_get_by_id_nolock(client,
> > + data.handle.handle);
> > + if (IS_ERR(handle)) {
> > + mutex_unlock(&client->lock);
> > return PTR_ERR(handle);
> > - data.fd.fd = ion_share_dma_buf_fd(client, handle);
> > - ion_handle_put(handle);
> > + }
> > + data.fd.fd = ion_share_dma_buf_fd_nolock(client, handle);
> > + ion_handle_put_nolock(handle);
> > + mutex_unlock(&client->lock);
> > if (data.fd.fd < 0)
> > ret = data.fd.fd;
> > break;
> >
>
> I don't think this is necessary. We had the race in ION_IOC_FREE because
> the free operation didn't happen atomically. It was possible to have two
> different threads destroying the handle at the same time. With
> ION_IOC_MAP/ION_IOC_SHARE, ion_handle_get_by_id will get a reference so
> assuming there are no other races, that should ensure the handle will not
> be destroyed.
>
> Is there another race you can see in the code that I missed?
>
I was thinking of the case where ion_client_destroy is called while
ION_IOC_MAP/ION_IOC_SHARE is executing. But I don't think that can happen,
so I agree that we don't need to protect ION_IOC_MAP/ION_IOC_SHARE and
ion_client_destroy with the mutex.
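
To make the point about the held reference concrete, here is a small userspace sketch, assuming (as above) that there are no other races. The demo_* names, the one-entry "idr" and the pthread mutex are hypothetical stand-ins, not the ion API. The lookup takes a reference under the lock, so a free that runs in between only drops the count to 1, and the handle stays alive until the final put.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ion_handle. */
struct demo_handle {
	int ref;        /* models handle->ref */
	int destroyed;  /* counts how often the destroy step ran */
	int payload;    /* models data the share path reads, e.g. the buffer */
};

/* Hypothetical stand-in for struct ion_client. */
struct demo_client {
	pthread_mutex_t lock;      /* models client->lock */
	struct demo_handle *slot;  /* a one-entry "idr" */
};

/* Caller must hold c->lock. */
static void demo_put_nolock(struct demo_client *c, struct demo_handle *h)
{
	if (--h->ref == 0) {
		c->slot = NULL;   /* unpublish on destroy */
		h->destroyed++;
	}
}

/* Models ion_handle_get_by_id(): lookup plus get under the lock. */
static struct demo_handle *demo_get_by_id(struct demo_client *c)
{
	struct demo_handle *h;

	pthread_mutex_lock(&c->lock);
	h = c->slot;
	if (h)
		h->ref++;
	pthread_mutex_unlock(&c->lock);
	return h;
}

/* Models ion_handle_put(): drop one reference under the lock. */
static void demo_put(struct demo_client *c, struct demo_handle *h)
{
	pthread_mutex_lock(&c->lock);
	demo_put_nolock(c, h);
	pthread_mutex_unlock(&c->lock);
}

/* Models a free issued by another thread: drop the creation reference. */
static int demo_free(struct demo_client *c)
{
	struct demo_handle *h;

	pthread_mutex_lock(&c->lock);
	h = c->slot;
	if (!h) {
		pthread_mutex_unlock(&c->lock);
		return -1;
	}
	demo_put_nolock(c, h);
	pthread_mutex_unlock(&c->lock);
	return 0;
}
```

In this model the share path would call demo_get_by_id(), use the handle, then call demo_put(). A demo_free() that lands between those two calls cannot destroy the handle.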

Thanks,
Euntaik