Re: [PATCH] zsmalloc: Fix races between asynchronous zspage free and page migration

From: Sultan Alsawaf
Date: Wed May 11 2022 - 17:45:40 EST


On Wed, May 11, 2022 at 02:07:19PM -0700, Minchan Kim wrote:
> Then, how about this?

Your proposal is still completely wrong. My original patch is fine; we can stick
with that.

> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 9152fbde33b5..2f205c18aee4 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1716,12 +1716,31 @@ static enum fullness_group putback_zspage(struct size_class *class,
> * To prevent zspage destroy during migration, zspage freeing should
> * hold locks of all pages in the zspage.
> */
> -static void lock_zspage(struct zspage *zspage)
> +static void lock_zspage(struct zs_pool *pool, struct zspage *zspage)
> {
> - struct page *page = get_first_page(zspage);
> -
> + struct page *page;
> + int nr_locked;
> + struct page *locked_pages[ZS_MAX_PAGES_PER_ZSPAGE];
> + struct address_space *mapping;
> +retry:
> + nr_locked = 0;
> + memset(locked_pages, 0, sizeof(struct page) * ARRAY_SIZE(locked_pages));

This memset() zeroes out memory past the end of the array: locked_pages is an
array of pointers, not an array of page structs, so the length should be
sizeof(struct page *) * ARRAY_SIZE(locked_pages), or simply
sizeof(locked_pages).

> + page = get_first_page(zspage);

You can't use get_first_page() outside of the migrate lock.

> do {
> lock_page(page);

You can't lock a page that you don't own.

> + locked_pages[nr_locked++] = page;
> + /*
> + * subpage in the zspage could be migrated under us so
> + * verify it. Once it happens, retry the lock sequence.
> + */
> + mapping = page_mapping(page)

This doesn't compile.

> + if (mapping != pool->inode->i_mapping ||
> + page_private(page) != (unsigned long)zspage) {
> + do {
> + unlock_page(locked_pages[--nr_locked]);
> + } while (nr_locked > 0)

This doesn't compile.

> + goto retry;
> + }

There's no need to unlock all of the pages that were locked so far because once
a page is locked, it cannot be migrated.

> } while ((page = get_next_page(page)) != NULL);
> }

You can't use get_next_page() outside of the migrate lock.
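For reference, the safe scheme is to walk the zspage's page list only while holding migrate_read_lock(), use trylock_page() so we never sleep on a page lock while holding the migrate lock, and pin a contended page with get_page() before dropping the migrate lock to wait on it. Roughly (this is the shape of my original patch, sketched from memory rather than copied verbatim):

```c
static void lock_zspage(struct zspage *zspage)
{
	struct page *curr_page, *page;

	/*
	 * Pages we haven't locked yet can be migrated off the zspage while
	 * we're trying to lock them, so each page must be looked up under
	 * migrate_read_lock(). Since we can't sleep on a page lock while
	 * holding the migrate lock, trylock first; on failure, take a
	 * reference so the page can't be freed, drop the migrate lock,
	 * and wait for the page to unlock before retrying.
	 */
	while (1) {
		migrate_read_lock(zspage);
		page = get_first_page(zspage);
		if (trylock_page(page))
			break;
		get_page(page);
		migrate_read_unlock(zspage);
		wait_on_page_locked(page);
		put_page(page);
	}

	curr_page = page;
	while ((page = get_next_page(curr_page))) {
		if (trylock_page(page)) {
			curr_page = page;
		} else {
			get_page(page);
			migrate_read_unlock(zspage);
			wait_on_page_locked(page);
			put_page(page);
			migrate_read_lock(zspage);
		}
	}
	migrate_read_unlock(zspage);
}
```

Note that already-locked pages never need to be unlocked on retry, since a locked page cannot be migrated out from under us.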

>
> @@ -1987,7 +2006,7 @@ static void async_free_zspage(struct work_struct *work)
>
> list_for_each_entry_safe(zspage, tmp, &free_pages, list) {
> list_del(&zspage->list);
> - lock_zspage(zspage);
> + lock_zspage(pool, zspage);
>
> get_zspage_mapping(zspage, &class_idx, &fullness);
> VM_BUG_ON(fullness != ZS_EMPTY);

Sultan