Re: [PATCH] mm: kill frontswap

From: Johannes Weiner
Date: Wed Jul 19 2023 - 10:28:39 EST


Hi Yosry,

thanks for the review. I hope I caught everything you commented on ;) -
could you please trim your replies to the relevant hunks?

On Tue, Jul 18, 2023 at 11:52:45AM -0700, Yosry Ahmed wrote:
> On Mon, Jul 17, 2023 at 9:02 AM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
> > -/*
> > - * "Get" data from frontswap associated with swaptype and offset that were
> > - * specified when the data was put to frontswap and use it to fill the
> > - * specified page with data. Page must be locked and in the swap cache.
> > - */
> > -int __frontswap_load(struct page *page)
> > -{
> > - int ret = -1;
> > - swp_entry_t entry = { .val = page_private(page), };
> > - int type = swp_type(entry);
> > - struct swap_info_struct *sis = swap_info[type];
> > - pgoff_t offset = swp_offset(entry);
> > - bool exclusive = false;
> > -
> > - VM_BUG_ON(!frontswap_ops);
> > - VM_BUG_ON(!PageLocked(page));
> > - VM_BUG_ON(sis == NULL);
> > -
> > - if (!__frontswap_test(sis, offset))
> > - return -1;
>
> With the removal of the above, it will be a bit slower to realize an
> entry is not in zswap and read it from disk (bitmask test vs. rbtree
> lookup). I guess in the swapin path (especially from disk), it would
> not matter much in practice. Just a note (mostly to myself).

I briefly considered moving that bitmap to zswap, but it actually
seems quite backwards. It adds overhead to the fast path, where
entries are in-cache, in order to optimize the cold path that requires
IO. As long as compression is faster than IO, zswap is expected to see
the (much) bigger share of transactions in any sane config.
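
To illustrate (rough sketch from my reading of mm/zswap.c, not code
from this patch): with the bitmap gone, a miss is only discovered by
the rbtree lookup under the tree lock in zswap_frontswap_load():

	/* find */
	spin_lock(&tree->lock);
	entry = zswap_entry_find_get(&tree->rbroot, offset);
	if (!entry) {
		/* miss: never stored in zswap, or already invalidated */
		spin_unlock(&tree->lock);
		return -1;	/* fall back to reading the swap device */
	}
	spin_unlock(&tree->lock);

The extra lookup is noise next to the device read that follows a
miss, while hits and stores no longer have to maintain a bitmap.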

> > @@ -1356,15 +1342,12 @@ static int zswap_frontswap_store(unsigned type, pgoff_t offset,
> >
> > /* map */
> > spin_lock(&tree->lock);
> > - do {
> > - ret = zswap_rb_insert(&tree->rbroot, entry, &dupentry);
> > - if (ret == -EEXIST) {
> > - zswap_duplicate_entry++;
> > - /* remove from rbtree */
> > - zswap_rb_erase(&tree->rbroot, dupentry);
> > - zswap_entry_put(tree, dupentry);
> > - }
> > - } while (ret == -EEXIST);
> > + while (zswap_rb_insert(&tree->rbroot, entry, &dupentry) == -EEXIST) {
> > + zswap_duplicate_entry++;
> > + /* remove from rbtree */
> > + zswap_rb_erase(&tree->rbroot, dupentry);
> > + zswap_entry_put(tree, dupentry);
>
> nit: it would be nice to replace the above two lines with
> zswap_invalidate_entry(), which also keeps it clear that we maintain
> the frontswap semantics of invalidating a duplicated entry.

Agreed, that's better. I'll send a follow-up.
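
For the record, the follow-up would be something like this (sketch
only, assuming zswap_invalidate_entry() keeps its current (tree,
entry) signature):

	while (zswap_rb_insert(&tree->rbroot, entry, &dupentry) == -EEXIST) {
		zswap_duplicate_entry++;
		/* same erase + put as before, spelled as an invalidation */
		zswap_invalidate_entry(tree, dupentry);
	}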

> > @@ -1418,7 +1401,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
> > if (!entry) {
> > /* entry was written back */
>
> nit: the above comment is now obsolete. We may not find the entry
> because it was never stored in zswap in the first place (since we
> dropped the frontswap map, we won't know before we do the lookup
> here).

Good catch. I'll send a delta fix to Andrew.
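
Probably something along these lines (sketch of the wording only, not
the actual delta):

	-		/* entry was written back */
	+		/* entry was never stored in zswap, or was invalidated */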

> LGTM with a few nits above, probably they can be done in follow up
> patches. Thanks for the cleanup!
>
> FWIW:
> Acked-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>

Thanks!

Andrew, could you please fold this in?

---