Re: [PATCH v1] mm: Enable suspend-only swap spaces

From: Evan Green
Date: Wed Jul 07 2021 - 18:22:57 EST


On Mon, Jul 5, 2021 at 12:44 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 30.06.21 19:07, Evan Green wrote:
> > Currently it's not possible to enable hibernation without also enabling
> > generic swap for a given swap area. These two use cases are not the
> > same. For example there may be users who want to enable hibernation,
> > but whose drives don't have the write endurance for generic swap
> > activities.
> >
> > Add a new SWAP_FLAG_NOSWAP that adds a swap region but refuses to allow
> > generic swapping to it. This region can still be wired up for use in
> > suspend-to-disk activities, but will never have regular pages swapped to
> > it.
>
> Just to confirm: things like /proc/meminfo won't show this "swap that's
> not actually swap" as free/total swap, correct? Maybe it's worth
> spelling the expected system behavior out here.

Currently these noswap regions do still count in /proc/meminfo. I
suppose, as you say, it makes more sense for them not to be counted.
I should be able to carefully put some conditionals around the
nr_swap_pages and total_swap_pages accounting to fix that. I'll also
spell out the expected behavior in the commit text as suggested.
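
Roughly what I have in mind is the untested sketch below, against
_enable_swap_info() in mm/swapfile.c. SWP_NOSWAP is just a placeholder
for whatever internal flag the v2 patch ends up defining for
SWAP_FLAG_NOSWAP areas:

        static void _enable_swap_info(struct swap_info_struct *p)
        {
                p->flags |= SWP_WRITEOK;
                /*
                 * Hide NOSWAP areas from the global accounting so they
                 * don't show up as free/total swap in /proc/meminfo.
                 */
                if (!(p->flags & SWP_NOSWAP)) {
                        atomic_long_add(p->pages, &nr_swap_pages);
                        total_swap_pages += p->pages;
                }
                ...
        }

The swapoff path would then need the matching conditional around its
subtraction from those two counters.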

When looking at that, I realized something else. Hibernate uses
swap_free() to release its image pages, which ends up in
free_swap_slot(). That may park the swap entry in a per-cpu
swap_slots_cache, from where it could leak back into general usage.
I'm thinking I should just call swap_entry_free() directly if NOSWAP
is set. I gave that a quick test and so far it looks good.
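
For reference, the path I'm worried about looks roughly like this
(quoting mm/swapfile.c from memory, elided for brevity):

        void swap_free(swp_entry_t entry)
        {
                struct swap_info_struct *p;

                p = _swap_info_get(entry);
                if (p)
                        __swap_entry_free(p, entry);
        }

        static unsigned char __swap_entry_free(struct swap_info_struct *p,
                                               swp_entry_t entry)
        {
                ...
                usage = __swap_entry_free_locked(p, offset, 1);
                ...
                if (!usage)
                        /* Parks the entry in a per-cpu cache instead
                         * of freeing it back to the device directly. */
                        free_swap_slot(entry);

                return usage;
        }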

Other random musings I had while staring at this code:

It surprised me that there are both swap_entry_free() and
__swap_entry_free(), but the one without underscores is the
lower-level one (i.e. __swap_entry_free() winds through the slots
cache, while swap_entry_free() just frees the entry directly). I'm not
really sure whether renaming those is worth the churn; leaning towards
no.

It's also interesting that scan_swap_map_slots() chooses whether or
not to attempt reclaim based on vm_swap_full(), which returns true if
swap globally is more than 50% full. But hibernate is restricted to a
single swap device, so you could find yourself in a situation where
the hibernate device is full-but-reclaimable while the other swap
areas aren't very full. That might cause hibernation to fail because
we never attempt to reclaim swap. Maybe this never comes up in
practice because people don't use multiple swap devices, or maybe the
swap load naturally tends to spread evenly enough that looking at the
global counts is roughly equivalent to looking at a single device.
Shower thoughts. I'll keep this in mind during my own testing to see
if it ever comes up.
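
For reference, vm_swap_full() today only looks at the global counters
(roughly, from include/linux/swap.h). The second helper below is
purely hypothetical, not something in the tree or in this patch, just
to illustrate what a per-device version of the same heuristic could
look like:

        /* Today: global only -- more than half of all swap in use? */
        static inline bool vm_swap_full(void)
        {
                return atomic_long_read(&nr_swap_pages) * 2 <
                        total_swap_pages;
        }

        /*
         * Hypothetical per-device flavor: the same 50% heuristic, but
         * against a single swap_info_struct's own counters.
         */
        static inline bool swap_device_full(struct swap_info_struct *si)
        {
                return si->inuse_pages * 2 > si->pages;
        }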

Thanks for the review, I'll plan to post a v2 in the next couple days.
-Evan