Re: [PATCH -mm] mm, swap: Fix race between swapoff and some swap operations

From: Andrew Morton
Date: Thu Dec 07 2017 - 19:29:45 EST


On Thu, 7 Dec 2017 09:14:26 +0800 "Huang, Ying" <ying.huang@xxxxxxxxx> wrote:

> When a swapin is performed, the PTL (page table lock) is released
> after the swap entry information has been read from the page table.
> The system then swaps in the entry without holding any lock that
> would prevent swapoff on the swap device. This can cause a race
> like the one below:
>
> CPU 1                         CPU 2
> -----                         -----
>                               do_swap_page
>                                 swapin_readahead
>                                   __read_swap_cache_async
> swapoff                             swapcache_prepare
>   p->swap_map = NULL                  __swap_duplicate
>                                         p->swap_map[?] /* !!! NULL pointer access */
>
> Because swapoff is usually done only at system shutdown, the race
> may not hit many people in practice. But it is still a race that
> needs to be fixed.

swapoff is so rare that it's hard to get motivated about any fix which
adds overhead to the regular codepaths.

Is there something we can do to ensure that all the overhead of this
fix is placed into the swapoff side? stop_machine() may be a bit
brutal, but a surprising amount of code uses it. Any other ideas?
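
To make the question concrete, here is a minimal userspace sketch of one
way the waiting could be pushed onto the swapoff side. It is an analogue
in C11 atomics, not kernel code and not the patch under discussion. All
names (struct swap_dev, dev_enter(), dev_exit(), dev_swapoff(), swap_dup(),
NR_SLOTS) are hypothetical, and each "slot" stands in for a CPU. The
reader marks its own slot and checks a valid flag; swapoff clears the
flag, waits for every slot to drain, and only then frees the map. Note
that the fast path still pays for one store and a full barrier, so this
reduces rather than eliminates the regular-path cost.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_SLOTS 4                      /* hypothetical stand-in for the CPU count */

struct reader_slot {
    _Alignas(64) atomic_bool active;    /* one cache line per slot, no sharing */
};

struct swap_dev {
    atomic_bool valid;                  /* cleared first by dev_swapoff() */
    unsigned char *swap_map;            /* per-entry use counts */
    struct reader_slot slot[NR_SLOTS];
};

static int dev_init(struct swap_dev *d, size_t nr_entries)
{
    d->swap_map = calloc(nr_entries, 1);
    if (!d->swap_map)
        return -1;
    for (int i = 0; i < NR_SLOTS; i++)
        atomic_init(&d->slot[i].active, false);
    atomic_init(&d->valid, true);
    return 0;
}

/* Fast path: publish "I am inside", then check that the device is still live.
 * Both operations are seq_cst, so either this reader sees valid == false or
 * dev_swapoff() sees active == true and waits for it. */
static bool dev_enter(struct swap_dev *d, int slot)
{
    atomic_store(&d->slot[slot].active, true);
    if (!atomic_load(&d->valid)) {
        atomic_store(&d->slot[slot].active, false);
        return false;
    }
    return true;
}

static void dev_exit(struct swap_dev *d, int slot)
{
    atomic_store(&d->slot[slot].active, false);
}

/* Slow path: all of the waiting happens here. */
static void dev_swapoff(struct swap_dev *d)
{
    atomic_store(&d->valid, false);
    for (int i = 0; i < NR_SLOTS; i++)
        while (atomic_load(&d->slot[i].active))
            sched_yield();              /* wait for in-flight readers to leave */
    free(d->swap_map);
    d->swap_map = NULL;
}

/* Analogue of __swap_duplicate(): touch swap_map only while inside. */
static int swap_dup(struct swap_dev *d, int slot, size_t offset)
{
    assert(slot >= 0 && slot < NR_SLOTS);
    if (!dev_enter(d, slot))
        return -1;                      /* device is going away; caller backs off */
    d->swap_map[offset]++;
    dev_exit(d, slot);
    return 0;
}
```

A caller that gets -1 from swap_dup() would treat the entry as stale and
retry the fault, instead of dereferencing a freed swap_map. In the kernel
the same shape would more likely be built on RCU or stop_machine() than on
hand-rolled flags; that choice is left open here.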