Re: Regression: x86/mm: new _PTE_SWP_SOFT_DIRTY bit conflicts with existing use

From: Cyrill Gorcunov
Date: Wed Aug 21 2013 - 19:06:07 EST


On Wed, Aug 21, 2013 at 09:30:03AM -0700, Linus Torvalds wrote:
> Quite frankly, unless I see a patch later today that is
>
> (a) obvious
> (b) explains what is going on
> (c) tested
>
> I will be reverting the whole soft-dirty mess. I thought the
> bit-mapping games it played were already too complicated (the patch to
> pgtable-2level.h in commit 41bb3476b361 just makes me want to barf and
> came in very late, so I'm not positive about the whole soft-dirty mess
> in the first place). I really am not at all inclined to want to play
> games in this area any more. It's too damn late in the release window.

Hi all, I worked on a patch which would not touch the PSE bit for dirty page
tracking, and the result is not that good:

- with 2level pages a page is now always marked dirty if it is swapped out and
back in, because there is no space left in the PTE (other than the PSE bit)

- only the 3level page scheme uses the high 32 bits to keep the offset of a swap
entry; x86-64 shifts the offset up to bit _PAGE_BIT_GLOBAL + 1, thus I would
need some different bit, not unified with anything else, for no good reason :(

Summarizing all things:

- Using the PSE bit in swap entries as the indicator of a soft dirty page is safe,
because swap entries are saved in the pte as non-present, and when a #pf happens
the kernel generates a valid pte entry from vma->vm_page_prot

- The __swp_entry() helper clears the PSE bit explicitly, so even without the
softdirty patch it is not saved once a page reaches swap (with softdirty
tracking we simply reuse this bit for our own needs).

- Using the PSE bit allows us to not modify the swap encoding on all 3 page
schemes (2level, 3level, 4level), because it is a spare bit there which does not
intersect with the swap format.
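
To illustrate the last point, here is a small user-space model of the x86-64
case. It is only a sketch: the bit numbers are the architectural x86 PTE bits,
the offset shift follows the "_PAGE_BIT_GLOBAL + 1" rule mentioned above, but
the 5-bit type field and all model_* names are assumptions made for this
example, not the kernel's real helpers.

```c
#include <assert.h>
#include <stdint.h>

/* x86 PTE bit positions (architectural). */
#define PTE_BIT_PRESENT 0
#define PTE_BIT_DIRTY   6
#define PTE_BIT_PSE     7
#define PTE_BIT_GLOBAL  8

/*
 * Assumed, simplified x86-64 swap layout for this model:
 * swap type in bits 1..5, swap offset starting at bit GLOBAL + 1.
 */
#define SWP_TYPE_SHIFT   (PTE_BIT_PRESENT + 1)
#define SWP_TYPE_BITS    5
#define SWP_TYPE_MASK    ((UINT64_C(1) << SWP_TYPE_BITS) - 1)
#define SWP_OFFSET_SHIFT (PTE_BIT_GLOBAL + 1)

/* The bit under discussion: PSE reused as "soft dirty" in a swap entry. */
#define SWP_SOFT_DIRTY   (UINT64_C(1) << PTE_BIT_PSE)

/* Hypothetical helper: build a non-present swap pte from type and offset. */
static uint64_t model_swp_pte(unsigned type, uint64_t offset)
{
    assert(type <= SWP_TYPE_MASK);
    return ((uint64_t)type << SWP_TYPE_SHIFT) | (offset << SWP_OFFSET_SHIFT);
}

/* Hypothetical helper: extract the swap type from a swap pte. */
static unsigned model_swp_type(uint64_t pte)
{
    return (unsigned)((pte >> SWP_TYPE_SHIFT) & SWP_TYPE_MASK);
}

/* Hypothetical helper: extract the swap offset from a swap pte. */
static uint64_t model_swp_offset(uint64_t pte)
{
    return pte >> SWP_OFFSET_SHIFT;
}
```

In this model, setting SWP_SOFT_DIRTY on the value returned by model_swp_pte()
changes neither the decoded type nor the decoded offset, since bit 7 lies
between the two fields.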

Thus I would *_really_* like to keep the current scheme. Probably I should add a
comment to the header where _PAGE_SWP_SOFT_DIRTY is defined, saying that it is
valid only when the PRESENT bit is clear? Similar to

/* If _PAGE_BIT_PRESENT is clear, we use these: */
/* - if the user mapped it with PROT_NONE; pte_present gives true */
#define _PAGE_BIT_PROTNONE _PAGE_BIT_GLOBAL
/* - set: nonlinear file mapping, saved PTE; unset:swap */
#define _PAGE_BIT_FILE _PAGE_BIT_DIRTY

Have I convinced you guys?

The original problem report came from the impression that this PSE bit may be
touched (set and cleared) on a present PTE, but that is not the case for pages
being swapped.
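
The distinction can be sketched as follows. This is a simplified model, not
kernel code: the model_* functions are hypothetical, and "swap entry" here just
means that the PRESENT, PROTNONE and FILE bits from the header excerpt above
are all clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT  (UINT64_C(1) << 0)
#define PTE_FILE     (UINT64_C(1) << 6)  /* shares the DIRTY bit position */
#define PTE_PSE      (UINT64_C(1) << 7)
#define PTE_PROTNONE (UINT64_C(1) << 8)  /* shares the GLOBAL bit position */

/* Bit 7 must not overlap the bits that decide what kind of entry this is. */
static_assert((PTE_PSE & (PTE_PRESENT | PTE_FILE | PTE_PROTNONE)) == 0,
              "PSE overlaps a state bit");

/* Hypothetical: PROT_NONE mappings count as present, as in the excerpt. */
static bool model_pte_present(uint64_t pte)
{
    return (pte & (PTE_PRESENT | PTE_PROTNONE)) != 0;
}

/* Hypothetical: bit 7 means "soft dirty" only in a non-present, non-file pte. */
static bool model_swp_soft_dirty(uint64_t pte)
{
    return !model_pte_present(pte) && !(pte & PTE_FILE) && (pte & PTE_PSE);
}

/* Hypothetical: with PRESENT set, bit 7 keeps its usual PSE meaning. */
static bool model_pse_set(uint64_t pte)
{
    return (pte & PTE_PRESENT) && (pte & PTE_PSE);
}
```

So the same bit is read in two different ways, and which reading applies is
decided only by the present state of the entry.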