Re: [PATCH 04/31] mm/pgtable: allow pte_offset_map[_lock]() to fail

From: Qi Zheng
Date: Tue May 23 2023 - 23:12:13 EST




On 2023/5/24 10:22, Hugh Dickins wrote:
On Mon, 22 May 2023, Qi Zheng wrote:
On 2023/5/22 12:53, Hugh Dickins wrote:

[...]

@@ -229,3 +231,57 @@ pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long address,
}
#endif
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+
+pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp)
+{
+ pmd_t pmdval;
+
+ /* rcu_read_lock() to be added later */
+ pmdval = pmdp_get_lockless(pmd);
+ if (pmdvalp)
+ *pmdvalp = pmdval;
+ if (unlikely(pmd_none(pmdval) || is_pmd_migration_entry(pmdval)))
+ goto nomap;
+ if (unlikely(pmd_trans_huge(pmdval) || pmd_devmap(pmdval)))
+ goto nomap;

Will the follow-up patch deal with the above situation specially?

No, the follow-up patch will only insert the rcu_read_lock() and unlock();
and do something (something!) about the PAE mismatched halves case.

Otherwise, maybe it can be changed to the following check method?

if (unlikely(pmd_none(pmdval) || pmd_leaf(pmdval)))
goto nomap;

Maybe, but I'm not keen. Partly just because pmd_leaf() is quite a
(good) new invention (I came across a few instances in updating to
the current tree), whereas here I'm just following the old examples,
from zap_pmd_range() etc. I'd have to spend a while getting to know
pmd_leaf(), and its interaction with strange gotchas like pmd_present().

And partly because I do want as many corrupt cases as possible to
reach the pmd_bad() check below, so that they generate a warning (and
get cleared).
I might be wrong, I haven't checked through the architectures and how
pmd_leaf() is implemented in each, but my guess is that pmd_leaf()
will tend to miss the pmd_bad() check.

IIUC, pmd_leaf() only checks for a leaf-mapped PMD, and does not
cover the pmd_bad() case. See the examples in vmalloc_to_page()
and apply_to_pmd_range().


But if you can demonstrate a performance improvement from using
pmd_leaf() there, I expect many people would prefer that improvement
to badness catching: send a patch later to make that change if it's
justified.

Probably not much of a performance gain; it just makes the check more
concise.

Thanks,
Qi


Thanks a lot for all your attention to these.

Hugh


+ if (unlikely(pmd_bad(pmdval))) {
+ pmd_clear_bad(pmd);
+ goto nomap;
+ }
+ return __pte_map(&pmdval, addr);
+nomap:
+ /* rcu_read_unlock() to be added later */
+ return NULL;
+}

--
Thanks,
Qi