Re: [RESEND PATCH v1 1/5] mm: vmalloc must set pte via arch code

From: Lorenzo Stoakes
Date: Mon May 15 2023 - 07:25:45 EST


On Mon, May 15, 2023 at 09:29:16AM +0100, Ryan Roberts wrote:
> Hi Lorenzo,
>
> Thanks for the review - I appreciate it!
>
>
> On 13/05/2023 14:14, Lorenzo Stoakes wrote:
> > You've not cc'd the vmalloc reviewers, including the author of 3e9a9e256b1e
> > whose patch you purport to fix. Please remember to run get_maintainers.pl
> > on all files you patch and cc them at least on relevant patches.
> >
> > Have added Christoph + Uladzislau as cc.
>
> I did run get_maintainers.pl, but it gave me 82 names. I assumed I wouldn't be
> making any friends by CCing everyone, so tried to choose what I thought was a
> sensible base. I guess I didn't quite get it right. Sorry about that. Thanks for
> noticing and adding the right people.

Right you mean across the whole of the patch set? Different people have
different approaches as to how to cc patch sets as a whole, but it's not
optional to include maintainers and reviewers on patches, so you should at least
cc- them on individual patches.

It's ok, it's really easy to mess this up, I have managed every variant of doing
this the wrong way myself... :)

>
> >
> > You'll definitely want an ack from Christoph on this!
> >
> > On Thu, May 11, 2023 at 02:21:09PM +0100, Ryan Roberts wrote:
> >> It is bad practice to directly set pte entries within a pte table.
> >> Instead all modifications must go through arch-provided helpers such as
> >> set_pte_at() to give the arch code visibility and allow it to validate
> >> (and potentially modify) the operation.
> >
> > This does make sense, and I see for example in xtensa that an arch-specific
> > instruction is issued under certain circumstances so I do suspect we should
> > do this.
>
> arm64 provides another example, where barriers are required to ensure the page
> table walker sees the new pte and no fault is raised. See
> arch/arm64/include/asm/pgtable.h:set_pte() (which is called by its
> implementation of set_pte_at()).

Ack, yeah I do think your patch is correct.

>
> >
> > As for validation, the function never indicates an error, so only in the
> > sense that a WARN_ON() could _in theory_ trigger is it being
> > validated. This might be quite a nitty point :) as set_pte_at() has no
> > means of indicating an error. But maybe to be pedantic 'check' rather than
> > 'validate'?
>
> I'm sorry, I'm not sure what you are asking here? set_pte_at() forms part of the
> contract with he arch code and is defined never to return an error. Some
> implementations might have code enabled in debug configs to detect incorrect
> usage and emit warnings (see arm64's implementation).

I'm saying that 'validate' implies to me that you assess whether the value is
correct and behave differently accordingly. It's something of a pedantic point,
but perhaps 'check' is better here.

>
> >
> >>
> >> Fixes: 3e9a9e256b1e ("mm: add a vmap_pfn function")
> >
> > Not sure if this is really 'fixing' anything, I mean ostensibly, but not
> > sure if the tag is relevant here, that is more so for a bug being
> > introduced, and unless an issue has arisen not sure if it's
> > appropriate. But this might be a nit, again!
>
> Well I'm happy to remove it if that's the concensus. But I do believe there is a
> real bug here. At least on arm64, the barriers are needed to prevent a race with
> the page table walker. That said, the only place in the tree I can see
> vmap_pfn() used, is in the i915 driver, which I guess has never been used on an
> arm64 platform.

Yeah, again this might be a little too nitty! And I totally understand where
you're coming from, I do agree this is appears to be an issue and your solution
is right, it just feels less like an obvious 'bug' and more of an oversight. But
I am being pedantic, and am not overly worried if you retain it :)

>
> >
> >> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
> >> ---
> >> mm/vmalloc.c | 5 ++++-
> >> 1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> >> index 9683573f1225..d8d2fe797c55 100644
> >> --- a/mm/vmalloc.c
> >> +++ b/mm/vmalloc.c
> >> @@ -2899,10 +2899,13 @@ struct vmap_pfn_data {
> >> static int vmap_pfn_apply(pte_t *pte, unsigned long addr, void *private)
> >> {
> >> struct vmap_pfn_data *data = private;
> >> + pte_t ptent;
> >>
> >> if (WARN_ON_ONCE(pfn_valid(data->pfns[data->idx])))
> >> return -EINVAL;
> >> - *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
> >> +
> >> + ptent = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
> >> + set_pte_at(&init_mm, addr, pte, ptent);>
> > While we're refactoring, it'd be nice to stash data->pfns[data->idx] into a
> > local pfn variable.
>
> OK, I'll do this for v2.

Thanks!

>
> Thanks,
> Ryan
>
>
> >
> >> return 0;
> >> }
> >>
> >> --
> >> 2.25.1
> >>
>

Sorry to get into the weeds here a bit, overall I think this patch is fine, I
would like Christoph to take a look given it's his code however.