Re: [RESEND PATCH v1 1/5] mm: vmalloc must set pte via arch code

From: Ryan Roberts
Date: Mon May 15 2023 - 08:14:48 EST


On 15/05/2023 12:25, Lorenzo Stoakes wrote:
> On Mon, May 15, 2023 at 09:29:16AM +0100, Ryan Roberts wrote:
>> Hi Lorenzo,
>>
>> Thanks for the review - I appreciate it!
>>
>>
>> On 13/05/2023 14:14, Lorenzo Stoakes wrote:
>>> You've not cc'd the vmalloc reviewers, including the author of 3e9a9e256b1e
>>> whose patch you purport to fix. Please remember to run get_maintainers.pl
>>> on all files you patch and cc them at least on relevant patches.
>>>
>>> Have added Christoph + Uladzislau as cc.
>>
>> I did run get_maintainers.pl, but it gave me 82 names. I assumed I wouldn't be
>> making any friends by CCing everyone, so tried to choose what I thought was a
>> sensible base. I guess I didn't quite get it right. Sorry about that. Thanks for
>> noticing and adding the right people.
>
> Right you mean across the whole of the patch set? Different people have
> different approaches as to how to cc patch sets as a whole, but it's not
> optional to include maintainers and reviewers on patches, so you should at least
> cc- them on individual patches.
>
> It's ok, it's really easy to mess this up, I have managed every variant of doing
> this the wrong way myself... :)

Well I look forward to tripping over all the other variants in due course. ;-)

>
>>
>>>
>>> You'll definitely want an ack from Christoph on this!
>>>
>>> On Thu, May 11, 2023 at 02:21:09PM +0100, Ryan Roberts wrote:
>>>> It is bad practice to directly set pte entries within a pte table.
>>>> Instead all modifications must go through arch-provided helpers such as
>>>> set_pte_at() to give the arch code visibility and allow it to validate
>>>> (and potentially modify) the operation.
>>>
>>> This does make sense, and I see for example in xtensa that an arch-specific
>>> instruction is issued under certain circumstances so I do suspect we should
>>> do this.
>>
>> arm64 provides another example, where barriers are required to ensure the page
>> table walker sees the new pte and no fault is raised. See
>> arch/arm64/include/asm/pgtable.h:set_pte() (which is called by its
>> implementation of set_pte_at()).
>
> Ack, yeah I do think your patch is correct.
>
>>
>>>
>>> As for validation, the function never indicates an error, so only in the
>>> sense that a WARN_ON() could _in theory_ trigger is it being
>>> validated. This might be quite a nitty point :) as set_pte_at() has no
>>> means of indicating an error. But maybe to be pedantic 'check' rather than
>>> 'validate'?
>>
>> I'm sorry, I'm not sure what you are asking here? set_pte_at() forms part of the
>> contract with he arch code and is defined never to return an error. Some
>> implementations might have code enabled in debug configs to detect incorrect
>> usage and emit warnings (see arm64's implementation).
>
> I'm saying that 'validate' implies to me that you assess whether the value is
> correct and behave differently accordingly. It's something of a pedantic point,
> but perhaps 'check' is better here.

Ahh, you were critiqing the commit message, sorry totally missed that. I'll
change 'validate' to 'check' in v2.

>
>>
>>>
>>>>
>>>> Fixes: 3e9a9e256b1e ("mm: add a vmap_pfn function")
>>>
>>> Not sure if this is really 'fixing' anything, I mean ostensibly, but not
>>> sure if the tag is relevant here, that is more so for a bug being
>>> introduced, and unless an issue has arisen not sure if it's
>>> appropriate. But this might be a nit, again!
>>
>> Well I'm happy to remove it if that's the concensus. But I do believe there is a
>> real bug here. At least on arm64, the barriers are needed to prevent a race with
>> the page table walker. That said, the only place in the tree I can see
>> vmap_pfn() used, is in the i915 driver, which I guess has never been used on an
>> arm64 platform.
>
> Yeah, again this might be a little too nitty! And I totally understand where
> you're coming from, I do agree this is appears to be an issue and your solution
> is right, it just feels less like an obvious 'bug' and more of an oversight. But
> I am being pedantic, and am not overly worried if you retain it :)

OK, I'm going to retain it.

>
>>
>>>
>>>> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
>>>> ---
>>>> mm/vmalloc.c | 5 ++++-
>>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>>> index 9683573f1225..d8d2fe797c55 100644
>>>> --- a/mm/vmalloc.c
>>>> +++ b/mm/vmalloc.c
>>>> @@ -2899,10 +2899,13 @@ struct vmap_pfn_data {
>>>> static int vmap_pfn_apply(pte_t *pte, unsigned long addr, void *private)
>>>> {
>>>> struct vmap_pfn_data *data = private;
>>>> + pte_t ptent;
>>>>
>>>> if (WARN_ON_ONCE(pfn_valid(data->pfns[data->idx])))
>>>> return -EINVAL;
>>>> - *pte = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
>>>> +
>>>> + ptent = pte_mkspecial(pfn_pte(data->pfns[data->idx++], data->prot));
>>>> + set_pte_at(&init_mm, addr, pte, ptent);>
>>> While we're refactoring, it'd be nice to stash data->pfns[data->idx] into a
>>> local pfn variable.
>>
>> OK, I'll do this for v2.
>
> Thanks!
>
>>
>> Thanks,
>> Ryan
>>
>>
>>>
>>>> return 0;
>>>> }
>>>>
>>>> --
>>>> 2.25.1
>>>>
>>
>
> Sorry to get into the weeds here a bit, overall I think this patch is fine, I
> would like Christoph to take a look given it's his code however.

No problem; I'm new here, so just having someone taking the time to respond with
specific feedback, is a win as far as I'm concerned!

Thanks,
Ryan