Re: [PATCH v1] arm64/mm: avoid race condition of update page table when kernel init

From: David Hildenbrand
Date: Fri Dec 03 2021 - 13:13:37 EST


On 03.12.21 18:44, Catalin Marinas wrote:
> On Thu, Oct 28, 2021 at 08:36:07AM +0100, Jianyong Wu wrote:
>> From Anshuman Khandual <anshuman.khandual@xxxxxxx>:
>>> On 10/27/21 3:18 PM, Jianyong Wu wrote:
>>>> Race condition of page table update can happen in kernel boot period
>>>> as both of memory hotplug action when kernel init and the
>>>> mark_rodata_ro can update page table. For virtio-mem, the function excute flow chart is:
>>>>
>>>> -------------------------
>>>> kernel_init
>>>> kernel_init_freeable
>>>> ...
>>>> do_initcall
>>>> ...
>>>> module_init [A]
>>>>
>>>> ...
>>>> mark_readonly
>>>> mark_rodata_ro [B]
>>>> -------------------------
> [...]
>>>> We can see that the error derived from the l3 translation as the pte
>>>> value is *0*. That is because the fixmap has been clear when access.
>>>>
>>>> Signed-off-by: Jianyong Wu <jianyong.wu@xxxxxxx>
>>>> ---
>>>> arch/arm64/mm/mmu.c | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index
>>>> cfd9deb347c3..567dfba8f08a 100644
>>>> --- a/arch/arm64/mm/mmu.c
>>>> +++ b/arch/arm64/mm/mmu.c
>>>> @@ -564,8 +564,10 @@ void mark_rodata_ro(void)
>>>> * to cover NOTES and EXCEPTION_TABLE.
>>>> */
>>>> section_size = (unsigned long)__init_begin - (unsigned long)__start_rodata;
>>>> + get_online_mems();
>>>> update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata,
>>>> section_size, PAGE_KERNEL_RO);
>>>> + put_online_mems();
>>>>
>>>> debug_checkwx();
>>>> }
>>>
>>> While this should solve the current problem i.e race between concurrent
>>> memory hotplug operation and mark_rodata_ro(), but I am still wondering
>>> whether this is the fix at the right place and granularity. Basically a hotplug
>>> operation queued in an work queue at [A] can execute during [B] is the root
>>> cause of this problem.
>>
>> Not exactly, this issue doesn't only happen at the the *pure* kernel
>> boot. For example, hotplug memory through VM monitor when VM boot. We
>> can't foresee when that happen. Thus, this issue can affect all kinds
>> of memory hotplug mechanism, including ACPI based memory hotplug and
>> virtio-mem. I'm not sure that fix it here is the best way. If the race
>> only happens between kernel init and memory hotplug, I think it's fine
>> to fix it here. IMO, this issue results from the race for "fixmap"
>> resource. I wonder why this global resource is not protected by a
>> lock. Maybe we can add one and fix it there.
>
> IIUC the race is caused by multiple attempts to use the fixmap at the
> same time. We can add a fixmap_lock and hold it during
> __create_pgd_mapping().
>

IIRC that's something along the lines I suggested, so, yes :)

--
Thanks,

David / dhildenb