Re: A logical error in arch/x86/mm/init.c

From: Juergen Gross
Date: Fri Feb 04 2022 - 05:59:42 EST


On 04.02.22 06:35, Nikita Popov wrote:
Thank you for your attention.
If you really feel that this is something that needs to be fixed, I'd
appreciate if you could find some way to reproduce it and then send a
proper patch.
I believe this would be hard to reproduce.
I just noticed this discrepancy during manual code review.
I'm considering the following facts:
1) The area 'pgt_buf' is part of the 'brk' area defined in the linker
script. It is allocated in the function 'early_alloc_pgt_buf' using
the very same 'extend_brk'. The latter is essentially a stack-based
allocator picking its memory slices from the linker defined area.
2) The allocations from 'pgt_buf' are in the stack manner too.
One can expect that these two areas (one of which is completely
contained in the other) have the same properties in view of the direct
memory mapping.

Then there is the flag 'can_use_brk_pgt' which allows usage of the
pgt_buf area if a mapped range doesn't overlap with the free space of
the pgt_buf area. In the 'init_range_memory_mapping' function we can
observe that this flag doesn't reflect the relative position between a
mapped range and the free space of the brk area as a whole:
/*
* if it is overlapping with brk pgt, we need to
* alloc pgt buf from memblock instead.
*/
can_use_brk_pgt = max(start, (u64)pgt_buf_end<<PAGE_SHIFT) >=
min(end, (u64)pgt_buf_top<<PAGE_SHIFT);
This check is simply too narrow.

So for whatever reason this flag prohibits usage of the pgt_buf area,
I believe for the exact same reason we have to avoid using brk area if
the similar condition on the free space of the brk area holds.
This _might_ be right. But, my confidence that it won't break anything
else is pretty low. It's also obviously not been tested.
Yes, I agree here. I saw it as my duty to report the possible issue.
What are these "MMU issues"?
I tried to deduce the underlying reason beyond the code fragments in
question. I presumed that checking for overlap is protecting against
some MMU issues that could affect stability of the kernel.

I've done a local test with the can_use_brk_pgt tests in
alloc_low_pages() removed and couldn't trigger any bug when running as
Xen PV guest.

This could be due to commit 0167d7d8b0beb4cf1, or some parts of early
memory management initialization (maybe for Xen PV only) have changed
since the patch introducing can_use_brk_pgt was applied.


Juergen

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature
Description: OpenPGP digital signature