Re: A logical error in arch/x86/mm/init.c

From: Dave Hansen
Date: Thu Feb 03 2022 - 12:27:14 EST


On 2/3/22 02:30, Nikita Popov wrote:
> It appears that there is a logical error in arch/x86/mm/init.c in the
> master branch. Although it is unlikely to occur in practice. I
> discovered it while studying source code in that file.

I looked at this a bit. It seems to have originated in:

> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=c9b3234a6aba

which isn't the best changelog in the history of the world. It's also
fixing a boot problem in a configuration that I can't readily reproduce
(Xen PV guest).

There's one thing from the old changelog that's confusing me:

> But after we get back that page for pgt, it tiggers one bug in pgt allocation
> with xen: We need to avoid to use page as pgt to map range that is
> overlapping with that pgt page.

and presumably alluding to the same issue from your mail:

> ... which can incur MMU issues if that page is allocated as a page
> directory)

What are these "MMU issues"?

> In my opinion one of the simplest fixes here is to completely remove
> the following lines:
> if (!ret && can_use_brk_pgt)
> ret = __pa(extend_brk(PAGE_SIZE * num, PAGE_SIZE));

This _might_ be right. But, my confidence that it won't break anything
else is pretty low. It's also obviously not been tested.

I'd be much more confident if this issue was reproduced, even if the
reproduction was contrived by doing something like purposefully
exhausting the pgt_buf_* area.

If you really feel that this is something that needs to be fixed, I'd
appreciate if you could find some way to reproduce it and then send a
proper patch.