Re: [PATCH v3 1/2] x86/mm: Add an option to change the padding used for the physical memory mapping

From: Ingo Molnar
Date: Wed Sep 19 2018 - 08:48:14 EST



* Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> On Wed, 19 Sep 2018, Ingo Molnar wrote:
> > * Masayoshi Mizuma <msys.mizuma@xxxxxxxxx> wrote:
> >
> > > Ping...
> > > I would appreciate it if someone could review it because this patch
> > > fixes a real memory hotplug issue...
> >
> > Yeah, so I generally try to resist random new boot options that
> > work around real bugs, so please convince me that this patch
> > is the best option:
> >
> > >
> > > On Tue, Sep 04, 2018 at 11:11:40AM -0400, Masayoshi Mizuma wrote:
> > > > From: Masayoshi Mizuma <m.mizuma@xxxxxxxxxxxxxx>
> > > >
> > > > If each node of physical memory layout has huge space for hotplug,
> > > > the padding used for the physical memory mapping section is not enough.
> > > > For example, with this layout:
> > > > SRAT: Node 6 PXM 4 [mem 0x100000000000-0x13ffffffffff] hotplug
> > > > SRAT: Node 7 PXM 5 [mem 0x140000000000-0x17ffffffffff] hotplug
> > > > SRAT: Node 2 PXM 6 [mem 0x180000000000-0x1bffffffffff] hotplug
> > > > SRAT: Node 3 PXM 7 [mem 0x1c0000000000-0x1fffffffffff] hotplug
> > > >
> > > > We can increase the padding by CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING,
> > > > however, the needed padding size depends on the system environment.
> > > > The kernel option is better than changing the config.
> > > >
> > > > Change log from v2:
> > > > - Simplify the description. As Baoquan said, this is similar to the SGI UV issue,
> > > > but a little different. Remove the SGI UV description.
> >
> > Could you please explain it a bit better where the higher padding requirement comes from?
> >
> > 'system environment' is very opaque.
>
> As I understand it, it's depending on the actual physical characteristics
> of the machine. So setting a fixed value in Kconfig might work for one, but
> not for others, and a command line option allows tweaking that at
> boot time while keeping a common kernel image.
>
> Ideally we would calculate that from SRAT, but AFAICT SRAT is not available
> at the point where this needs to be done.

Yeah, so could we at least do something like this:

 - See whether using the maximum padding as the new default padding would work for everyone?
   A bit more virtual memory used, or are there other costs as well?

 - Add checking code to the later SRAT case to at least _detect_ bad padding after the fact.
   We don't utilize RAM with bad padding until then, right?

 - Add 'quirk' to the name of the boot parameter, to make it clear that this is really due to
   suboptimal communication between the firmware and the kernel.
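
To make the "detect bad padding after the fact" idea concrete, here is a minimal userspace sketch, not kernel code. It assumes the padding is counted in whole TB on top of the memory present at boot, and that the check only has to compare the highest hot-pluggable SRAT address against that. All names (struct hotplug_range, required_padding_tb(), padding_is_sufficient()) are hypothetical and exist only for this illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TB_SHIFT 40	/* 1 TB == 1ULL << 40 bytes */

/* Hypothetical stand-in for one hot-pluggable SRAT entry; 'end' is inclusive. */
struct hotplug_range {
	uint64_t start;
	uint64_t end;
};

/* Round a byte count up to whole terabytes. */
static uint64_t bytes_to_tb_round_up(uint64_t bytes)
{
	uint64_t mask = (1ULL << TB_SHIFT) - 1;

	return (bytes >> TB_SHIFT) + ((bytes & mask) ? 1 : 0);
}

/* Exclusive end of the highest hot-pluggable range, or 0 if there is none. */
static uint64_t max_hotplug_end(const struct hotplug_range *r, size_t n)
{
	uint64_t max_end = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Assumes r[i].end < UINT64_MAX, so the +1 cannot wrap. */
		if (r[i].end + 1 > max_end)
			max_end = r[i].end + 1;
	}
	return max_end;
}

/*
 * Padding, in TB, needed on top of the boot-time memory so that every
 * hot-pluggable range still fits into the physical memory mapping.
 */
static uint64_t required_padding_tb(uint64_t boot_mem_bytes,
				    const struct hotplug_range *r, size_t n)
{
	uint64_t boot_tb = bytes_to_tb_round_up(boot_mem_bytes);
	uint64_t need_tb = bytes_to_tb_round_up(max_hotplug_end(r, n));

	return need_tb > boot_tb ? need_tb - boot_tb : 0;
}

/* The after-the-fact check: is the padding chosen at boot large enough? */
static bool padding_is_sufficient(uint64_t boot_mem_bytes, uint64_t padding_tb,
				  const struct hotplug_range *r, size_t n)
{
	return required_padding_tb(boot_mem_bytes, r, n) <= padding_tb;
}
```

With the four SRAT ranges quoted above, the highest hot-pluggable address ends at 32 TB. Assuming, purely for illustration, 4 TB of RAM present at boot, the sketch reports 28 TB of required padding, so any smaller value would be flagged as bad.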

Hm?

Thanks,

Ingo