Re: PING [PATCH 3/3] x86: decouple pat and mtrr handling

From: Chuck Zmudzinski
Date: Sat Aug 13 2022 - 12:56:55 EST


On 7/19/22 11:15 AM, Borislav Petkov wrote:
> On Fri, Jul 15, 2022 at 04:25:49PM +0200, Juergen Gross wrote:
> > Today PAT is usable only with MTRR being active, with some nasty tweaks
> > to make PAT usable when running as Xen PV guest, which doesn't support
> > MTRR.
> >
> > The reason for this coupling is, that both, PAT MSR changes and MTRR
> > changes, require a similar sequence and so full PAT support was added
> > using the already available MTRR handling.
> >
> > Xen PV PAT handling can work without MTRR, as it just needs to consume
> > the PAT MSR setting done by the hypervisor without the ability and need
> > to change it. This in turn has resulted in a convoluted initialization
> > sequence and wrong decisions regarding cache mode availability due to
> > misguiding PAT availability flags.
> >
> > Fix all of that by allowing to use PAT without MTRR and by adding an
> > environment dependent PAT init function.
>
> Aha, there's the explanation I was looking for.
>
> > diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> > index 0a1bd14f7966..3edfb779dab5 100644
> > --- a/arch/x86/kernel/cpu/common.c
> > +++ b/arch/x86/kernel/cpu/common.c
> > @@ -2408,8 +2408,8 @@ void __init cache_bp_init(void)
> > {
> > if (IS_ENABLED(CONFIG_MTRR))
> > mtrr_bp_init();
> > - else
> > - pat_disable("PAT support disabled because CONFIG_MTRR is disabled in the kernel.");
> > +
> > + pat_cpu_init();
> > }
> >
> > void cache_ap_init(void)
> > @@ -2417,7 +2417,8 @@ void cache_ap_init(void)
> > if (cache_aps_delayed_init)
> > return;
> >
> > - mtrr_ap_init();
> > + if (!mtrr_ap_init())
> > + pat_ap_init_nomtrr();
> > }
>
> So I'm reading this as: if it couldn't init AP's MTRRs, init its PAT.
>
> But currently, the code sets the MTRRs for the delayed case or when the
> CPU is not online by doing ->set_all and in there it sets first MTRRs
> and then PAT.
>
> I think the code above should simply try the two things, one after the
> other, independently from one another.
>
> And I see you've added another stomp machine call for PAT only.
>
> Now, what I think the design of all this should be, is:
>
> you have a bunch of things you need to do at each point:
>
> * cache_ap_init
>
> * cache_aps_init
>
> * ...
>
> Now, in each those, you look at whether PAT or MTRR is supported and you
> do only those which are supported.
>
> Also, the rendezvous handler should do:
>
> if MTRR:
> do MTRR specific stuff
>
> if PAT:
> do PAT specific stuff
>
> This way you have clean definitions of what needs to happen when and you
> also do *only* the things that the platform supports, by keeping the
> proper order of operations - I believe MTRRs first and then PAT.
>
> This way we'll get rid of that crazy maze of who calls what and when.
>
> But first we need to define those points where stuff needs to happen and
> then for each point define what stuff needs to happen.
>
> How does that sound?
>
> Thx.
>

Hi Thorsten,

IMHO, silence here is unacceptable given that this is supposed to
be fixing a regression and not just adding a new feature or
re-working the code in a case where there is no regression.

The regression was first reported on May 4, 2022, now over
three months ago:

https://lore.kernel.org/regressions/YnHK1Z3o99eMXsVK@mail-itl/

Why has Juergen not at least responded in some way to the
comments that Boris has made here? Why has Boris not
pinged Juergen by now, which is almost four weeks after his
comment and over three months from the first report of the
regression? IMHO, both Juergen and Boris are not treating
this with the priority of a regression fix. At the very least,
they should reaffirm their commitment to fix the regression
in a timely manner or explain what factors demand that the
Linus regression rule be set aside in this case.

There are valid reasons to delay a fix, but in all the discussion
of the various patches that have been proposed to fix this
regression, no maintainer has yet given a clear and reasonable
explanation for why this is not getting a higher priority from the
developers.

Some developers, (Dave, Luto, and Peter) have ignored a fix
proposed by Jan Beulich as you pointed out in an earlier message
to Jan Beulich here:

https://lore.kernel.org/lkml/4c8c9d4c-1c6b-8e9f-fa47-918a64898a28@xxxxxxxxxxxxx/

To his credit, Jan Beulich replied to your message in a reasonable
manner but he also could not explain why Dave, Luto, and Peter
ignored Jan's patch and remain silent in the discussion of the possible
fixes for this regression. I note also that the original report of the
regression identified a specific commit that also fixes the regression
if that bad commit is reverted, and that commit is also mentioned in
the aforementioned message to Jan about his proposed fix. It is a commit
that lives in the i915 Intel GPU/DRM driver, commit bdd8b6c98239, and
my testing confirms the regression can also be fixed by reverting
bdd8b6c98239 instead of applying Jan Beulich's patch that was the
subject of the aforementioned message from you to Jan Beulich where
you also expressed your dissatisfaction with the silence of some
developers (Dave, Lotu, and Peter) when there is a regression that needs
fixing.

Why their silence? In that same message, you pondered that it might
be necessary to bring this matter to Linus' attention. The developers'
silence makes me think this regression is a regression the developers
do not want to fix. And that would be a clear violation of the Linux
regression rule if it were true. So, Thorsten, I think it is time for you to
elevate this to Linus if the developers do not clearly explain why they
are ignoring this again.

Best regards,

Chuck