RE: [PATCH 2/4] drivers/clocksource/hyper-v: Introduce TSC MSR register structure

From: Michael Kelley (LINUX)
Date: Wed Nov 02 2022 - 17:27:23 EST


From: Stanislav Kinsburskiy <stanislav.kinsburskiy@xxxxxxxxx> Sent: Wednesday, November 2, 2022 1:58 PM

> ср, 2 нояб. 2022 г. в 13:30, Michael Kelley (LINUX) <mailto:mikelley@xxxxxxxxxxxxx>:
> From: Stanislav Kinsburskiy <mailto:stanislav.kinsburskiy@xxxxxxxxx>  Sent: Wednesday, November 2, 2022 12:26 PM
>
> > > It makes sense to have the tsc_page global variable so that we can
> > > handle the root partition and guest partition cases with common code,
> > > even though the TSC page memory originates differently in the two cases.
> > >
> > > But do we also need a tsc_pfn global variable and getter function?  When
> > > the PFN is needed, conversion from the tsc_page virtual address to the PFN
> > > isn't hard, and such a conversion is needed in only a couple of places.  To me,
> > > it's simpler to keep a single global variable and getter function (i.e.,
> > > hv_get_tsc_page), and do the conversions where needed.   Adding tsc_pfn
> > > and the getter function introduces a fair amount of code churn for not much
> > > benefit.  It's a judgment call, but that's my $.02.
> >
> > As I replied to Anirudh , AFAIK virt_to_phys doesn't work for remapped pages.
> > Another option would be to read the MSR each time PFN has to be returned to
> > the vvar mapping function (i.e. on every fork), which introduces unnecessary
> > performance regression..
> > Another modification would be to make pfn a static variable and initialize it
> > once in hv_get_tsc_pfn() on first access. But I like this implementation less.

> > Valid point about virt_to_phys().  But virt_to_hvpfn() does the right thing.  It
> > distinguishes between kernel addresses in the main linear mapping and
> > vmalloc() addresses, which is what you get from memremap().  But I haven't
> > looked through all the places virt_to_hvpfn() would be needed to make sure
> > it's safe to use.
>
> Yeah, I guess virt_to_hvpfn() will do.
> But I'm not sure it the current code should be reworked to use it: it would save only a
> few lines of code, but will remove the explicit distinguishment between root and guest
> partitions, currently reflected in the code.
> Please, let me know if you insist on reworking the series to use virt_to_hvpfn().

Your call. I'm OK with leaving things "as is" due to the additional complexity
of dealing with the vmalloc() address that comes from memremap().
 
> > However, thinking about virt_to_hvpfn(), there's a problem with Anirudh's
> > earlier patch set that started using __phys_to_pfn().  That won't work correctly
> > if the guest page size is not 4K because it will return a PFN based on the guest
> > page size, not based on Hyper-V's expectation that the PFN is based on a
> > 4K page size.  So that needs to be fixed either way.

> Could you elaborate more on the problem? 
 
The key is to recognize that PFNs are inherently interpreted in the context
of the page size. Consider Guest Physical Address (GPA) 0x12340000.
If the page size is 4096, the PFN is 0x12340. If the page size is 64K, the PFN
is 0x1234. Hyper-V is always expecting PFNs in the context of a page size
of 4096. But Linux guests running on Hyper-V on ARM64 could have a
guest page size of 64K, even though Hyper-V itself is using a page size
of 4096. For my example, in an ARM64 VM with 64K page size,
__phys_to_pfn(0x12340000) would return 0x1234. If that value is
stored in the PFN field of the MSR, Hyper-V will think the GPA is
0x1234000 when it should be 0x12340000.

Michael