Re: [PATCH v4 2/3] x86/xen/time: setup vcpu 0 time info page

From: Boris Ostrovsky
Date: Wed Sep 27 2017 - 16:24:06 EST


On 09/27/2017 11:26 AM, Joao Martins wrote:
> On 09/27/2017 03:40 PM, Boris Ostrovsky wrote:
>>> +static void xen_setup_vsyscall_time_info(void)
>>> +{
>>> +	struct vcpu_register_time_memory_area t;
>>> +	struct pvclock_vsyscall_time_info *ti;
>>> +	struct pvclock_vcpu_time_info *pvti;
>>> +	int ret;
>>> +
>>> +	pvti = &__this_cpu_read(xen_vcpu)->time;
>>> +
>>> +	/*
>>> +	 * We check ahead on the primary time info if this
>>> +	 * bit is supported hence speeding up Xen clocksource.
>>> +	 */
>>> +	if (!(pvti->flags & PVCLOCK_TSC_STABLE_BIT))
>>> +		return;
>>> +
>>> +	pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
>> Is it OK to have this flag set if anything below fails?
>>
> Yes - if anything below fails it will only affect userspace mapped page.

Then should it be set somewhere else, like in xen_time_init()?
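
Roughly what I have in mind, as a user-space mock rather than kernel code.
The mock_* names, the can_register parameter and the -38 error value are
made up for illustration; only the ordering matters. The flag is derived
from the primary copy in the init path, and the secondary-page setup can
then fail without touching it:

```c
/* User-space mock, NOT kernel code: every mock_* helper is hypothetical. */
#include <stdbool.h>
#include <stdint.h>

#define PVCLOCK_TSC_STABLE_BIT (1 << 0)

/* Stand-in for the flags recorded by pvclock_set_flags(). */
static uint8_t valid_flags;

/* Stand-in for xen_clocksource.archdata.vclock_mode. */
enum vclock_mode { VCLOCK_NONE, VCLOCK_PVCLOCK };
static enum vclock_mode vclock_mode = VCLOCK_NONE;

static void mock_pvclock_set_flags(uint8_t flags)
{
	valid_flags = flags;
}

/* Pretend hypercall: 0 on success, an arbitrary negative value on failure. */
static int mock_register_secondary(bool can_register)
{
	return can_register ? 0 : -38;
}

/* Only deals with the secondary (vdso) copy; never touches valid_flags. */
static void mock_setup_vsyscall_time_info(uint8_t primary_flags,
					  bool can_register)
{
	if (!(primary_flags & PVCLOCK_TSC_STABLE_BIT))
		return;

	if (mock_register_secondary(can_register))
		return;

	vclock_mode = VCLOCK_PVCLOCK;
}

/* The flag is set here, based on the primary copy alone. */
static void mock_time_init(uint8_t primary_flags, bool can_register)
{
	if (primary_flags & PVCLOCK_TSC_STABLE_BIT)
		mock_pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);

	mock_setup_vsyscall_time_info(primary_flags, can_register);
}
```

With this split, a failed registration leaves the stable bit set for the
clocksource while the vdso mode stays at its default, which is the
behaviour you describe above.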

-boris

> What I
> do above is just allowing xen clocksource to use/check that bit (consequently
> speeding up sched_clock) given the necessary support is there in the master
> copy. The secondary copy (i.e. what's being set up below, mapped/used in vdso)
> has the same data from the master copy, just separate memory regions. The checks
> below are just for the unlikely cases of failing to register the secondary copy
> or if its content were to differ from master copy in future releases - and
> therefore we handle those more gracefully.
>
>> (I can see in the changelog that apparently at some point I've asked
>> about this at v1 but I can't remember/find what exactly it was)
>>
>>> +
>>> +	ti = (struct pvclock_vsyscall_time_info *)get_zeroed_page(GFP_KERNEL);
>>> +	if (!ti)
>>> +		return;
>>> +
>>> +	t.addr.v = &ti->pvti;
>>> +
>>> +	ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area, 0, &t);
>>> +	if (ret) {
>>> +		pr_notice("xen: VCLOCK_PVCLOCK not supported (err %d)\n", ret);
>>> +		free_page((unsigned long)ti);
>>> +		return;
>>> +	}
>>> +
>>> +	/*
>>> +	 * If the check above succeeded this one should too since it's the
>>> +	 * same data on both primary and secondary time infos just different
>>> +	 * memory regions. But we still check it in case hypervisor is buggy.
>>> +	 */
>>> +	pvti = &ti->pvti;
>>> +	if (!(pvti->flags & PVCLOCK_TSC_STABLE_BIT)) {
>>> +		t.addr.v = NULL;
>>> +		ret = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_time_memory_area,
>>> +					 0, &t);
>>> +		if (!ret)
>>> +			free_page((unsigned long)ti);
>>> +
>>> +		pr_notice("xen: VCLOCK_PVCLOCK not supported (tsc unstable)\n");
>>> +		return;
>>> +	}
>>> +
>>> +	xen_clock = ti;
>>> +	pvclock_set_pvti_cpu0_va(xen_clock);
>>> +
>>> +	xen_clocksource.archdata.vclock_mode = VCLOCK_PVCLOCK;
>>> +}
>>> +