Re: [RFC 2/2] x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader

From: Marcelo Tosatti
Date: Tue Jan 06 2015 - 15:21:22 EST


On Tue, Jan 06, 2015 at 11:49:09AM -0800, Andy Lutomirski wrote:
> > What is the point with the new flags bit though?
>
> To try to work around the problem on old hosts. I'm not at all
> convinced that this is worthwhile or that it helps, though.

Don't think so. Just fix the host bug.

> >> Also, if you do this, can you also make setting and clearing
> >> STABLE_BIT properly atomic across all vCPUs? Or at least do something
> >> like setting it last and clearing it first on vCPU 0?
> >
> > If the version "seqlock" works properly across vCPUs, why do you need
> > STABLE_BIT "properly atomic" ?
> >
> > Please define what you mean by "properly atomic".
> >
>
> I'd like to be able to rely on using vCPU 0's pvti even from other vCPUs
> in the vdso if the stable bit is set. That means that the host should
> avoid doing things like migrating the guest, clearing the stable bit
> for vCPU 1, resuming vCPU 1, and waiting long enough to clear the
> stable bit for vCPU 0 that vCPU 1's vdso code could see invalid data
> and return a bad timestamp.
>
> Maybe this scenario is impossible, but getting rid of any getcpu-like
> operation in the vdso has really nice benefits.

You can park every vCPU in the host while updating vCPU 0's timestamp.

See kvm_gen_update_masterclock:

+	/* no guest entries from this point */
+	pvclock_update_vm_gtod_copy(kvm);

-	touch guest memory

+	/* guest entries allowed */
+	kvm_for_each_vcpu(i, vcpu, kvm)
+		clear_bit(KVM_REQ_MCLOCK_INPROGRESS, &vcpu->requests);
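
For reference, here is a minimal userspace sketch of the version "seqlock"
protocol discussed earlier in this thread. It is not the kernel code: the
struct, the function names and PVTI_STABLE_BIT are hypothetical stand-ins, and
C11 atomics replace the kernel's barriers. The point it illustrates is that
the flags are published under the same version counter as the timestamp, so a
reader that retries on a version change sees a stable bit that matches the
data it read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define PVTI_STABLE_BIT 0x1u	/* hypothetical stand-in for the stable flag */

/* Simplified, hypothetical stand-in for one vCPU's pvclock time info. */
struct pvti_sketch {
	atomic_uint      version;	/* odd while an update is in progress */
	_Atomic uint64_t system_time;	/* payload guarded by version */
	_Atomic uint8_t  flags;		/* payload guarded by version */
};

static void pvti_init(struct pvti_sketch *p)
{
	atomic_init(&p->version, 0);
	atomic_init(&p->system_time, 0);
	atomic_init(&p->flags, 0);
}

/* Writer (host side): make version odd, update payload, make it even. */
static void pvti_update(struct pvti_sketch *p, uint64_t now, uint8_t flags)
{
	unsigned int v = atomic_load_explicit(&p->version, memory_order_relaxed);

	assert((v & 1) == 0);	/* single writer, no update already running */
	atomic_store_explicit(&p->version, v + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&p->system_time, now, memory_order_relaxed);
	atomic_store_explicit(&p->flags, flags, memory_order_relaxed);
	atomic_store_explicit(&p->version, v + 2, memory_order_release);
}

/*
 * Reader (guest side): retry until the version is even and unchanged.
 * Stores the time in *time_out and returns 1 if the stable bit was set
 * in the same snapshot, 0 otherwise.
 */
static int pvti_read(struct pvti_sketch *p, uint64_t *time_out)
{
	unsigned int v1, v2;
	uint64_t t;
	uint8_t f;

	do {
		v1 = atomic_load_explicit(&p->version, memory_order_acquire);
		t = atomic_load_explicit(&p->system_time, memory_order_relaxed);
		f = atomic_load_explicit(&p->flags, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		v2 = atomic_load_explicit(&p->version, memory_order_relaxed);
	} while ((v1 & 1) || v1 != v2);

	*time_out = t;
	return (f & PVTI_STABLE_BIT) != 0;
}
```

The sketch assumes a single writer per structure. It says nothing about
ordering between different vCPUs' structures, which is what parking the vCPUs
in the host is meant to cover.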

> It's faster and it
> lets us guarantee that the vdso's pvti data fits in a single page.
> The latter means that we can easily make it work like the hpet
> mapping, which gets us 32-bit support and will *finally* let us turn
> off user access to the fixmap if vsyscall=none.
>
> (We can, of course, still do this if the pvti data needs to be an
> array, but it's messier.)
>
> --Andy