Re: [PATCH v2 04/13] KVM: x86: dynamic kvm_apic_map

From: Yang Zhang
Date: Mon Jul 11 2016 - 23:27:26 EST


On 2016/7/11 23:52, Radim KrÄmÃÅ wrote:
2016-07-11 16:14+0200, Paolo Bonzini:
On 11/07/2016 15:48, Radim KrÄmÃÅ wrote:
I guess the easiest solution is to replace kvm_apic_id with a field in
struct kvm_lapic, which is already shifted right by 24 in xAPIC mode.

(I guess the fewest LOC is to look at vcpu->vcpu_id, which is equal to
x2apic id. xapic id cannot be greater than 255 and all of those are
covered by the initial value of max_id.)

Yes, this would work too. Or even better perhaps, look at vcpu->vcpu_id
in kvm_apic_id?

APIC ID is writeable in xAPIC mode, which would make the implementation
weird without an extra variable. Always read-only APIC ID would be
best, IMO.

Or we can just simply put the assignment of apic_base to the end.

Yes, this would work, I'd also remove recalculates from
kvm_apic_set_*apic_id() and add a compiler barrier with comment for good
measure, even though set_virtual_x2apic_mode() serves as one.

Why a compiler barrier?

True, it should be a proper pair of smp_wmb() and smp_rmb() in
recalculate ... and current kvm_apic_id() reads in a wrong order, so
changing the apic_base alone update wouldn't get rid of this race.

(What makes a bit wary is that it doesn't avoid the same problem if we
changed KVM to reset apic id to xapic id first when disabling apic.)

Yes, this is why I prefer it fixed once and for all in kvm_apic_id...

Seems most reasonable. We'll need to be careful to have a correct value
in the apic page, but there shouldn't be any races there.

Yes, it is more reasonable.


Races in recalculation and APIC ID changes also lead to invalid physical
maps, which haven't been taken care of properly ...

Hmm, true, but can be fixed separately. Probably the mutex should be
renamed so that it can be taken outside recalculate_apic_map...

Good point, it'll make reasoning easier and shouldn't introduce any
extra scalability issues.

If we can ensure all the updates to LDR,DFR,ID and apic mode are in correct sequence and followed with apic map recalculation, it should be enough. It's guest's responsibility to ensure the apic updating must
happen in right time(means no interrupt is in flying), otherwise the interrupt may deliver to wrong VCPU.

--
best regards
yang