Re: [PATCH v6 8/8] kvm: vmx: virtualize split lock detection

From: Xiaoyao Li
Date: Tue Mar 24 2020 - 21:12:07 EST


On 3/25/2020 8:40 AM, Thomas Gleixner wrote:
Xiaoyao Li <xiaoyao.li@xxxxxxxxx> writes:
#ifdef CONFIG_CPU_SUP_INTEL
+enum split_lock_detect_state {
+ sld_off = 0,
+ sld_warn,
+ sld_fatal,
+};
+extern enum split_lock_detect_state sld_state __ro_after_init;
+
+static inline bool split_lock_detect_on(void)
+{
+ return sld_state != sld_off;
+}

See previous reply.

+void sld_msr_set(bool on)
+{
+ sld_update_msr(on);
+}
+EXPORT_SYMBOL_GPL(sld_msr_set);
+
+void sld_turn_back_on(void)
+{
+ sld_update_msr(true);
+ clear_tsk_thread_flag(current, TIF_SLD);
+}
+EXPORT_SYMBOL_GPL(sld_turn_back_on);

First of all these functions want to be in a separate patch, but aside
of that they do not make any sense at all.

+static inline bool guest_cpu_split_lock_detect_on(struct vcpu_vmx *vmx)
+{
+ return vmx->msr_test_ctrl & MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+}
+
static int handle_exception_nmi(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -4725,12 +4746,13 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
case AC_VECTOR:
/*
* Reflect #AC to the guest if it's expecting the #AC, i.e. has
- * legacy alignment check enabled. Pre-check host split lock
- * support to avoid the VMREADs needed to check legacy #AC,
- * i.e. reflect the #AC if the only possible source is legacy
- * alignment checks.
+ * legacy alignment check enabled or split lock detect enabled.
+ * Pre-check host split lock support to avoid further check of
+ * guest, i.e. reflect the #AC if host doesn't enable split lock
+ * detection.
*/
if (!split_lock_detect_on() ||
+ guest_cpu_split_lock_detect_on(vmx) ||
guest_cpu_alignment_check_enabled(vcpu)) {

If the host has split lock detection disabled then how is the guest
supposed to have it enabled in the first place?

So we need to agree on whether we need a state in which the host turns split lock detection off but the feature can still be exposed to the guest.

@@ -6631,6 +6653,14 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
*/
x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0);
+ if (static_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
+ guest_cpu_split_lock_detect_on(vmx)) {
+ if (test_thread_flag(TIF_SLD))
+ sld_turn_back_on();

This is completely inconsistent behaviour. The only way that TIF_SLD is
set is when the host has sld_state == sld_warn and the guest triggered
a split lock #AC.

Can you imagine the case where both host and guest set sld_state == sld_warn?

1. A guest userspace thread causes a split lock.
2. The guest sets TIF_SLD for that thread and clears the SLD bit to re-execute the instruction.
3. The instruction still causes #AC, since the hardware SLD bit is not cleared. In host KVM, we call handle_user_split_lock(), which sets TIF_SLD for this VMM thread and clears the hardware SLD bit. Then it enters the guest and re-executes the instruction.
4. The guest schedules another thread that does not have TIF_SLD set, and sets the SLD bit to detect split locks for this thread. For this purpose we need to turn SLD back on for the VMM thread, otherwise this guest vcpu cannot catch split locks any more.
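
Steps 1-4 can be traced with a small toy model in plain C. This is only a sketch of the state transitions described above, not kernel code: struct sld_model and every model_* name are hypothetical, and the three booleans stand in for the hardware SLD bit, TIF_SLD on the host VMM thread, and the SLD bit in the guest's virtual MSR_TEST_CTRL.

```c
#include <stdbool.h>

/* Toy model only: hypothetical names, not kernel structures. */
struct sld_model {
	bool hw_sld;       /* SLD bit in the physical MSR_TEST_CTRL */
	bool host_tif_sld; /* TIF_SLD on the host VMM thread */
	bool guest_sld;    /* SLD bit in the guest's virtual MSR_TEST_CTRL */
};

enum model_ac { MODEL_AC_NONE, MODEL_AC_TO_GUEST, MODEL_AC_TO_HOST };

/* The guest executes a split-locked instruction. */
static enum model_ac model_split_lock(struct sld_model *m)
{
	if (!m->hw_sld)
		return MODEL_AC_NONE;     /* detection off in hardware: no #AC */
	if (m->guest_sld)
		return MODEL_AC_TO_GUEST; /* guest expects the #AC: reflect it */

	/* Host warn mode, standing in for handle_user_split_lock(). */
	m->host_tif_sld = true;
	m->hw_sld = false;
	return MODEL_AC_TO_HOST;
}

/* Step 2: the guest's warn handler clears its (virtual) SLD bit. */
static void model_guest_warn_handler(struct sld_model *m)
{
	m->guest_sld = false;
}

/* Step 4: the guest switches to a thread without TIF_SLD and sets SLD. */
static void model_guest_switch_to_clean_thread(struct sld_model *m)
{
	m->guest_sld = true;
}

/* The check done before VM entry, standing in for sld_turn_back_on(). */
static void model_vcpu_run(struct sld_model *m)
{
	if (m->guest_sld && m->host_tif_sld) {
		m->hw_sld = true;
		m->host_tif_sld = false;
	}
}
```

Without the model_vcpu_run() step, hw_sld stays false after step 3 and a later split lock in the second guest thread raises no #AC in this model, which is the situation step 4 is meant to avoid.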

'warn' means that the split lock event is registered and a printk
emitted and after that the task runs with split lock detection disabled.

It does not matter at all if the task triggered the #AC while in guest
or in host user space mode. Stop claiming that virt is special. The only
special thing about virt is, that it is using a different mechanism to
exit kernel mode. Aside of that from the kernel POV it is completely
irrelevant whether the task triggered the split lock in host user space
or in guest mode.

If the SLD mode is fatal, then the task is killed no matter what.
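
The policy described above can be restated as a small C sketch. It is an illustration under assumptions, not the kernel's implementation: the model_* names are hypothetical, and the in_guest parameter exists only to show that it does not influence the outcome.

```c
#include <stdbool.h>

/* Hypothetical names mirroring sld_off / sld_warn / sld_fatal. */
enum model_sld_mode { MODEL_SLD_OFF, MODEL_SLD_WARN, MODEL_SLD_FATAL };

enum model_sld_action {
	MODEL_ACT_NONE,             /* detection off: nothing to handle */
	MODEL_ACT_WARN_AND_DISABLE, /* log once, then run the task with SLD off */
	MODEL_ACT_KILL,             /* the task is killed */
};

/*
 * Decide what happens to a task that triggered a split lock #AC.
 * in_guest is deliberately ignored: the action depends on the mode only.
 */
static enum model_sld_action model_sld_policy(enum model_sld_mode mode,
					      bool in_guest)
{
	(void)in_guest;

	switch (mode) {
	case MODEL_SLD_WARN:
		return MODEL_ACT_WARN_AND_DISABLE;
	case MODEL_SLD_FATAL:
		return MODEL_ACT_KILL;
	default:
		return MODEL_ACT_NONE;
	}
}
```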

Please sit down and go through your patches and rethink every single
line instead of sending out yet another half baked and hastily cobbled
together pile.

To be clear, Patch 1 and 2 make sense on their own, so I'm tempted to
pick them up right now, but the rest is going to be 5.8 material no
matter what.

Alright.

Do you need me to spin a new version of patch 1 that clears the SLD bit on APs when sld_state is sld_off?