Re: [PATCH tip/core/rcu 02/20] x86: Use common outgoing-CPU-notification code

From: Boris Ostrovsky
Date: Tue Mar 03 2015 - 15:15:30 EST


On 03/03/2015 02:42 PM, Paul E. McKenney wrote:
On Tue, Mar 03, 2015 at 02:17:24PM -0500, Boris Ostrovsky wrote:
On 03/03/2015 12:42 PM, Paul E. McKenney wrote:
}
@@ -511,7 +508,8 @@ static void xen_cpu_die(unsigned int cpu)
 		schedule_timeout(HZ/10);
 	}
-	cpu_die_common(cpu);
+	(void)cpu_wait_death(cpu, 5);
+	/* FIXME: Are the below calls really safe in case of timeout? */


Not for HVM guests (PV guests will only reach this point after
target cpu has been marked as down by the hypervisor).

We need at least to have a message similar to what native_cpu_die()
prints on cpu_wait_death() failure. And I think we should not call
the two routines below (three, actually --- there is also
xen_teardown_timer() below, which is not part of the diff).

-boris


 	xen_smp_intr_free(cpu);
 	xen_uninit_lock_cpu(cpu);

So something like this, then?

	if (cpu_wait_death(cpu, 5)) {
		xen_smp_intr_free(cpu);
		xen_uninit_lock_cpu(cpu);
		xen_teardown_timer(cpu);
	}

	else
		pr_err("CPU %u didn't die...\n", cpu);



Easy change for me to make if so!

Or do I need some other check for HVM-vs.-PV guests, and, if so, what
would that check be? And also if so, is it OK to online a PV guest's
CPU that timed out during its previous offline?


I believe PV VCPUs will always be CPU_DEAD by the time we get here since we are (indirectly) waiting for this in the loop at the beginning of xen_cpu_die():

'while (xen_pv_domain() && HYPERVISOR_vcpu_op(VCPUOP_is_up, cpu, NULL))' will exit only after 'HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id()' in xen_play_dead(), which happens after play_dead_common() has marked the cpu as CPU_DEAD.

So no test is needed.
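
For reference, the control flow discussed in this thread (tear down the per-CPU Xen resources only when cpu_wait_death() reports success, otherwise just print an error) can be modeled in plain userspace C. This is only a sketch under stated assumptions: the model_* names, the model_cpus[] array, and the bool return value of model_xen_cpu_die() are made up for illustration and are not kernel APIs. The real xen_cpu_die() returns void, and the real helpers are the ones named in the patch.

```c
/* Userspace model only: every function below is a stand-in, not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_MODEL_CPUS 4

/* Hypothetical per-CPU state used by this model. */
struct model_cpu {
	bool dead;            /* would cpu_wait_death() see this CPU as dead? */
	bool resources_freed; /* have the irqs, lock and timer been torn down? */
};

static struct model_cpu model_cpus[NR_MODEL_CPUS];

/* Stand-in for cpu_wait_death(): true if the CPU died before the timeout. */
static bool model_cpu_wait_death(unsigned int cpu)
{
	assert(cpu < NR_MODEL_CPUS);
	return model_cpus[cpu].dead;
}

/*
 * Stand-in for the three teardown calls from the thread:
 * xen_smp_intr_free(), xen_uninit_lock_cpu() and xen_teardown_timer().
 */
static void model_teardown(unsigned int cpu)
{
	model_cpus[cpu].resources_freed = true;
}

/*
 * Model of the proposed xen_cpu_die() tail. The return value (true if the
 * teardown ran) exists only so the behavior can be checked from a caller.
 */
static bool model_xen_cpu_die(unsigned int cpu)
{
	if (model_cpu_wait_death(cpu)) {
		model_teardown(cpu);
		return true;
	}
	/* Timeout: leave the resources alone and only report the failure. */
	fprintf(stderr, "CPU %u didn't die...\n", cpu);
	return false;
}
```

In this model, a CPU whose dead flag is set gets its resources freed, while a CPU that timed out is left untouched and only produces the error message. That matches the point made above for HVM guests; for PV guests the timeout branch should never be taken, since the VCPUOP_is_up loop already guarantees the CPU has been marked CPU_DEAD.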

Thanks.
-boris

