Re: [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback

From: Tianyu Lan
Date: Thu Mar 19 2020 - 04:24:23 EST


On 3/18/2020 1:35 AM, Wei Liu wrote:
On Tue, Mar 17, 2020 at 06:25:20AM -0700, ltykernel@xxxxxxxxx wrote:
From: Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx>

A customer reported that a Hyper-V VM still responded to network traffic
with ack packets after a kernel panic with the kernel parameter "panic=0".
This is because the vmbus driver's interrupt handler still works
on the panic cpu after the kernel panic. The panic cpu falls into the
infinite loop of panic() with interrupts enabled at that point, so the
vmbus driver can still handle network traffic.

This misleads remote services into believing that the panicked system
is still alive when they receive ack packets. Unload the vmbus channel
in the hv panic callback to fix it.

vmbus_initiate_unload() may be called twice during the panic process
(e.g., from hyperv_panic_event() and hv_crash_handler()). So check
and set the connection state in vmbus_initiate_unload() to resolve
the re-entry issue.

Signed-off-by: Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx>
---
drivers/hv/channel_mgmt.c | 5 +++++
drivers/hv/vmbus_drv.c | 17 +++++++++--------
2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 0370364169c4..893493f2b420 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -839,6 +839,9 @@ void vmbus_initiate_unload(bool crash)
{
struct vmbus_channel_message_header hdr;
+ if (vmbus_connection.conn_state == DISCONNECTED)
+ return;
+
/* Pre-Win2012R2 hosts don't support reconnect */
if (vmbus_proto_version < VERSION_WIN8_1)
return;
@@ -857,6 +860,8 @@ void vmbus_initiate_unload(bool crash)
wait_for_completion(&vmbus_connection.unload_event);
else
vmbus_wait_for_unload();
+
+ vmbus_connection.conn_state = DISCONNECTED;

This is only set at the end of the function. I don't see how this solves
the re-entrancy issue with the check at the beginning. Am I missing
anything here?


For this issue, vmbus_initiate_unload() may be called twice on the panic
vcpu, so I just split the check and the set of conn_state.

Maybe the function should check the state and set it to
DISCONNECTING/DISCONNECTED at its beginning?

Yes, Vitaly also suggested using "xchg" to check and set
conn_state. I will update this in the next version.
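
For illustration only, the atomic check-and-set discussed above could look
roughly like the userspace sketch below. It uses C11 atomic_exchange() as a
stand-in for the kernel's xchg(). The names conn_state, unload_count,
initiate_unload_once() and demo_double_unload() are hypothetical and are not
taken from the patch or from any later version of it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical connection states, loosely modelled on the thread. */
enum { DISCONNECTED, CONNECTED };

/* Hypothetical stand-in for vmbus_connection.conn_state. */
static atomic_int conn_state = CONNECTED;

/* Counts how often the unload body actually ran. */
static int unload_count;

/*
 * Check and set the state in a single atomic step at function entry.
 * Only the caller that observes a state other than DISCONNECTED
 * goes on to perform the unload; any later caller returns early.
 */
static bool initiate_unload_once(void)
{
	if (atomic_exchange(&conn_state, DISCONNECTED) == DISCONNECTED)
		return false;

	unload_count++;	/* placeholder for the real unload work */
	return true;
}

/* Usage: a second call, as on a panic path that unloads twice, is a no-op. */
static void demo_double_unload(void)
{
	assert(initiate_unload_once());
	assert(!initiate_unload_once());
	assert(unload_count == 1);
}
```

Because the exchange happens before any unload work, a second entry sees
DISCONNECTED even if the first call has not finished yet, which is the gap
in the check-at-start, set-at-end version.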