Re: [PATCH] scsi: storvsc: Fix a race in sub-channel creation that can cause panic

From: Sasha Levin
Date: Mon Nov 26 2018 - 00:39:27 EST


On Mon, Nov 26, 2018 at 12:26:17AM +0000, kys@xxxxxxxxxxxxxxxxx wrote:
From: Dexuan Cui <decui@xxxxxxxxxxxxx>

We can concurrently try to open the same sub-channel from 2 paths:

path #1: vmbus_onoffer() -> vmbus_process_offer() -> handle_sc_creation().
path #2: storvsc_probe() -> storvsc_connect_to_vsp() ->
-> storvsc_channel_init() -> handle_multichannel_storage() ->
-> vmbus_are_subchannels_present() -> handle_sc_creation().

They conflict with each other, but it was not an issue before the recent
commit ae6935ed7d42 ("vmbus: split ring buffer allocation from open"),
because at the beginning of vmbus_open() we checked newchannel->state so
only one path could succeed, and the other would return with -EINVAL.

After ae6935ed7d42, the failing path frees the channel's ringbuffer by
vmbus_free_ring(), and this causes a panic later.

Commit ae6935ed7d42 itself is good, and it just reveals the longstanding
race. We can resolve the issue by removing path #2, i.e. removing the
second vmbus_are_subchannels_present() in handle_multichannel_storage().

BTW, the comment "Check to see if sub-channels have already been created"
in handle_multichannel_storage() is incorrect: when we unload the driver,
we first close the sub-channel(s) and then close the primary channel, next
the host sends rescind-offer message(s) so primary->sc_list will become
empty. This means the first vmbus_are_subchannels_present() in
handle_multichannel_storage() is never useful.

Fixes: ae6935ed7d42 ("vmbus: split ring buffer allocation from open")
Cc: stable@xxxxxxxxxxxxxxx
Cc: Long Li <longli@xxxxxxxxxxxxx>
Cc: Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>
Cc: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>
Cc: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
Signed-off-by: K. Y. Srinivasan <kys@xxxxxxxxxxxxx>

Just a heads-up: ae6935ed7d42 ("vmbus: split ring buffer allocation from
open") was merged in the 4.20 merge window, so this fix won't actually
apply to any of the current stable trees.

However, it's good to have tags (fixes + cc: stable) here since this fix
might end up (for whatever reason) getting merged only for 4.21, which
will then make these tags relevant.

--
Thanks,
Sasha