Re: [PATCH blktests v2 2/2] nvme-rc: Cleanup fc resource before module unloading

From: Sagi Grimberg
Date: Mon May 01 2023 - 10:11:01 EST




On 4/30/23 13:34, Shinichiro Kawasaki wrote:
On Apr 19, 2023 / 13:45, Sagi Grimberg wrote:

Before we unload the module we should cleanup the fc resources first,
basically reorder the shutdown sequence to be in reverse order of the
setup path.

If this triggers a bug, then I think it is a good idea to have a
dedicated test that reproduces it if we are changing the default
behavior.

Right, though I would like to tackle one problem after the other, first get fc
working with the 'correct' order.

While at it also update the rdma stop_soft_rdma before the module
unloading for the same reasoning.

Why? it creates the wrong reverse ordering.

1. setup soft-rdma
2. setup nvme-rdma

2. teardown nvme-rdma
1. teardown soft-rdma

I don't think we need this change. I mean it is a good test
to have that the rdma device goes away underneath nvme-rdma
but it is good for a dedicated test.

I agree that the new test case is good.


I was woried about this setup sequence here:

modprobe -q nvme-"${nvme_trtype}"
if [[ "${nvme_trtype}" == "rdma" ]]; then
start_soft_rdma

The module is loaded before start_soft_rdma is started, thus I thought we should
do the reverse, first call stop_soft_rdma and the unload the module.

They should be unrelated. the safe route is to first remove the uld and
then the device.

Sagi, this comment above was not clear for me. Is Daniel's patch ok for you?

IMO, it is reasonable to "do clean-up in reverse order as setup" as a general
guide. It will reduce the chance to see module related failures when the test
cases do not expect such failures. Instead, we can have dedicated test cases for
the module load/unload order related failures. start_soft_rdma and
stop_soft_rdma do module load and unload. So I think the guide is good for those
helper functions also.

As I mentioned here, this change exercises a code path in the driver
that is a surprise unplug of the rdma device. It is equivalent to
triggering a surprise removal of the pci device normally during
nvme-pci test teardown. While this is worth testing, I'm not sure we
want the default behavior to do that, but rather add dedicated tests for
it.

Hence, my suggestion was to leave nvme-rdma as is.