Re: [PATCH v2 00/19] scsi: libsas: Remove in_interrupt() check

From: John Garry
Date: Thu Jan 14 2021 - 04:53:31 EST


On 12/01/2021 17:33, Ahmed S. Darwish wrote:
On Tue, Jan 12, 2021 at 04:00:57PM +0000, John Garry wrote:
...
I boot-tested on my machines which have hisi_sas v2 and v3 hw, and it's ok.
I will ask some guys to test a bit more.

Thanks a lot!

And generally the changes look ok. But I just have a slight concern that we
don't pass the gfp_flags all the way from the origin caller.

So we have some really long callchains, for example:

host.c: sci_controller_error_handler(): atomic, irq handler (*)
OR host.c: sci_controller_completion_handler(), atomic, tasklet (*)
-> sci_controller_process_completions()
-> sci_controller_unsolicited_frame()
-> phy.c: sci_phy_frame_handler()
-> sci_change_state(SCI_PHY_SUB_AWAIT_SAS_POWER)
-> sci_phy_starting_await_sas_power_substate_enter()
-> host.c: sci_controller_power_control_queue_insert()
-> phy.c: sci_phy_consume_power_handler()
-> sci_change_state(SCI_PHY_SUB_FINAL)
-> sci_change_state(SCI_PHY_SUB_FINAL)
-> sci_controller_event_completion()
-> phy.c: sci_phy_event_handler()
-> sci_phy_start_sata_link_training()
-> sci_change_state(SCI_PHY_SUB_AWAIT_SATA_POWER)
-> sci_phy_starting_await_sata_power_substate_enter
-> host.c: sci_controller_power_control_queue_insert()
-> phy.c: sci_phy_consume_power_handler()
-> sci_change_state(SCI_PHY_SUB_FINAL)

So if someone rearranges the code later, adds new callchains, etc., it could
be missed that the context may have changed than what we assume at the
bottom. But then passing the flags everywhere is cumbersome, and all the
libsas users see little or no significant changes anyway, apart from a
couple.

The deep call chains like the one you've quoted are all within the isci
Intel driver (patches #5 => #7), due to the*massive* state transitions
that driver has. But as the commit logs of these three patches show,
almost all of such transitions happened under atomic context anyway and
GFP_ATOMIC was thus used.

The GFP_KERNEL call-chains were all very simple: a workqueue, functions
already calling msleep() or wait_event_timeout() two or three lines
nearby, and so on.

All the other libsas clients (that is, except isci) also had normal call
chains that were IMHO easy to follow.

To me, the series looks fine. Well, the end result - I didn't go through patch by patch. So:

Reviewed-by: John Garry <john.garry@xxxxxxxxxx>

I'm still hoping some guys are testing a bit for me, but I'll let you know if any problem.

As an aside, your analysis showed some quite poor usage of spinlocks in some drivers, specifically grabbing a lock and then calling into a depth of 3 or 4 functions.

Thanks,
John