Re: [PATCH V15 00/22] mmc: Add Command Queue support

From: Ulf Hansson
Date: Wed Nov 29 2017 - 10:47:20 EST


Hi Adrian,

On 29 November 2017 at 14:40, Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
> Hi
>
> Here is V15 of the hardware command queue patches without the software
> command queue patches, now using blk-mq and now with blk-mq support for
> non-CQE I/O.

I have applied patches 1->19 for next. Deferring patches 21->23 for a while.

For those patches that were more or less the same as in v14, I added Linus' ack.

Hopefully we get some help from the community to test this series on
different HW (and I will be checking kernelci's boot reports). I
haven't added Bartlomiej's tested-by, nor Linus' (because of
the changes that have been made), so I am hoping that will happen sooner
or later.

Moreover, I will gladly add more people's acks/reviewed-by and
tested-by tags at any point during this release cycle.

Thanks and kind regards
Uffe

>
> V14 included a number of fixes to existing code, changed the default to
> blk-mq, and added patches to remove legacy code.
>
> HW CMDQ offers 25% - 50% better random multi-threaded I/O. I see a slight
> 2% drop in sequential read speed but no change to sequential write.
>
> Non-CQE blk-mq showed a 3% decrease in sequential read performance. This
> seemed to be coming from the inferior latency of running work items compared
> with a dedicated thread. Hacking blk-mq workqueue to be unbound reduced the
> performance degradation from 3% to 1%.
>
> While we should look at changing blk-mq to give better workqueue performance,
> a bigger gain is likely to be made by adding a new host API to enable the
> next already-prepared request to be issued directly from within ->done()
> callback of the current request.
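
To make the idea in the paragraph above concrete, here is a minimal
user-space sketch of issuing the next already-prepared request from the
completion path instead of bouncing through a work item. Every name in it
(sketch_req, sketch_host, sketch_issue(), sketch_done()) is hypothetical
and chosen for illustration only; it is not the host API being proposed
and not code from this series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in the MMC core. */
struct sketch_req {
	int id;
	int prepared;             /* non-zero once pre-processing is finished */
	struct sketch_req *next;  /* request queued behind this one, if any */
};

struct sketch_host {
	struct sketch_req *in_flight;  /* request currently being transferred */
	int issued_from_done;          /* next request started in the done path */
	int deferred_to_worker;        /* next request left for a work item */
};

/* Start a request; the model allows only one request in flight. */
static void sketch_issue(struct sketch_host *host, struct sketch_req *req)
{
	assert(host->in_flight == NULL);
	host->in_flight = req;
}

/*
 * Completion path for the current request. If the next request is
 * already prepared it is issued right here, avoiding the latency of
 * scheduling a work item; otherwise it is left for the worker.
 * Returns the request that just completed.
 */
static struct sketch_req *sketch_done(struct sketch_host *host)
{
	struct sketch_req *done = host->in_flight;
	struct sketch_req *next = done ? done->next : NULL;

	host->in_flight = NULL;
	if (next && next->prepared) {
		sketch_issue(host, next);
		host->issued_from_done++;
	} else if (next) {
		host->deferred_to_worker++;
	}
	return done;
}
```

With three chained requests where only the first two are prepared, the
first completion starts the second request directly, and the second
completion leaves the third for the worker.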
>
> Changes since V14:
>   mmc: block: Fix missing blk_put_request()
>   mmc: block: Check return value of blk_get_request()
>   mmc: core: Do not leave the block driver in a suspended state
>   mmc: block: Ensure that debugfs files are removed
>     Dropped because they have been applied
>   mmc: block: Use data timeout in card_busy_detect()
>     Replaced by other patches
>   mmc: block: Add blk-mq support
>     Rename mmc_blk_ss_read() to mmc_blk_read_single()
>     Add more error handling to single sector read
>     Let mmc_blk_mq_complete_rq() cater for requests already "updated" by recovery
>     Rename mmc_blk_mq_acct_req_done() to mmc_blk_mq_dec_in_flight()
>     Add comments about synchronization
>     Add comment about not dispatching in parallel
>     Add comment about the queue depth
>   mmc: block: Add CQE support
>     Add comment about CQE queue depth
>   mmc: block: blk-mq: Add support for direct completion
>     Rename mmc_queue_direct_complete() to mmc_host_done_complete()
>     Rename MMC_CAP_DIRECT_COMPLETE to MMC_CAP_DONE_COMPLETE
>   mmc: block: blk-mq: Separate card polling from recovery
>     Ensure to report gen_err as an error
>   mmc: block: Make card_busy_detect() accumulate all response error bits
>     Patch moved later in the patch set and adjusted accordingly
>   mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
>     Adjusted due to patch re-ordering
>   mmc: block: Check the timeout correctly in card_busy_detect()
>     New patch.
>   mmc: block: Add timeout_clks when calculating timeout
>     New patch.
>   mmc: block: Reduce polling timeout from 10 minutes to 10 seconds
>     New patch.
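
As an illustration of what "Add timeout_clks when calculating timeout"
is about, here is a small stand-alone sketch of turning a data timeout
given as nanoseconds plus a number of clock cycles into milliseconds.
The function names and parameter layout are assumptions made for this
example, not the code in the patch; the only facts relied on are that a
clock in kHz ticks that many times per millisecond, and that a timeout
should be rounded up rather than down.

```c
/* Hypothetical helper: integer division that rounds up, without overflow. */
static unsigned int sketch_div_round_up(unsigned int a, unsigned int b)
{
	return a / b + (a % b != 0);
}

/*
 * Hypothetical sketch: total data timeout in milliseconds.
 *
 * timeout_ns   - fixed part of the timeout, in nanoseconds
 * timeout_clks - additional part of the timeout, in bus clock cycles
 * clock_khz    - bus clock in kHz (cycles per millisecond)
 *
 * The cycle-based part is skipped if the clock rate is unknown (0),
 * so the function never divides by zero.
 */
static unsigned int sketch_data_timeout_ms(unsigned int timeout_ns,
					   unsigned int timeout_clks,
					   unsigned int clock_khz)
{
	unsigned int ms = sketch_div_round_up(timeout_ns, 1000000u);

	if (timeout_clks && clock_khz)
		ms += sketch_div_round_up(timeout_clks, clock_khz);

	return ms;
}
```

For example, 100000000 ns plus 52000 cycles at 52000 kHz comes out as
100 ms + 1 ms; leaving out the cycle-based part would under-estimate the
timeout by that last millisecond.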
>
> Changes since V13:
>   mmc: block: Fix missing blk_put_request()
>     New patch.
>   mmc: block: Check return value of blk_get_request()
>     New patch.
>   mmc: core: Do not leave the block driver in a suspended state
>     New patch.
>   mmc: block: Ensure that debugfs files are removed
>     New patch.
>   mmc: block: No need to export mmc_cleanup_queue()
>     New patch.
>   mmc: block: Simplify cleaning up the queue
>     New patch.
>   mmc: block: Use data timeout in card_busy_detect()
>     New patch.
>   mmc: block: Check for transfer state in card_busy_detect()
>     New patch.
>   mmc: block: Make card_busy_detect() accumulate all response error bits
>     New patch.
>   mmc: core: Make mmc_pre_req() and mmc_post_req() available
>     New patch.
>   mmc: core: Add parameter use_blk_mq
>     Default to y
>   mmc: block: Add blk-mq support
>     Wrap blk_mq_end_request / blk_end_request_all
>     Rename mmc_blk_rw_recovery -> mmc_blk_mq_rw_recovery
>     Additional parentheses to '==' expressions
>     Use mmc_pre_req() / mmc_post_req()
>     Fix missing tuning release on error after mmc_start_request()
>     Expand comment about timeouts
>     Allow for possibility that the queue is quiesced when removing
>     Ensure complete_work is flushed when removing
>   mmc: block: Add CQE support
>     Additional parentheses to '==' expressions
>   mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
>     Replaces patch "Stop using card_busy_detect()" retaining card_busy_detect()
>   mmc: block: blk-mq: Stop using legacy recovery
>     Allow for SPI
>   mmc: mmc_test: Do not use mmc_start_areq() anymore
>     New patch.
>   mmc: core: Remove option not to use blk-mq
>     New patch.
>   mmc: block: Remove code no longer needed after the switch to blk-mq
>     New patch.
>   mmc: core: Remove code no longer needed after the switch to blk-mq
>     New patch.
>
> Changes since V12:
>   mmc: block: Add error-handling comments
>     New patch.
>   mmc: block: Add blk-mq support
>     Use legacy error handling
>   mmc: block: Add CQE support
>     Re-base
>   mmc: block: blk-mq: Add support for direct completion
>     New patch.
>   mmc: block: blk-mq: Separate card polling from recovery
>     New patch.
>   mmc: block: blk-mq: Stop using card_busy_detect()
>     New patch.
>   mmc: block: blk-mq: Stop using legacy recovery
>     New patch.
>
> Changes since V11:
> Split "mmc: block: Add CQE and blk-mq support" into 2 patches
>
> Changes since V10:
>   mmc: core: Remove unnecessary host claim
>   mmc: core: Introduce host claiming by context
>   mmc: core: Add support for handling CQE requests
>   mmc: mmc: Enable Command Queuing
>   mmc: mmc: Enable CQE's
>   mmc: block: Use local variables in mmc_blk_data_prep()
>   mmc: block: Prepare CQE data
>   mmc: block: Factor out mmc_setup_queue()
>   mmc: core: Add parameter use_blk_mq
>   mmc: core: Export mmc_start_bkops()
>   mmc: core: Export mmc_start_request()
>   mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
>     Dropped because they have been applied
>   mmc: block: Add CQE and blk-mq support
>     Extend blk-mq support for asynchronous read / writes to all host
>     controllers including those that require polling. The direct
>     completion path is still available but depends on a new capability
>     flag.
>     Drop blk-mq support for synchronous read / writes.
>
> Venkat Gopalakrishnan (1):
>   mmc: cqhci: support for command queue enabled host
>
> Changes since V9:
>   mmc: block: Add CQE and blk-mq support
>     - reinstate mq support for REQ_OP_DRV_IN/OUT that was removed because
>       it was incorrectly assumed to be handled by the rpmb character device
>     - don't check for rpmb block device anymore
>   mmc: cqhci: support for command queue enabled host
>     Fix cqhci_set_irqs() as per Haibo Chen
>
> Changes since V8:
>   Re-based
>   mmc: core: Introduce host claiming by context
>     Slightly simplified as per Ulf
>   mmc: core: Export mmc_retune_hold_now() and mmc_retune_release()
>     New patch.
>   mmc: block: Add CQE and blk-mq support
>     Fix missing ->post_req() on the error path
>
> Changes since V7:
>   Re-based
>   mmc: core: Introduce host claiming by context
>     Slightly simplified
>   mmc: core: Add parameter use_blk_mq
>     New patch.
>   mmc: core: Remove unnecessary host claim
>     New patch.
>   mmc: core: Export mmc_start_bkops()
>     New patch.
>   mmc: core: Export mmc_start_request()
>     New patch.
>   mmc: block: Add CQE and blk-mq support
>     Add blk-mq support for non-CQE requests
>
> Changes since V6:
>   mmc: core: Introduce host claiming by context
>     New patch.
>   mmc: core: Move mmc_start_areq() declaration
>     Dropped because it has been applied
>   mmc: block: Fix block status codes
>     Dropped because it has been applied
>   mmc: host: Add CQE interface
>     Dropped because it has been applied
>   mmc: core: Turn off CQE before sending commands
>     Dropped because it has been applied
>   mmc: block: Factor out mmc_setup_queue()
>     New patch.
>   mmc: block: Add CQE support
>     Drop legacy support and add blk-mq support
>
> Changes since V5:
>   Re-based
>   mmc: core: Add mmc_retune_hold_now()
>     Dropped because it has been applied
>   mmc: core: Add members to mmc_request and mmc_data for CQE's
>     Dropped because it has been applied
>   mmc: core: Move mmc_start_areq() declaration
>     New patch at Ulf's request
>   mmc: block: Fix block status codes
>     Another un-related patch
>   mmc: host: Add CQE interface
>     Move recovery_notifier() callback to struct mmc_request
>   mmc: core: Add support for handling CQE requests
>     Roll __mmc_cqe_request_done() into mmc_cqe_request_done()
>     Move function declarations requested by Ulf
>   mmc: core: Remove unused MMC_CAP2_PACKED_CMD
>     Dropped because it has been applied
>   mmc: block: Add CQE support
>     Add explanation to commit message
>     Adjustment for changed recovery_notifier() callback
>   mmc: cqhci: support for command queue enabled host
>     Adjustment for changed recovery_notifier() callback
>   mmc: sdhci-pci: Add CQHCI support for Intel GLK
>     Add DCMD capability for Intel controllers except GLK
>
> Changes since V4:
>   mmc: core: Add mmc_retune_hold_now()
>     Add explanation to commit message.
>   mmc: host: Add CQE interface
>     Add comments to callback declarations.
>   mmc: core: Turn off CQE before sending commands
>     Add explanation to commit message.
>   mmc: core: Add support for handling CQE requests
>     Add comments as requested by Ulf.
>   mmc: core: Remove unused MMC_CAP2_PACKED_CMD
>     New patch.
>   mmc: mmc: Enable Command Queuing
>     Adjust for removal of MMC_CAP2_PACKED_CMD.
>     Add a comment about Packed Commands.
>   mmc: mmc: Enable CQE's
>     Remove un-necessary check for MMC_CAP2_CQE
>   mmc: block: Use local variables in mmc_blk_data_prep()
>     New patch.
>   mmc: block: Prepare CQE data
>     Adjust due to "mmc: block: Use local variables in mmc_blk_data_prep()"
>     Remove priority setting.
>     Add explanation to commit message.
>   mmc: cqhci: support for command queue enabled host
>     Fix transfer descriptor setting in cqhci_set_tran_desc() for 32-bit DMA
>
> Changes since V3:
> Adjusted ...blk_end_request...() for new block status codes
> Fixed CQHCI transaction descriptor for "no DCMD" case
>
> Changes since V2:
> Dropped patches that have been applied.
> Re-based
> Added "mmc: sdhci-pci: Add CQHCI support for Intel GLK"
>
> Changes since V1:
>
> "Share mmc request array between partitions" is dependent
> on changes in "Introduce queue semantics", so added that
> and block fixes:
>
> Added "Fix is_waiting_last_req set incorrectly"
> Added "Fix cmd error reset failure path"
> Added "Use local var for mqrq_cur"
> Added "Introduce queue semantics"
>
> Changes since RFC:
>
> Re-based on next.
> Added comment about command queue priority.
> Added some acks and reviews.
>
>
> Adrian Hunter (21):
>   mmc: block: No need to export mmc_cleanup_queue()
>   mmc: block: Simplify cleaning up the queue
>   mmc: core: Make mmc_pre_req() and mmc_post_req() available
>   mmc: block: Add error-handling comments
>   mmc: core: Add parameter use_blk_mq
>   mmc: block: Add blk-mq support
>   mmc: block: Add CQE support
>   mmc: sdhci-pci: Add CQHCI support for Intel GLK
>   mmc: block: blk-mq: Add support for direct completion
>   mmc: block: blk-mq: Separate card polling from recovery
>   mmc: block: Make card_busy_detect() accumulate all response error bits
>   mmc: block: blk-mq: Check error bits and save the exception bit when polling card busy
>   mmc: block: Check the timeout correctly in card_busy_detect()
>   mmc: block: Check for transfer state in card_busy_detect()
>   mmc: block: Add timeout_clks when calculating timeout
>   mmc: block: Reduce polling timeout from 10 minutes to 10 seconds
>   mmc: block: blk-mq: Stop using legacy recovery
>   mmc: mmc_test: Do not use mmc_start_areq() anymore
>   mmc: core: Remove option not to use blk-mq
>   mmc: block: Remove code no longer needed after the switch to blk-mq
>   mmc: core: Remove code no longer needed after the switch to blk-mq
>
> Venkat Gopalakrishnan (1):
>   mmc: cqhci: support for command queue enabled host
>
>  drivers/mmc/core/block.c          | 1383 +++++++++++++++++++++----------------
>  drivers/mmc/core/block.h          |   12 +-
>  drivers/mmc/core/bus.c            |    2 -
>  drivers/mmc/core/core.c           |  216 +-----
>  drivers/mmc/core/core.h           |   39 +-
>  drivers/mmc/core/host.h           |    6 +-
>  drivers/mmc/core/mmc_test.c       |  122 ++--
>  drivers/mmc/core/queue.c          |  504 +++++++++-----
>  drivers/mmc/core/queue.h          |   64 +-
>  drivers/mmc/host/Kconfig          |   14 +
>  drivers/mmc/host/Makefile         |    1 +
>  drivers/mmc/host/cqhci.c          | 1150 ++++++++++++++++++++++++++++++
>  drivers/mmc/host/cqhci.h          |  240 +++++++
>  drivers/mmc/host/sdhci-pci-core.c |  155 ++++-
>  include/linux/mmc/host.h          |    5 +-
>  15 files changed, 2835 insertions(+), 1078 deletions(-)
>  create mode 100644 drivers/mmc/host/cqhci.c
>  create mode 100644 drivers/mmc/host/cqhci.h
>
>
> Regards
> Adrian