Re: [PATCH] md: Call blk_queue_flush() to establish flush/fua support

From: Neil Brown
Date: Mon Nov 22 2010 - 18:50:19 EST


On Mon, 22 Nov 2010 15:22:08 -0800
"Darrick J. Wong" <djwong@xxxxxxxxxx> wrote:

> Before 2.6.37, the md layer had a mechanism for catching I/Os with the barrier
> flag set, and translating the barrier into barriers for all the underlying
> devices. With 2.6.37, I/O barriers have become plain old flushes, and the md
> code was updated to reflect this. However, one piece was left out -- the md
> layer does not tell the block layer that it supports flushes or FUA access at
> all, which results in md silently dropping flush requests.
>
> Since the support already seems there, just add this one piece of bookkeeping
> to restore the ability to flush writes through md.

I would rather just unconditionally call
blk_queue_flush(mddev->queue, REQ_FLUSH | REQ_FUA);

I don't think there is much to be gained by trying to track exactly what the
underlying devices support, and as the devices can change, that is racy
anyway.

Thoughts?

NeilBrown




>
> Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx>
> ---
>
> drivers/md/md.c | 25 ++++++++++++++++++++++++-
> 1 files changed, 24 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 324a366..a52d7be 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -356,6 +356,21 @@ EXPORT_SYMBOL(mddev_congested);
>  /*
>   * Generic flush handling for md
>   */
> +static void evaluate_flush_capability(mddev_t *mddev)
> +{
> +	mdk_rdev_t *rdev;
> +	unsigned int flush = REQ_FLUSH | REQ_FUA;
> +
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(rdev, &mddev->disks, same_set) {
> +		if (rdev->raid_disk < 0)
> +			continue;
> +		flush &= rdev->bdev->bd_disk->queue->flush_flags;
> +	}
> +	rcu_read_unlock();
> +
> +	blk_queue_flush(mddev->queue, flush);
> +}
>
>  static void md_end_flush(struct bio *bio, int err)
>  {
> @@ -1885,6 +1900,8 @@ static int bind_rdev_to_array(mdk_rdev_t * rdev, mddev_t * mddev)
>  	/* May as well allow recovery to be retried once */
>  	mddev->recovery_disabled = 0;
>
> +	evaluate_flush_capability(mddev);
> +
>  	return 0;
>
>  fail:
> @@ -1903,17 +1920,23 @@ static void md_delayed_delete(struct work_struct *ws)
>  static void unbind_rdev_from_array(mdk_rdev_t * rdev)
>  {
>  	char b[BDEVNAME_SIZE];
> +	mddev_t *mddev;
> +
>  	if (!rdev->mddev) {
>  		MD_BUG();
>  		return;
>  	}
> -	bd_release_from_disk(rdev->bdev, rdev->mddev->gendisk);
> +	mddev = rdev->mddev;
> +	bd_release_from_disk(rdev->bdev, mddev->gendisk);
>  	list_del_rcu(&rdev->same_set);
>  	printk(KERN_INFO "md: unbind<%s>\n", bdevname(rdev->bdev,b));
>  	rdev->mddev = NULL;
>  	sysfs_remove_link(&rdev->kobj, "block");
>  	sysfs_put(rdev->sysfs_state);
>  	rdev->sysfs_state = NULL;
> +
> +	evaluate_flush_capability(mddev);
> +
>  	/* We need to delay this, otherwise we can deadlock when
>  	 * writing to 'remove' to "dev/state". We also need
>  	 * to delay it due to rcu usage.
