Re: [PATCH 4/8] Drivers: scsi: storvsc: Filter WRITE_SAME_16

From: Martin K. Petersen
Date: Wed Jul 16 2014 - 15:21:23 EST


>>>>> "Rob" == Elliott, Robert (Server Storage) <Elliott@xxxxxx> writes:

Rob> WRITE SAME with the UNMAP bit set to one (and a few other
Rob> conditions) guarantees that the data is zeroed out, while the UNMAP
Rob> command is just a hint. They're not fully interchangeable. Which
Rob> semantics are implied by REQ_DISCARD and these functions?

Voluntary deprovisioning of a block range.

Rob> One benefit of UNMAP is that UNMAP supports a list of discontiguous
Rob> LBA ranges, whereas WRITE SAME just supports one LBA range.
Rob> sd_setup_discard_cmnd is not taking advantage of this feature,
Rob> though.

The block layer can only describe one contiguous block range in a
request. My copy offload patches introduce the bi_special field that
allows us to attach additional information to an I/O. I have
experimented with doing that for discards to overcome the suck of DSM
TRIM. Splitting and merging requests in MD/DM gets much more cumbersome,
though.
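
For reference, the multi-range capability Rob mentions comes from the
UNMAP parameter list: an 8-byte header followed by one 16-byte block
descriptor per LBA range. Below is a minimal user-space sketch of
building such a list, assuming the SBC-3 layout. It is not the code in
sd_setup_discard_cmnd; struct lba_range, the put_be* helpers and
unmap_build_param_list are hypothetical names used only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UNMAP_HDR_LEN    8u    /* parameter list header */
#define UNMAP_DESC_LEN   16u   /* one block descriptor per range */
#define UNMAP_MAX_DESCS  4095u /* keeps both 16-bit length fields in range */

/* Hypothetical description of one range to deprovision. */
struct lba_range {
	uint64_t lba;
	uint32_t nr_blocks;
};

static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)v;
}

static void put_be32(uint8_t *p, uint32_t v)
{
	put_be16(p, (uint16_t)(v >> 16));
	put_be16(p + 2, (uint16_t)v);
}

static void put_be64(uint8_t *p, uint64_t v)
{
	put_be32(p, (uint32_t)(v >> 32));
	put_be32(p + 4, (uint32_t)v);
}

/*
 * Fill buf with an UNMAP parameter list describing n ranges.
 * Returns the number of bytes used, or 0 if the ranges do not fit.
 */
static size_t unmap_build_param_list(uint8_t *buf, size_t buf_len,
				     const struct lba_range *r, size_t n)
{
	size_t desc_bytes, total, i;

	assert(buf != NULL && r != NULL);

	if (n == 0 || n > UNMAP_MAX_DESCS)
		return 0;

	desc_bytes = n * UNMAP_DESC_LEN;
	total = UNMAP_HDR_LEN + desc_bytes;
	if (total > buf_len)
		return 0;

	memset(buf, 0, total);                   /* clears reserved fields */
	put_be16(&buf[0], (uint16_t)(total - 2)); /* UNMAP DATA LENGTH */
	put_be16(&buf[2], (uint16_t)desc_bytes);  /* BLOCK DESCRIPTOR DATA LENGTH */

	for (i = 0; i < n; i++) {
		uint8_t *d = buf + UNMAP_HDR_LEN + i * UNMAP_DESC_LEN;

		put_be64(d, r[i].lba);            /* starting LBA */
		put_be32(d + 8, r[i].nr_blocks);  /* number of logical blocks */
	}

	return total;
}
```

With a single descriptor the two length fields come out as 22 and 16.
A request that can only describe one contiguous range never fills more
than that first descriptor.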

Rob> Ideally, the block layer would merge multiple discards into one
Rob> UNMAP command if they're stuck on the request queue for a while,
Rob> like it merges adjacent reads and writes.

We support merging contiguous smaller discard requests.

We did discuss having a (separate) I/O scheduler to coalesce
discontiguous discard requests. However, the attempts at this turned out
to be pretty hideous.

It also wasn't evident that it was worth the hassle. While UNMAP allows
us to express large regions, DSM TRIM on the SATA side is limited to 32
MB per range. So in many cases we end up maxing out the payload capacity
even with a single contiguous range.
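
To make the arithmetic concrete, here is a rough user-space sketch, not
kernel code. It assumes 512-byte logical sectors and the DSM TRIM range
entry layout of a 48-bit starting LBA plus a 16-bit sector count, with
64 such 8-byte entries per 512-byte payload block. All macro and
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE             512u      /* assumed logical sector size */
#define TRIM_ENTRY_SIZE         8u        /* bytes per range entry */
#define TRIM_MAX_SECTORS        0xFFFFu   /* limit of the 16-bit count field */
#define TRIM_MAX_LBA            0xFFFFFFFFFFFFull /* 48-bit LBA field */
#define TRIM_ENTRIES_PER_BLOCK  (SECTOR_SIZE / TRIM_ENTRY_SIZE) /* 64 */

/* Range entries needed to cover nr_sectors contiguous sectors. */
static inline uint64_t trim_entries_needed(uint64_t nr_sectors)
{
	return nr_sectors / TRIM_MAX_SECTORS +
	       (nr_sectors % TRIM_MAX_SECTORS != 0);
}

/* Pack one entry: bits 47:0 hold the LBA, bits 63:48 the sector count. */
static inline uint64_t trim_pack_entry(uint64_t lba, uint16_t count)
{
	assert(lba <= TRIM_MAX_LBA);
	return lba | ((uint64_t)count << 48);
}

/* Most bytes a single 512-byte payload block can describe. */
static inline uint64_t trim_max_bytes_per_block(void)
{
	return (uint64_t)TRIM_MAX_SECTORS * SECTOR_SIZE *
	       TRIM_ENTRIES_PER_BLOCK;
}
```

Under these assumptions one entry covers 65535 * 512 = 33,553,920
bytes, just under 32 MiB, and a full 64-entry block covers just under
2 GiB. A single contiguous 2 GiB discard (4,194,304 sectors) therefore
already needs 65 entries, one more than a payload block holds.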

We expect LBP SCSI devices to queue commands. Being able to express
multiple ranges in one shot is less critical in that case.

--
Martin K. Petersen Oracle Linux Engineering