Re: [RFC PATCH 0/5] Enable use of Solid State Hybrid Drives

From: Jens Axboe
Date: Wed Oct 29 2014 - 17:11:11 EST


On 10/29/2014 02:14 PM, Dave Chinner wrote:
> On Wed, Oct 29, 2014 at 11:23:38AM -0700, Jason B. Akers wrote:
>> The following series enables the use of Solid State Hybrid Drives.
>> ATA standard 3.2 defines the hybrid information feature, which
>> provides a means for the host driver to provide hints to the SSHDs
>> to guide what to place on the SSD/NAND portion and what to place on
>> the magnetic media.
>>
>> This implementation allows user space applications to provide the cache hints to the kernel using the existing ionice syscall.
>>
>> An application can pass a priority number coding up bits 11, 12, and 15 of the ionice command to form a 3 bit field that encodes the following priorities:
>> IOPRIO_ADV_NONE,
>> IOPRIO_ADV_EVICT, /* actively discard cached data */
>> IOPRIO_ADV_DONTNEED, /* caching this data has little value */
>> IOPRIO_ADV_NORMAL, /* best-effort cache priority (default) */
>> IOPRIO_ADV_RESERVED1, /* reserved for future use */
>> IOPRIO_ADV_RESERVED2,
>> IOPRIO_ADV_RESERVED3,
>> IOPRIO_ADV_WILLNEED, /* high temporal locality */
>>
>> For example, the following commands from user space cause dd's I/O
>> to be issued with a hint of IOPRIO_ADV_DONTNEED, assuming the SSHD
>> is /dev/sdc.
>>
>> ionice -c2 -n4096 dd if=/dev/zero of=/dev/sdc bs=1M count=1024
>> ionice -c2 -n4096 dd if=/dev/sdc of=/dev/null bs=1M count=1024
>
> This looks to be the wrong way to implement per-IO priority
> information.
>
> How does a filesystem make use of this to make sure its
> metadata ends up with IOPRIO_ADV_WILLNEED to store frequently
> accessed metadata in flash? Conversely, journal writes need to
> be issued with IOPRIO_ADV_DONTNEED so they don't unnecessarily
> consume flash space as they are never-read IOs...

Not disagreeing that loading more into the io priority fields is a
bit... icky. I see why it's done, though, it requires the least amount
of plumbing.

As for the fs accessing this, the io nice fields are readily exposed
through the ->bi_rw setting. So while the above example uses ionice to
set a task io priority (that a bio will then inherit), nothing prevents
you from passing it in directly from the kernel.

--
Jens Axboe
