Re: [PATCH 2/7] Add support for per-file stream ID

From: Jens Axboe
Date: Sat Apr 18 2015 - 15:51:26 EST


On 04/09/2015 05:22 PM, Andreas Dilger wrote:
On Mar 25, 2015, at 9:07 AM, Jens Axboe <axboe@xxxxxx> wrote:

Writing on flash devices can be much more efficient if we can
inform the device what kind of data can be grouped together. If
the device is able to group data together with similar lifetimes,
then it can be more efficient in garbage collection. This, in turn,
leads to lower write amplification, which is a win on both device
wear and performance.

Add a new fadvise hint, POSIX_FADV_STREAMID, which sets the file
and inode streamid. The file streamid is used if we have the file
available at the time of the write (O_DIRECT); otherwise (buffered
writeback) we use the inode streamid. The fadvise hint uses the
'offset' field to specify a stream ID.
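
As a rough userspace illustration of the interface described above, the
sketch below passes a stream ID through the 'offset' argument of
posix_fadvise(). The helper name set_stream_id() is hypothetical, the
numeric value given to POSIX_FADV_STREAMID is only a placeholder (the real
value would come from the patched include/uapi/linux/fadvise.h), and
passing 0 for 'len' is an assumption. On a kernel without this series the
call is expected to fail, most likely with EINVAL.

```c
#include <fcntl.h>

/*
 * ASSUMPTION: placeholder value for illustration only. The real constant
 * would be supplied by the patched include/uapi/linux/fadvise.h.
 */
#ifndef POSIX_FADV_STREAMID
#define POSIX_FADV_STREAMID 8
#endif

/*
 * Hypothetical helper: tag an open file with a stream ID.
 *
 * Per the patch description, the stream ID travels in the 'offset'
 * argument. 'len' is passed as 0 here, which is an assumption.
 *
 * Returns 0 on success or an error number. posix_fadvise() returns the
 * error number directly rather than setting errno.
 */
static int set_stream_id(int fd, unsigned int stream_id)
{
    return posix_fadvise(fd, (off_t)stream_id, 0, POSIX_FADV_STREAMID);
}
```

A caller would open the file, call set_stream_id(fd, id) once, and treat a
nonzero return as "hint not supported" rather than as a fatal error, since
the hint only affects data placement.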

Signed-off-by: Jens Axboe <axboe@xxxxxx>
---
fs/inode.c | 1 +
fs/open.c | 1 +
include/linux/fs.h | 23 +++++++++++++++++++++++
include/uapi/linux/fadvise.h | 2 ++
mm/fadvise.c | 17 +++++++++++++++++
5 files changed, 44 insertions(+)

diff --git a/fs/inode.c b/fs/inode.c
index f00b16f45507..41885322ba64 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -149,6 +149,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
inode->i_blocks = 0;
inode->i_bytes = 0;
inode->i_generation = 0;
+ inode->i_streamid = 0;
inode->i_pipe = NULL;
inode->i_bdev = NULL;
inode->i_cdev = NULL;
diff --git a/fs/open.c b/fs/open.c
index 33f9cbf2610b..4a9b2be1a674 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -743,6 +743,7 @@ static int do_dentry_open(struct file *f,
f->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);

file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping);
+ f->f_streamid = 0;

return 0;

diff --git a/include/linux/fs.h b/include/linux/fs.h
index b4d71b5e1ff2..43dde70c1d0d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -631,6 +631,7 @@ struct inode {
};

__u32 i_generation;
+ unsigned int i_streamid;

Since there are only 8 bits of streamid being passed from userspace,
is it possible to declare this as a char and pack it into a hole so
that it doesn't increase the inode size for a functionality that most
people won't be using? Maybe after i_bytes? That could be increased
to unsigned short if needed without increasing the size of the inode.

In the next version, I've retained the int, but made sure the new fields pack better in both struct file and struct inode, basically by filling some existing holes.
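
To illustrate what filling a hole means, here is a minimal sketch using toy
structs. These are not the real struct inode or struct file; the field
names and layout are made up, and the sizes noted in the comments assume a
typical LP64 ABI such as x86-64. A 2-byte field followed by an
8-byte-aligned field leaves padding, and an unsigned int placed in that
padding does not grow the struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy layout, NOT the real struct inode. */
struct toy_before {
    uint64_t blocks;
    uint16_t bytes;
    /* 6 bytes of padding here on x86-64 */
    uint64_t state;
};

struct toy_after {
    uint64_t blocks;
    uint16_t bytes;
    /* 2 bytes of padding, then the new field uses the rest of the hole */
    unsigned int streamid;
    uint64_t state;
};

/* Bytes the new field added to the struct (0 when it fits in the hole). */
static size_t toy_growth(void)
{
    return sizeof(struct toy_after) - sizeof(struct toy_before);
}
```

On x86-64 both structs are 24 bytes, so toy_growth() returns 0. For the
real kernel structs, pahole run against a vmlinux with debug info is the
usual way to find such holes.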

--
Jens Axboe
