Re: [RFC PATCH] vhost-blk: An in-kernel accelerator for virtio-blk

From: Liu Yuan
Date: Mon Aug 01 2011 - 02:26:19 EST


On 07/28/2011 11:22 PM, Michael S. Tsirkin wrote:

It would be nicer to reuse the worker infrastructure
from vhost.c. In particular this one ignores cgroups that
the owner belongs to if any.
Does this one do anything that vhost.c doesn't?


The main idea I use a separated thread to handling completion is to decouple the request handling
and the request completion signalling. This might allow better scalability in a IO intensive scenario,
since I noted that virtio driver would allow sumbit as much as 128 requests in one go.

Hmm, I have tried to make signalling thread into a function that is called as a vhost-worker's work.
I didn't see regression in my laptop with iodepth equalling 1,2,3. But requests handling and completion signalling may produce race in a high requests submitting rate. Anyway, I'll adopt it to work as a vhost
worker function in v2.


+ switch (ioctl) {
+ case VHOST_NET_SET_BACKEND:
+ if(copy_from_user(&backend, (void __user *)arg, sizeof backend))
+ break;
+ ret = blk_set_backend(blk,&backend);
+ break;
Please create your own ioctl for this one.


How about change VHOST_NET_SET_BACKEND into VHOST_SET_BACKEND?


static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
struct iocb *iocb, bool compat)
diff --git a/fs/eventfd.c b/fs/eventfd.c
index d9a5917..6343bc9 100644
--- a/fs/eventfd.c
+++ b/fs/eventfd.c
@@ -406,6 +406,7 @@ struct file *eventfd_file_create(unsigned int count, int flags)

return file;
}
+EXPORT_SYMBOL_GPL(eventfd_file_create);
You can avoid the need for this export if you pass
the eventfd in from userspace.


Since eventfd used by completion code is internal and hiding it from hw/vhost_blk.c would simplify
the configuration, I think this exporting is necessary and can get rid of unnecessary FD management
in vhost-blk.c.


SYSCALL_DEFINE2(eventfd2, unsigned int, count, int, flags)
{
diff --git a/include/linux/aio.h b/include/linux/aio.h
index 7a8db41..d63bc04 100644
--- a/include/linux/aio.h
+++ b/include/linux/aio.h
@@ -214,6 +214,37 @@ struct mm_struct;
extern void exit_aio(struct mm_struct *mm);
extern long do_io_submit(aio_context_t ctx_id, long nr,
struct iocb __user *__user *iocbpp, bool compat);
+extern void __put_ioctx(struct kioctx *ctx);
+extern struct kioctx *ioctx_alloc(unsigned nr_events);
+extern struct kiocb *aio_get_req(struct kioctx *ctx);
+extern ssize_t aio_run_iocb(struct kiocb *iocb);
+extern int __aio_run_iocbs(struct kioctx *ctx);
+extern int read_events(struct kioctx *ctx,
+ long min_nr, long nr,
+ struct io_event __user *event,
+ struct timespec __user *timeout);
+extern void io_destroy(struct kioctx *ioctx);
+extern ssize_t aio_setup_iocb(struct kiocb *kiocb, bool compat);
+extern void __put_ioctx(struct kioctx *ctx);
+
+static inline void get_ioctx(struct kioctx *kioctx)
+{
+ BUG_ON(atomic_read(&kioctx->users)<= 0);
+ atomic_inc(&kioctx->users);
+}
+
+static inline int try_get_ioctx(struct kioctx *kioctx)
+{
+ return atomic_inc_not_zero(&kioctx->users);
+}
+
+static inline void put_ioctx(struct kioctx *kioctx)
+{
+ BUG_ON(atomic_read(&kioctx->users)<= 0);
+ if (unlikely(atomic_dec_and_test(&kioctx->users)))
+ __put_ioctx(kioctx);
+}
+
#else
static inline ssize_t wait_on_sync_kiocb(struct kiocb *iocb) { return 0; }
static inline int aio_put_req(struct kiocb *iocb) { return 0; }
--
1.7.5.1

Other comments will be addressed in V2. Thanks

Yuan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/