Re: Integration of SCST in the mainstream Linux kernel

From: Nicholas A. Bellinger
Date: Thu Jan 31 2008 - 12:21:40 EST


On Thu, 2008-01-31 at 18:50 +0300, Vladislav Bolkhovitin wrote:
> Bart Van Assche wrote:
> > On Jan 31, 2008 2:25 PM, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
> >
> >>Since this particular code is located in a non-data path critical
> >>section, the kernel vs. user discussion is a wash. If we are talking
> >>about data path, yes, the relevance of DD tests in kernel designs are
> >>suspect :p. For those IB testers who are interested, perhaps having a
> >>look with disktest from the Linux Test Project would give a better
> >>comparision between the two implementations on a RDMA capable fabric
> >>like IB for best case performance. I think everyone is interested in
> >>seeing just how much data path overhead exists between userspace and
> >>kernel space in typical and heavy workloads, if if this overhead can be
> >>minimized to make userspace a better option for some of this very
> >>complex code.
> >
> > I can run disktest on the same setups I ran dd on. This will take some
> > time however.
>
> Disktest was already referenced in the beginning of the performance
> comparison thread, but its results are not very interesting if we are
> going to find out, which implementation is more effective, because in
> the modes, in which usually people run this utility, it produces latency
> insensitive workload (multiple threads working in parallel). So, such
> multithreaded disktests results will be different between STGT and SCST
> only if STGT's implementation will get target CPU bound. If CPU on the
> target is powerful enough, even extra busy loops in the STGT or SCST hot
> path code will change nothing.
>

I think the really interesting numbers are the difference for bulk I/O
between kernel and userspace on both traditional iSCSI and the RDMA
enabled flavours. I have not been able to determine anything earth
shattering from the current run of kernel vs. userspace tests, nor which
method of implementation for iSER, SRP, and generic Storage Engine are
'more effective' for that case. Performance and latency to real storage
would make alot more sense for the kernel vs. user case. Also workloads
against software LVM and Linux MD block devices would be of interest as
these would be some of the more typical deployments that would be in the
field, and is what Linux-iSCSI.org uses for our production cluster
storage today.

Having implemented my own iSCSI and SCSI Target mode Storage Engine
leads me to believe that putting logic in userspace is probably a good
idea in the longterm. If this means putting the entire data IO path
into userspace for Linux/iSCSI, then there needs to be a good reason why
this will not not scale to multi-port 10 Gb/sec engines in traditional
and RDMA mode if we need to take this codepath back into the kernel.
The end goal is to have the most polished and complete storage engine
and iSCSI stacks designs go upstream, which is something I think we can
all agree on.

Also, with STGT being a pretty new design which has not undergone alot
of optimization, perhaps profiling both pieces of code against similar
tests would give us a better idea of where userspace bottlenecks reside.
Also, the overhead involved with traditional iSCSI for bulk IO from
kernel / userspace would also be a key concern for a much larger set of
users, as iSER and SRP on IB is a pretty small userbase and will
probably remain small for the near future.

> Additionally, multithreaded disktest over RAM disk is a good example of
> a synthetic benchmark, which has almost no relation with real life
> workloads. But people like it, because it produces nice looking results.
>

Yes, people like to claim their stacks are the fastest with RAM disk
benchmarks. But hooking up their fast network silicon to existing
storage hardware and OS storage subsystems and software is where the
real game is..

> Actually, I don't know what kind of conclusions it is possible to make
> from disktest's results (maybe only how throughput gets bigger or slower
> with increasing number of threads?), it's a good stress test tool,
> but
> not more.
>

Being able to have a best case baseline with disktest for kernel vs.
user would be of interest for both transport protocol and SCSI Target
mode Storage Engine profiling. The first run of tests looked pretty
bandwith oriented, so disktest works well to determine maximum bandwith.
Disktest also is nice for getting reads from cache on hardware RAID
controllers because disktest only generates requests with LBAs from 0 ->
disktest BLOCKSIZE.

--nab

> > Disktest is new to me -- any hints with regard to suitable
> > combinations of command line parameters are welcome. The most recent
> > version I could find on http://ltp.sourceforge.net/ is ltp-20071231.
> >
> > Bart Van Assche.
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/