Re: [Announce]: Target_Core_Mod/ConfigFS and LIO-Target v3.0 work

From: Vladislav Bolkhovitin
Date: Tue Dec 16 2008 - 13:45:14 EST


James Bottomley wrote:
> On Sat, 2008-12-13 at 12:56 +0100, Bart Van Assche wrote:
>> On Sat, Dec 13, 2008 at 12:18 PM, Nicholas A. Bellinger
>> <nab@xxxxxxxxxxxxxxx> wrote:
>>> Of course I fix bugs when people report them.
>> Things have changed then since the beginning of this year. As anyone
>> can see in the threads I referred to, you have done your best to deny
>> that the crashes and system hangs were caused by LIO, although I had
>> posted exact instructions on how to reproduce the bugs. Regarding
>> kernel integration and subsystem maintainership: one of the important
>> tasks of a maintainer is to verify whether reported bugs are
>> reproducible, and if so, to resolve them. I'm happy none of the
>> current kernel maintainers has the habitude of denying bug reports
>> that are 100% reproducible and which contain exact instructions about
>> how to reproduce the bug.
>
> OK, All of you on this thread, why don't you take time out to step back
> and think about the effects this descent into trench warfare is having
> on your observers.

James,

I'm sorry you had to intervene in such a manner. I don't want to continue the LIO vs SCST fight, but I see some important misunderstandings about SCST in your message, and I feel I need to reply to clear them up.

> 1. You're both saying the other side isn't production ready ...
> it's not a stretch for the rest of us to take this at face
> value ... about both of you.

In http://lkml.org/lkml/2008/12/10/245 I listed the exact reasons why LIO is far from production ready, and I can continue that list. To call things by their real names, LIO is an iSCSI target that has been hurriedly converted into a generic target engine over the past few months and still has a lo-o-ong way to go to complete the conversion. In other words, LIO might be good as an iSCSI target, but as a generic SCSI target engine it simply *does not exist* yet.

As for SCST not being production ready, can Nicholas Bellinger support his claims against SCST with something concrete? So far, everything he has written has been empty words unsupported by any real facts. For instance, he has failed to explain what all those features "missing" from SCST are needed for.

> 2. This ideological opposition to features the other side
> implements tells me that if it came to a choice, by going with
> either one of you I'd get an incomplete feature set.

There's no ideological opposition between SCST and LIO. Both engines are built around basically the same ideology. The opposition lies in a completely different, non-technical area.

> 3. Making obvious partisans of your user base also tells me that if
> I had to make a choice, whatever it was I'd piss off a large
> number of people who'd be very vocal about it.

Unfortunately, being based on an Open Source product isn't something many people want to be proud of.

But here is a list of companies, taken from the scst-devel mailing list, that are working on SCST-based products and have made contributions in the past half a year:

@storwize.com
@open-e.com
@enjellic.com

Earlier there were also contributions from @hp.com and @systemfabricworks.com.

Also, I've already mentioned Mellanox, who developed the SRP target driver and now sell a product based on it.

There is also a target driver for Marvell SAS hardware being developed by an anonymous company; see http://sourceforge.net/mailarchive/message.php?msg_id=e938503f0809260211r2d4ec37bt293c75c80960eadd%40mail.gmail.com

If you need more, I'll ask permission from companies that are already selling SCST-based products (BTW, 2 of them are user space VTLs, which could have been built on STGT, but those companies chose SCST).

It's worth noting here that the scst-devel mailing list has 134 subscribers, many of them from well-known storage-related companies. Unfortunately, the other sf.net statistics constantly lose data and hence are not trustworthy, so I can't refer to them.

> So stop fighting ... you're not going to backstab your way to inclusion.

> The only identified failing of STGT (and it's theoretical, not
> demonstrated, although I can agree the theory looks correct) is that the
> user space packet processing may cause performance problems on high
> speed networks. We know from practical tests that these networks have
> to be above 1Gbit because the results were identical for STGT and SCST
> on a 1G network, so it's infiniband or 10Gbit ethernet.

I thought that the SRP measurements in http://lkml.org/lkml/2008/12/10/245 were sufficient to remove all your doubts. If you don't object, let me remind you: there was a >50% improvement in IOPS on 4K writes (~150K vs ~100K), which corresponds to a >200MB/s throughput increase, when processing was moved, where possible, from kernel threads to tasklets. STGT by design cannot move any processing to tasklets, context switches between user space threads are a bit heavier than between kernel threads, and STGT has some syscall entry/exit overhead. Hence, for the same processing done in STGT, the difference would be even larger.

Thus, those measurements give a lower bound on the performance increase. Such a huge increase at a 4K block size is a big advantage for any latency-bound application, such as a database.
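
As a rough check of the arithmetic, here is a minimal sketch that converts the IOPS figures into throughput. It assumes the rounded values quoted in this message (~100K and ~150K IOPS), a block size of exactly 4096 bytes, and decimal megabytes; the function and variable names are made up for illustration and do not come from the measurements themselves.

```python
# Illustrative only: rounded figures from this message, not raw measurement data.
BLOCK_SIZE_BYTES = 4 * 1024  # assumed: "4K" means 4096 bytes


def throughput_mb_per_s(iops, block_size=BLOCK_SIZE_BYTES):
    """Convert an IOPS figure to throughput in decimal MB/s."""
    return iops * block_size / 1e6


iops_threads = 100_000   # ~100K IOPS, processing in kernel threads
iops_tasklets = 150_000  # ~150K IOPS, processing moved to tasklets

gain_iops = iops_tasklets - iops_threads
relative_gain = gain_iops / iops_threads
throughput_gain = throughput_mb_per_s(gain_iops)

print(f"relative IOPS gain: {relative_gain:.0%}")          # 50%
print(f"throughput gain:    {throughput_gain:.1f} MB/s")   # 204.8 MB/s
```

With the rounded inputs this gives 50% and about 205 MB/s, which agrees with the ">50%" and ">200MB/s" figures if the measured values were slightly above the rounded ones.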

What else should we do to convince you?

Also, I can't understand why you don't want to count the architectural advantages of SCST over STGT, namely overall simplicity and the possibility of implementing many features that are impossible for STGT, like complete pass-through and zero-copy cache IO. In fact, one such feature has already been implemented: zero-copy transmit in the iSCSI target. From user space this is impossible, but in the kernel I implemented it with a very small and simple patch.

> So, what it comes down to is that if we had a kernel side protocol
> accelerator for STGT, the project would no longer suffer from this
> theoretical failing. *Both* of you have such a thing embedded in your
> respective submissions (all 74k LOC of them) so can't you just enhance
> STGT with whichever one is better ... actually, if you'd both bury the
> hatchet and work on the enhancement together taking the best of each
> project, we'd have something that worked much better and a unified user
> base and neither side would be able to claim sole credit ... just a
> thought.

James, just think of SCST in its current state as STGT with all the possible enhancements already incorporated. It has simply been cooking outside of the kernel for too long, so you didn't see the intermediate steps. I'm not joking. I'm absolutely serious. And it is true. While developing the scst_user module I carefully studied STGT, and scst_user has everything it could take from it.

When you ask us to improve STGT step by step and implement a kernel side protocol accelerator for it, you are asking us to go back 2+ years. For kernel side acceleration, STGT needs to move the SCSI target state machine and memory management into the kernel, which effectively means converting it into SCST. What should I do to make this clear to you?

Also, the current integration of STGT with the Linux (initiator) SCSI subsystem should have a better design; I explained why in http://lkml.org/lkml/2008/12/10/245. The SCSI initiator and target have almost nothing to share, so they should be separated.

I am always open to any possible cooperation. In particular, I'm always willing to make any necessary changes to SCST that will lead to a better target engine in Linux. But before making any change I, like any sane engineer, need answers to a few simple questions. Basically, there are 2 such questions:

1. What is the proposed change needed for? I.e., which real-life task is it going to solve?

2. Why is the proposed change the best one among possible implementation alternatives?

If you simply take the combined SCST patch from http://scst.sourceforge.net/patches/scst_combined.patch, which has all 23 patches I submitted combined in a single file (BTW, it has 46K LOC, not 76K), apply it to a 2.6.27 tree and spend a little time looking at it, you will soon find out that converting STGT to SCST is the worst possible alternative. Simply try to find places where the STGT in-kernel core is better than the SCST core, or has a feature that the SCST core doesn't have. There is only one such feature: OSD support, i.e. bidirectional transfers, large CDBs, etc. It hasn't been implemented in SCST so far because there has been no demand for it (hence, no way to test it). But (1) this feature doesn't have any in-kernel user, so nobody would be affected if STGT became user space only, and (2) it would not be hard to add that feature to SCST if there is such demand.

I have been closely following the development of both STGT and LIO since their beginning, so my words are based on close examination of their source code, not on a refusal to look at it. They are both inferior to SCST in all main areas. I believe there is no point in spending time improving the kernel side of STGT. It would be better to put the effort into integrating the user space part of STGT with the scst_local SCST module, as I described in http://lkml.org/lkml/2008/12/10/245. If you don't agree with me, can you answer question (2) above, please?

From everything I know, SCST is at the moment the best open source SCSI target engine in the world, and no other target engine, including Solaris's COMSTAR, can match it in functionality, performance and stability.

James, you are being offered already *completed* work, in which everything possible to improve STGT has already been done, so why not simply accept it?

I'm an engineer, not a salesman, and there are no salesmen in the SCST team to advertise it. We believe that the source code, its quality, performance and feature completeness should speak for themselves. That is how it has been in Linux so far, and we hope it will be so in this case. Just let the code speak!

Sorry for taking your time with one more huge e-mail. I did my best to be as laconic as possible.

Thanks,
Vlad