Re: [PATCH 1b/7] dlm: core locking

From: Daniel Phillips
Date: Tue Apr 26 2005 - 23:22:32 EST

Next message: Benjamin Herrenschmidt: "Re: pci-sysfs resource mmap broken"
Previous message: Evgeniy Polyakov: "Re: [1/1] connector/CBUS: new messaging subsystem. Revision numbernext."
In reply to: Steven Dake: "Re: [PATCH 1b/7] dlm: core locking"
Next in thread: David Teigland: "Re: [PATCH 1b/7] dlm: core locking"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tuesday 26 April 2005 21:50, Steven Dake wrote:
> Daniel
>
> The issue is virtual synchrony for dlm, not virtual synchrony for
> synchronization of your block device work.

Honestly, at 250 uSec/message you don't have a hope of convincing anybody to
use virtual synchrony in a dlm for anything other than slow path recovery.
And even the performance requirements for recovery are more stringent than
you would think.

> However, since it appears you would like to address it...

Actually, I just pointed out a practical opportunity for you to demonstrate
the advantages of virtual synchrony in this context.

> While we could talk about your
> particular design, virtual synchrony can synchronize at near wire speed
> for larger messages with encryption.

Maybe so, but the short message case is vastly more important. And you would
actually have to work at it to fall much below wire speed on long messages.

> In this case, there would be
> benefit to synchronizing larger blocks of data at once, instead of
> individual (512 byte?) blocks of messages.

That is exactly what I do, of course. More specifically, I let Linux do it
for me. My average synchronization message size is 20 bytes including
header. I also have very low latency because the wrapper on the raw tcp is
thin. Maybe I should strip away the last layer and ride right on the
ethernet transport, to win another 10%. Not that I really need it. And not
that not really needing it means I am prepared to give any up!

> I really doubt you would see
> any performance improvement in the pure synchronization, although you
> would get encryption and authentication of messages, and you would not
> have those pesky race conditions.

What race conditions? If you can spot any unhandled races, please let me know
as soon as possible.

> So, there would be security, at little cost to performance.

Security has not yet become an issue for shared-disk filesystems, since any
node with root basically owns the shared disk. Hopefully, one day we will do
something about that[1]. Even then chances are, only the packet headers need
encrypting (or signing, more likely).

> Now on the topic of race conditions. Any system with race conditions
> will eventually (in some short interval of operation) fail. Are you
> suggesting that data is entrusted to a system with race conditions that
> result in failures in short intervals?

Surely you mean unhandled races? Please show me any in my code, and I will be
eternally indebted.

> But now back to the original point:
>
> The context we are talking about here is dlm. In the dlm case, I find
> it highly unlikely the posted patches can process 100,000 locks per
> second (between processors) as your claim would seem to suggest.

Read the code and prepare to be amazed ;-)

> As the benchmarks have not been posted, its hard to see.

I posted some results for ddraid earlier, here and on linux-cluster. There
will be charts and graphs as I get time. Anybody who is impatient can grab
the code and make charts for themselves. (But beware: the cluster snapshot
has a nasty bottleneck in the server, due to no parallel disk IO. The
cluster raid is ok.)

> If the benchmarks
> are beyond the 15,000 locks per second that could easily be processed with
> virtual synchrony with an average speed processor, then please, correct
> me.

Consider yourself corrected.

> Post dlm benchmarks Daniel...

I'm dancing as fast as I can ;-)

[1] I already lost a rather expensive bet re cluster security becoming a
checkbox item within the past year.

Regards,

Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Benjamin Herrenschmidt: "Re: pci-sysfs resource mmap broken"
Previous message: Evgeniy Polyakov: "Re: [1/1] connector/CBUS: new messaging subsystem. Revision numbernext."
In reply to: Steven Dake: "Re: [PATCH 1b/7] dlm: core locking"
Next in thread: David Teigland: "Re: [PATCH 1b/7] dlm: core locking"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]