Re: clustered MD

From: Goldwyn Rodrigues
Date: Tue Jun 09 2015 - 23:33:24 EST




On 06/09/2015 03:30 PM, David Teigland wrote:
On Tue, Jun 09, 2015 at 03:08:11PM -0500, Goldwyn Rodrigues wrote:
Hi David,

On 06/09/2015 02:45 PM, David Teigland wrote:
On Tue, Jun 09, 2015 at 02:26:25PM -0500, Goldwyn Rodrigues wrote:
On 06/09/2015 01:22 PM, David Teigland wrote:
I've just noticed the existence of clustered MD for the first time.
It is a major new user of the dlm, and I have some doubts about it.
When did this appear on the mailing list for review?

It first appeared in December 2014 on the RAID mailing list.
http://marc.info/?l=linux-raid&m=141891941330336&w=2

I don't read that mailing list. Searching my archives of linux-kernel, it
has never been mentioned. I can't even find an email for the md pull
request that included it.

Is this what you are looking for?
http://marc.info/?l=linux-kernel&m=142976971510061&w=2

Yes, I guess gmail lost it, or put it in spam.

- "experimental" code for managing md/raid1 across a cluster using
DLM. Code is not ready for general use and triggers a WARNING if
used. However it is looking good and mostly done and having in
mainline will help co-ordinate development.

That falls far short of the bar for adding it to the kernel. It not only
needs to work, it needs to be reviewed and justified, usually by showing

Why do you say it does not work?

It's just my abbreviation of that summary paragraph.

It did go through its round of reviews on the RAID mailing list. I
understand that you missed it because you are not subscribed to the raid
mailing list.

I will look for that.

some real world utility to warrant the potential maintenance effort.

We do have a valid real world utility. It is to provide
high-availability of RAID1 storage over the cluster. The
distributed locking is required only during cases of error and
superblock updates and is not required during normal operations,
which makes it fast enough for usual case scenarios.

That's the theory, how much evidence do you have of that in practice?

We wanted to develop a solution which is lock free (or at least uses
minimal locking) for the most common/frequent usage scenario. Also, we
measured it with iozone on top of ocfs2 and found that it is very close
to local device performance numbers. We compared it with cLVM mirroring
and found it better as well. However, in the future we would want to use
it with other RAID (10?) scenarios, which are missing now.


What are the doubts you have about it?

Before I begin reviewing the implementation, I'd like to better understand
what it is about the existing raid1 that doesn't work correctly for what
you'd like to do with it, i.e. I don't know what the problem is.


David Lang has already responded: The idea is to use a RAID device (currently only level 1 mirroring is supported) with multiple nodes of the cluster.

Here is a description of how to use it:
http://marc.info/?l=linux-raid&m=141935561418770&w=2

--
Goldwyn