Re: Block Layer File/FS Replication?

From: Malcolm Beattie (mbeattie@sable.ox.ac.uk)
Date: Thu Jul 27 2000 - 06:36:54 EST


Jeff McNeil writes:
> I'm trying to implement file system replication at the block level. So
> far, from what I can see, Coda is my only way of going. While Intermezzo
> looks to do exactly what I want, it is not mature enough yet.
>
> In the interim, I've been thinking about writing a quick block-level
> replication module. But I'll be damned if I can decide on the best way
> of doing so.

I don't know if it's what you're looking for, but I wrote bmigrate
which lets you migrate/sync live block devices to a remote system.
The difference between bmigrate and doing something like raid1 over
nbd is that bmigrate does the extra writes asynchronously. This has
the disadvantage that you're not guaranteed that the replicated device
is up to date at all times but it has the advantage that there's no
drop in performance at all.

What happens is that the kernel writes each block request (the
major, minor, sector and nr_sectors but not the data itself) to the
kernel side of a bufflink device (a fast lightweight kernel/userland
communication thing I wrote). A userland daemon, breqd, continuously
reads from the userland side of the bufflink device and updates a
(configurable per-device) bitmap file of dirty blocks on the device.
When you want to migrate the block device to another system, you run
bmigrate which reads from the bitmap file and for each dirty bit,
resets it, reads the dirty block and sends it to the remote side
(where a simple breceive daemon reads the data and writes it to the
remote block device). You can keep bmigrate running all the time if
you wish: it still doesn't harm performance much, since it is
(intentionally) allowed to fall out of sync and catch up
later. bmigrate keeps making passes through the dirty bitmap as
necessary. When there are only a few dirty blocks left, you turn off
the software which is actually using the block device, wait a second
or two while bmigrate ships the last few blocks over to the remote
side and then restart the waiting software on the remote system which
now has a carbon copy of the old block device.
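
To make the flow above concrete, here is a rough userland sketch in
Python of the two halves: marking dirty bits from logged requests (the
breqd side) and one pass over the bitmap (the bmigrate side). It is
only an illustration, not the actual code. The record layout (four
32-bit integers), the block size and the names mark_dirty,
migrate_pass, read_block and send_block are all assumptions made up
for the example.

```python
import struct

# Assumed record layout: major, minor, sector, nr_sectors as four unsigned
# 32-bit little-endian integers. The real bufflink format may well differ.
REQ = struct.Struct("<IIII")
SECTORS_PER_BLOCK = 8  # assumed: 4 KiB blocks made of 512-byte sectors


def mark_dirty(bitmap, record):
    """Per logged request (breqd-like): set the bit of every block touched."""
    _major, _minor, sector, nr_sectors = REQ.unpack(record)
    first = sector // SECTORS_PER_BLOCK
    last = (sector + nr_sectors - 1) // SECTORS_PER_BLOCK
    for block in range(first, last + 1):
        bitmap[block // 8] |= 1 << (block % 8)


def migrate_pass(bitmap, read_block, send_block):
    """One bmigrate-like pass: reset each dirty bit, read the block, ship it."""
    shipped = 0
    for block in range(len(bitmap) * 8):
        mask = 1 << (block % 8)
        if bitmap[block // 8] & mask:
            # Reset the bit before reading, so that a write landing during
            # the copy marks the block again and a later pass resends it.
            bitmap[block // 8] &= ~mask
            send_block(block, read_block(block))
            shipped += 1
    return shipped
```

In this sketch bitmap is a bytearray, read_block(n) returns the
contents of block n, and send_block(n, data) ships it to the remote
side. A caller would repeat migrate_pass until it returns a small
number, stop whatever is using the device, and run one final pass.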

It's not suitable for all situations but it should make a useful
failover system for some services.

See
    http://users.ox.ac.uk/~mbeattie/linux-kernel.html

if you are interested in bufflink, reqlog and bmigrate.

--Malcolm

-- 
Malcolm Beattie <mbeattie@sable.ox.ac.uk>
Unix Systems Programmer
Oxford University Computing Services
