Re: journaling filesystem

Cloyce D. Spradling (
Thu, 14 May 1998 07:08:11 -0500

On Wed, May 13, 1998 at 08:05:07PM -0400, Albert D. Cahalan wrote:

: > On AIX, the journal is kept on a separate journal device, which is
: > common to all the mounted filesystems.

I wasn't going to pipe up before this, because it really wasn't
important. Now it's been misinterpreted, so it's important. :)

AIX has one JFS log per volume group, and that log is shared between all
of the filesystems in that volume group. Since we don't have a logical
volume manager under Linux (well, md is close in a couple of ways), each
volume group is effectively one disk.
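
To make the sharing concrete, here is a minimal Python sketch of one log per
volume group with every record tagged by the filesystem it belongs to. This is
purely an illustration, not AIX code; the names SharedLog, append, and
records_for are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SharedLog:
    """Toy model: one log per volume group, shared by its filesystems."""
    records: list = field(default_factory=list)

    def append(self, fs_name, change):
        # Tag each record so replay can tell which filesystem it belongs to.
        self.records.append((fs_name, change))

    def records_for(self, fs_name):
        # Replaying one filesystem only needs its own records, in log order.
        return [change for name, change in self.records if name == fs_name]


log = SharedLog()
log.append("/home", "alloc inode 12")
log.append("/usr", "alloc inode 7")
log.append("/home", "link inode 12 -> dir 2")
print(log.records_for("/home"))
```

The point is only that a shared log has to carry enough information to pull
one filesystem's records back out again.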

: This has many advantages.

: 1. does not harm compatibility with existing filesystems & kernels

We're talking about an entirely new filesystem type here, right? I don't
see any compatibility issues with this at all, in any case.

: 2. can put the log on a tape

This would be *baaaad*, and is also unnecessary. I certainly don't want
my disk write performance to be gated by the speed/reliability of my
tape drive. It's unnecessary because the logs aren't all that big. In
fact, I'm pretty sure that no JFS log (under AIX) ever has more than
one physical partition allocated to it.

JARGON WARNING (and small digression): an AIX "physical partition" is not a
"disk partition" in the Linux sense. A physical partition is just a 4, 8, 16,
or 32MB chunk of disk space from the LVM. The size varies depending on the size
of the disks in the volume group and the maximum size of filesystem you want to
be able to create; most of my VGs (1 or 2 disks <= 4.3 GB each) have a physical
partition size of 4MB, while (for example) my VG of five 9.1GB disks has a
physical partition size of 16MB. The jfslog is just another type of logical
volume (okay, *now* think partition here :).
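
For a rough feel of the numbers, here is a back-of-the-envelope sketch in
Python. It assumes 1 GB = 1000 MB and simply ignores any leftover space; the
function name is made up, and the real LVM accounting may differ.

```python
def physical_partitions(disk_mb, pp_mb):
    # Count whole physical partitions only; leftover space is ignored here.
    return disk_mb // pp_mb


small = physical_partitions(4300, 4)   # one 4.3 GB disk, 4MB partitions
big = physical_partitions(9100, 16)    # one 9.1 GB disk, 16MB partitions
print(small, big)  # prints "1075 568"
```

Either way, a log that fits in a single physical partition is tiny compared
to the disk it lives on.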

: 3. can put the log in non-volatile RAM

Chances are, if you've got NVRAM it's probably on a RAID card or something
like a PrestoServ NFS enhancer, so all writes will probably go right to it.
I don't see any need to treat log writes specially in this regard.

: 4. can use one fast disk as the log for a dozen slow disks

That's true, but what happens when the disk with the log fails while the FS
itself is actually okay? Do we throw out the baby with the bathwater? This
scenario is possible with AIX, but I've never been able to make it happen.

: 5. can run multiple logs for redundancy

This is probably the solution for #4.
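
A minimal sketch of what I mean, in Python. Everything here (LogDevice,
MirroredLog, the failure flag) is hypothetical and only illustrates the idea
of writing each record to more than one log so that losing one log disk
doesn't lose the log.

```python
class LogDevice:
    """Hypothetical stand-in for a log device that can fail."""

    def __init__(self, name):
        self.name = name
        self.records = []
        self.failed = False

    def write(self, record):
        if self.failed:
            raise OSError(f"log device {self.name} is dead")
        self.records.append(record)


class MirroredLog:
    def __init__(self, devices):
        self.devices = devices

    def append(self, record):
        # Write to every copy; succeed if at least one copy took the record.
        ok = 0
        for dev in self.devices:
            try:
                dev.write(record)
                ok += 1
            except OSError:
                pass
        if ok == 0:
            raise OSError("all log copies failed")

    def surviving_records(self):
        # Recover from the healthy copy that holds the most records.
        alive = [d for d in self.devices if not d.failed]
        if not alive:
            raise OSError("no log copy survived")
        return max(alive, key=lambda d: len(d.records)).records


fast, spare = LogDevice("fast"), LogDevice("spare")
log = MirroredLog([fast, spare])
log.append("alloc inode 12")
fast.failed = True                 # the fast log disk dies
log.append("free block 99")        # still lands on the spare
print(log.surviving_records())
```

The cost is that every log write now goes to each copy, so the "one fast
disk" advantage from #4 only holds if all the copies are fast.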

: If you put the log on tape and started a second tape before the
: first one finished, you could keep changing tapes. You'd end up
: with a huge tape library that has a log of every filesystem change
: ever made. Add timestamps and user IDs for a solid audit trail.

Ugh. As much of an information packrat as I am, I think that'd be
a little too much info. I mean, what about merged writes, and stuff
like that? The log needs to be written quickly, and should
probably have a fixed format.

BTW, for the record (to the best of my knowledge, that is), AIX's JFS log
only records changes in filesystem metadata, not user data. AIX's syncd
flushes stuff to disk every 60 secs. And while I suppose it's possible,
I've never (in the last 3.5 years) detected a case in which user data was
corrupted or lost. But then again, I don't work in support. :)
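
To show what "metadata only" implies, here is a toy model in Python. It is
not how AIX implements anything; ToyFS and its methods are invented for the
example. Metadata changes go to the journal right away, while user data sits
in memory until a syncd-style flush.

```python
class ToyFS:
    def __init__(self):
        self.journal = []      # "on disk": metadata records, written at once
        self.disk_data = {}    # "on disk": user data that has been flushed
        self.dirty_data = {}   # in memory only until the next sync

    def write(self, name, data):
        # The metadata change (the new file size) is journaled immediately...
        self.journal.append((name, len(data)))
        # ...but the data itself just waits in memory for now.
        self.dirty_data[name] = data

    def sync(self):
        # What a periodic flush would do: push dirty data out to disk.
        self.disk_data.update(self.dirty_data)
        self.dirty_data.clear()

    def crash_and_recover(self):
        # Memory contents are lost; metadata is rebuilt by replaying the journal.
        self.dirty_data = {}
        sizes = {}
        for name, size in self.journal:
            sizes[name] = size
        return sizes


fs = ToyFS()
fs.write("a", b"hello")
fs.sync()
fs.write("b", b"world!")       # crash happens before the next sync
sizes = fs.crash_and_recover()
print(sizes, sorted(fs.disk_data))
```

After the crash the journal still knows about both files, but only the data
for "a" made it to disk. So the filesystem structure stays consistent while
recently written user data can, in principle, go missing.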

Oh yeah, I'm also not speaking for IBM.

