Re: Extensions to HFS filesystem

Paul H. Hargrove (hargrove@sccm.stanford.edu)
Wed, 29 May 1996 22:49:54 -0700


Matthias Urlichs wrote:
>
> In linux.dev.kernel, article <31A455D7.794BDF32@sccm.stanford.edu>,
> "Paul H. Hargrove" <hargrove@sccm.stanford.edu> writes:
> >
> > The idea is sound except that under HFS there is no way to locate a file
> > by its CNID (the equivalent of an inode number). This means that the only
>
> Yes there is. The Mac equivalent of symlinks (alias files (perfectly normal
> files with the Alias bit set and an 'alis' resource inside)) do this.
>
> The way this works is that the OS stores a special record in the catalog
> B-tree when you create the alias. That record survives moving or renaming
> the file.

I am aware of the file threads that are created by the alias manager. They
only survive moving and renaming if the file is moved/renamed with MacOS 7.0
or newer. System 6 doesn't deal with them, Executor doesn't deal with them,
I don't yet deal with them, Rob Leslie's hfsutils don't yet deal with them,
NeXT's HFS filesystem is pre-System 7 so probably doesn't deal with them...
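
For what it's worth, the lookup that those thread records make possible goes
roughly like this. This is only a sketch from my reading of Inside Macintosh;
hfs_bfind() and all of the field names are my own inventions, not code that
exists anywhere:

    #include <string.h>

    typedef unsigned char u8;
    typedef unsigned long u32;

    struct hfs_cat_key {        /* catalog B-tree key */
            u8  key_len;
            u8  reserved;
            u32 parent_id;      /* normally the parent's CNID; for a
                                   thread record, the file's own CNID */
            u8  name[32];       /* Str31; empty for a thread record */
    };

    struct hfs_fthd_rec {       /* file thread record */
            u8  cdr_type;       /* == 4 for a file thread */
            u8  cdr_resrv;
            u32 thd_resrv[2];
            u32 parent_id;      /* where the file really lives... */
            u8  name[32];       /* ...and what it is really called */
    };

    int hfs_bfind(struct hfs_cat_key *key, void *rec); /* hypothetical */

    int hfs_lookup_by_cnid(u32 cnid, struct hfs_cat_key *result)
    {
            struct hfs_cat_key key;
            struct hfs_fthd_rec thd;

            memset(&key, 0, sizeof(key));
            key.parent_id = cnid;           /* empty name */
            if (hfs_bfind(&key, &thd) < 0 || thd.cdr_type != 4)
                    return -1;              /* no thread record */
            result->parent_id = thd.parent_id;
            memcpy(result->name, thd.name, sizeof(thd.name));
            return 0;
    }

So it costs a second B-tree search, and it works only if the thread record is
there in the first place, which is exactly the System 7 dependence above.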

There is also the fact that a file thread must be created for every file that
one needs to locate by CNID, though this isn't really all that bad.

I don't think it is worth keeping a list of "reverse links" for the purpose of
validating link counts when the implementation of an inode tree (which helps to
solve a variety of problems) will address the same problem. At boot time fsck.hfs
will check that the inode tree doesn't hold entries for files that no longer
exist, at the same time that it checks that the extents tree doesn't have extents
for files that no longer exist. All of the link "pointers" that make up the
presumed link count for a "link target" will have entries in the inode tree. If
any of these need to be removed because the corresponding HFS file has gone away,
then the link count of the target (which lives in the inode tree and therefore
CAN be located by CNID) is reduced by one. (See below for more on what I intend
the fsck.hfs program to do.)
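
In other words, the link-count pass of fsck.hfs would look something like the
sketch below. Every identifier here is made up for illustration; none of this
is written yet:

    typedef unsigned long u32;

    struct itree_entry {        /* hypothetical inode-tree entry */
            u32 cnid;
            u32 target_cnid;    /* 0 unless this entry is a "pointer" */
            u32 link_count;     /* meaningful only for "link targets" */
    };

    void check_link_counts(void)
    {
            struct itree_entry *e, *target;

            for_each_itree_entry(e) {       /* hypothetical iterator */
                    if (!e->target_cnid)
                            continue;       /* not a "pointer" */
                    if (hfs_cat_exists(e->cnid))
                            continue;       /* HFS file still there */
                    /* The HFS file behind this "pointer" is gone:
                       drop the entry and fix up the target, which
                       lives in the inode tree and so CAN be found
                       by CNID. */
                    target = itree_lookup(e->target_cnid);
                    target->link_count--;
                    itree_remove(e);
            }
    }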

> > way to do the checking you describe is by brute force: checking every file
> > on the disk to see if it is the "pointer" file. If this is to be done once
> > for every boot, the work might as well be done by fsck.hfs which should be
> > examining every single file anyway.
> >
> fsck.hfs needs to read the catalog and extent Btrees (and their extents).
> That's not too much work. Scanning every single file isn't necessary any
> more than it is for fsck.ext2.

I didn't mean that fsck.hfs should examine the CONTENTS of every file on the
disk, but that the CATALOG ENTRIES for every file should be examined to see if
they make sense. The number of physical blocks indicated as present in the file
should be checked against the number mapped by the catalog entry and the extents
tree. These extents should be checked to ensure that multiple files are not
using the same block, etc... The same sort of stuff that fsck.ext2 does. If an
Extended HFS filesystem is being checked then the extra data in the inode tree
should also be checked for consistency. This would include seeing that the link
counts match up, something that fsck.ext2 also does. If a hidden "link target"
has a link count of zero (no corresponding "pointers") then it is removed, much
as fsck.ext2 will free a block that is marked as allocated but not used by any
file. (Perhaps it would make sense to put the file into lost+found, as fsck.ext2
does with an inode that is in use but not referenced by any directory entry.)
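
Roughly, I picture the extent walk building a shadow bitmap and cross-checking
everything against it, something like this (all of the helpers and types are
invented for illustration):

    typedef unsigned long u32;

    void check_extents(void)
    {
            struct cat_file *file;
            struct extent *ext;
            u32 blk, seen;

            for_each_cat_entry(file) {
                    seen = 0;
                    for_each_extent(file, ext) {
                            for (blk = ext->start;
                                 blk < ext->start + ext->count;
                                 blk++, seen++)
                                    /* a bit already set means two
                                       files claim the same block */
                                    if (map_test_and_set(shadow_map, blk))
                                            report_cross_alloc(file, blk);
                    }
                    if (seen != file->phys_blocks)
                            report_bad_count(file);
            }
    }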

> Note that doing an fsck-hfs which is able to correct problems is _hard_.
> I'd much rather have one that's read-only and says "Sorry. please take the
> disk to a Mac and run Norton Disk Doctor on it." when it finds a
> possibly-critical error.

Some problems are easier to fix than others: blocks marked used but not
appearing in any file, extents for deleted files (like FAT lost clusters),
incorrect summary information, etc. All of these can happen when a filesystem
is not unmounted cleanly, so the pending updates are never flushed to disk.
More difficult problems, such as blocks belonging to multiple files or multiple
files with the same CNID, are another matter. For those it might be best to
tell the user to use another OS to repair the problem.
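
For the easy cases the fix is mechanical. Given the shadow bitmap from an
extent walk like the one sketched earlier, repairing the volume bitmap and the
free-block count might look like this (drNmAlBlks and drFreeBks are real MDB
fields; the bitmap helpers are not):

    typedef unsigned long u32;

    void fix_bitmap_and_summary(struct hfs_mdb *mdb)
    {
            u32 i, free_blks = 0;
            int used;

            for (i = 0; i < mdb->drNmAlBlks; i++) {
                    used = map_test(shadow_map, i);
                    /* clears blocks marked used but in no file,
                       including the blocks of lost extents */
                    if (map_test(disk_map, i) != used)
                            map_set(disk_map, i, used);
                    if (!used)
                            free_blks++;
            }
            if (mdb->drFreeBks != free_blks)        /* bad summary */
                    mdb->drFreeBks = free_blks;
    }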

I don't plan on making a /usr/local/bin/wp that prints "Please reboot into DOS
to use WordPerfect." Likewise I don't want to give the user the impression that
Linux is inferior to MacOS by simply dismissing the filesystem repair task as
too difficult for a Linux program.

> --
> The advice you give a kid is considered dumb until he gets the same advice
> from another kid.
> -- Doug Larson
> --
> Matthias Urlichs

-- 
Paul H. Hargrove                   All material not otherwise attributed
hargrove@sccm.stanford.edu         is the opinion of the author or a typo.
P.S. If I sound cranky or incoherent above it is because I am rather short of
sleep right now.  Please don't take anything I say personally.