Re: dcache questions

David C Niemi (niemi@tux.org)
Tue, 30 Dec 1997 13:24:43 -0500 (EST)


Linus Torvalds (torvalds@transmeta.com) wrote:
...
> No, as you found out that doesn't work. The VFS really depends on the
> dentries existing.
>
> The real solution is as far as I can tell:
> - don't allow the short aliases at all - only show LongFileName.exe,
> and never show or accept longfi~1.exe. This is the quick and
> dirty fix. Not recommended, and probably hard.

It's quite important to show (and look up) the ugly little short names, as
they need to be unique and sometimes that is the only path you happen to
know.

> - make the filesystem-specific hashing and name comparing functions
> compare the two names as equal - so that only one dentry exists for
> both of them. This is the "real" fix, but depending on how the name
> mangling works it may not be all that easy.

This really can't be reliably done via name-mangling, because the mangling
is not only fairly complex but also nondeterministic (the ~1 can be a ~2
depending on what else is in the directory, and when you get past ~9 you
have to subtract another letter from the name fragment at the front).
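
To make the dependence on directory contents concrete, here is a minimal sketch of alias generation. It is a simplification under stated assumptions, not the algorithm of any particular VFAT implementation: `short_alias` and its `existing` parameter are hypothetical names, and real implementations also deal with illegal characters, code pages, and different strategies for crowded directories.

```python
def short_alias(long_name, existing):
    """Pick an 8.3 alias for long_name that is not already in `existing`.

    Simplified, hypothetical sketch: only alphanumerics are kept, and the
    numeric tail simply counts upward until a free alias is found.
    """
    stem, dot, ext = long_name.rpartition(".")
    if not dot:                        # no extension at all
        stem, ext = long_name, ""
    stem = "".join(c for c in stem.upper() if c.isalnum())
    ext = "".join(c for c in ext.upper() if c.isalnum())[:3]
    n = 1
    while True:
        tail = "~%d" % n
        # ~1..~9 leave room for 6 letters; ~10 and up leave room for only 5.
        base = stem[:8 - len(tail)] + tail
        alias = base + ("." + ext if ext else "")
        if alias not in existing:
            return alias
        n += 1
```

The same long name gets a different alias depending on what is already in the directory, so the alias cannot be recomputed from the long name alone:

```python
short_alias("LongFileName.exe", set())              # LONGFI~1.EXE
short_alias("LongFileName.exe", {"LONGFI~1.EXE"})   # LONGFI~2.EXE
```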

From my past experience (implementing VFAT for Mtools) it seems a big
mistake to assume that the short name can be derived from the long name or
even compared with it for verification. (In fact VFAT stores a 1-byte
checksum of the short name in each long-name entry for use in verification,
though it can of course yield false positives about 0.4% of the time, i.e.
1 in 256).
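
For reference, a sketch of that 1-byte check as the on-disk format is commonly documented: the byte is computed over the 11-byte, space-padded short name (no dot) and stored in each long-name directory entry. The function name `lfn_checksum` is made up for this example.

```python
def lfn_checksum(short_name_11):
    """Checksum over the 11-byte on-disk short name, e.g. b'LONGFI~1EXE'."""
    if len(short_name_11) != 11:
        raise ValueError("expected the 11-byte, space-padded 8.3 form")
    total = 0
    for byte in short_name_11:
        # Rotate the running sum right by one bit, then add the next byte,
        # keeping only the low 8 bits.
        total = (((total & 1) << 7) + (total >> 1) + byte) & 0xFF
    return total
```

Since the result is a single byte, an unrelated short name matches by chance with probability 1/256 (about 0.39%), which is why it can only serve as a sanity check and not as a way to derive one name from the other.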

> The real fix may mean that you can only use the first six characters for
> the hash (and you have to mush together cases, obviously), and then
> making sure the lookup function compares the strange endings correctly
> too.
>
> So the VFS dentry layer does support these kinds of aliases, but it may
> be extremely non-trivial to actually get the compare functions working
> correctly. You may end up having to do pretty much the same as you do
> in the filename lookup function in the name compare code..

The only way to do the compare reliably is to go look up the file by the
name you know and see what other name is associated with *each path
component*.

Many (2^n, where n is the directory depth) dentries would be needed for a
given file to accommodate all of its possible manifestations. Surely we
would not want to automatically add all of them to the dentry cache; and if
only the all-long-name version is added, performance looking up anything
with a short-name component will be pitiful (i.e. completely uncached).
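
The 2^n count can be illustrated with a small sketch that enumerates every spelling of one path. It assumes each component has exactly two names (long and short); components that already fit 8.3 have only one, so 2^n is an upper bound. `path_spellings` and the sample names are hypothetical.

```python
from itertools import product

def path_spellings(components):
    """components: list of (long_name, short_name) pairs, one per path level.

    Returns every path obtained by picking either name at each level.
    """
    return ["/".join(choice) for choice in product(*components)]

spellings = path_spellings([
    ("Program Files", "PROGRA~1"),
    ("LongFileName.exe", "LONGFI~1.EXE"),
])
```

Here `spellings` has 4 entries for a depth of 2, and each extra directory level with a distinct short name doubles the count.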

I would suggest that the all-long-name version of the path be used as the
"official" path which would be returned when doing the reverse
(dentry->path) lookup, even for lookups done on paths including ugly
short-name components. This means that the same path, if looked up in
reverse, might not have a dentry yet (unless the all-long-file-name dentry
is always created in addition). And d_add would have to support adding a
dentry for which the hash is taken of a different name from the one it is
adding. More complexity, but probably very important in the long run -- for
example, it is probably needed to support NTFS well in the future.
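
A toy model of this suggestion, for one directory level only. It is not the kernel dcache and none of these names exist there: `AliasCache`, `add` and `official_name` are invented for illustration, and case folding is reduced to `str.upper()`. This variant takes the option of always creating the long-name entry, so the reverse lookup always has something to return.

```python
class AliasCache:
    """Toy model: several spellings of a name share one cached entry."""

    def __init__(self):
        self._by_name = {}   # any spelling, folded to upper case -> entry

    def add(self, looked_up_name, official_name):
        """Cache looked_up_name as an alias of the entry for official_name.

        The entry is created under the official (long) name if it is not
        cached yet, then also indexed under the name actually looked up.
        """
        entry = self._by_name.setdefault(
            official_name.upper(), {"name": official_name}
        )
        self._by_name[looked_up_name.upper()] = entry
        return entry

    def lookup(self, name):
        return self._by_name.get(name.upper())

    def official_name(self, name):
        """Reverse lookup: always report the long name, whatever was asked."""
        entry = self.lookup(name)
        return entry["name"] if entry else None

cache = AliasCache()
cache.add("longfi~1.exe", "LongFileName.exe")
```

After the `add` call, both `longfi~1.exe` and `LongFileName.exe` resolve to the same entry, and `official_name` reports the long spelling for either.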

David
Niemi@tux.org 703-810-5538 Reston, Virginia, USA
"Down that path lies madness. On the other hand, the road to
hell is paved with melting snowballs." -- Larry Wall, 1992