Re: hfs support for blocksize != 512

From: Alexander Viro (viro@math.psu.edu)
Date: Tue Aug 29 2000 - 23:04:40 EST


On Wed, 30 Aug 2000, Roman Zippel wrote:

> > > hfs. For example reading from a file might require a read from a btree
> > > file (extent file), with what another file write can be busy with (e.g.
> > > reordering the btree nodes).
> >
> > And?
>
> The point is: the thing I like about Linux is its simple interfaces, it's
> the basic idea of unix - keep it simple. That is true for most parts - the
> basic idea is simple and the real complexity is hidden behind it. But
> that's currently not true for vfs interface, a fs maintainer has to fight
> right now with fscking complex vfs interface and with a possible fscking

Yes? And it will become simpler if you will put each and every locking
scheme into the API?

Look: we have Hans with his trees-all-over-the-place + journal. He has a
very legitimate need to protect the internal data structures of Reiserfs
and do it without changing the VFS<->reiserfs interaction whenever he
decides to change purely internal structures.

We have ext2 with indirect blocks, inode bitmaps and block bitmaps, one
per cylinder group + counters in each cylinder group. Should VFS know
about the internal locking rules? Should it be aware of the fact that
inodes foo and bar belong to the same cylinder group and if we remove them
we will need to protect the bitmap for a while?

We have FAT32 where we've got a nasty allocation data with rather
interesting locking rules. Should it be protected by VFS? If it should -
well, I have bad news for you: write() on a file will lock the whole
filesystem until write() completes. Don't like it for every fs? Tough, it
will mean that VFS will not protect the thing and fs will have to do it
itself.

We have AFFS with totally fscked directory structures. Do you propose to
make unlink() block all directory operations on the whole fs? No? Too
bad, because only AFFS knows enough to protect its data structures without
_that_ locking. Sorry, the only rule that would not require the knowledge
of layout and would be strong enough to protect is "no directory access
while unlink() is in progress". Yup, on the whole fs. Hardly acceptable
even for one filesystem, but try to impose that on everyone and see how
long you will survive. JPEGs of the murder scene would be appreciated,
BTW.

We have HFS with the data structures of its own. You want locking in VFS
that would protect the things VFS doesn't know about and has no business
to meddle with? Fine, post the locking rules.

It's insane - protection of purely internal data structures belongs to the
module that knows about them. Generic stuff can, should be and _is_
protected. Private one _can't_ be protected without either horribly
crippled system (see above) or putting the knowledge of each data
structure into the generic layer. And the latter will be on the author of
filesystem anyway, because only he knows what rules he need.

Please, propose your magical locking scheme that will protect everything
on every fs. And let maintainers of filesystems tell you whether it is
sufficient. Then check what's left after that locking - e.g. can two
processes access the same fs at the same time or not?

If you are complaining about the fact that maintaining complex data
structures in multithreaded program (which kernel is) may be, well,
complex - welcome to reality. It had been that way since the very
beginning on _all_ MT projects, Linux included. You have complex private
data - you may be in for pain protecting yourself from races. Protection
of the public structures is there, so life became easier than it used to
be back in 2.0/2.1/2.2 days.

Making VFS single-threaded will not fly. If you can show simpler MT one -
do it and a lot of people will be extremely grateful. 4.4BSD and SunOS
ones are more complex and make the life harder for filesystem writers.
Check yourself. OSF/1 is _much_ more complex. Hell knows what NT has, but
filesystem glue there looks absolutely horrible - compared to them we are
angels in that respect. v7 was simpler, sure enough. Without mmap(),
rename() and truncate() _and_ with only one fs type - why not? Too bad
that it was racey as hell... Plan 9 is nice and easy. Without mmap(),
without link(), without truncate(), without cross-directory rename() and
without support of crazy abortions from hell a-la AFFS. 2.0 and 2.2 are
_way_ more complex, just compare filesystem code size in 2.4 with them and
you will see. And yes, races in question are not new. I can reproduce them
on 2.0.9 box. Single-processor one - nothing fancy and SMP-related.

If you have a way to simplify VFS and/or filesystems - by all means,
post it on fsdevel/l-k. Just tell what locking warranties you provide.
Current ones are documented in the tree, so it will be very easy to
compare. I'm not saying that they are ideal (check the documentation in
question - I'm saying the opposite in quite a few cases). They _can_ be
made better. But if you are saying that you know how to protect purely
internal data structures without losing MT VFS - sorry, I will believe it
when I see it. Go ahead, prove me wrong.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Aug 31 2000 - 21:00:25 EST