Re: devfs "instant replay" (Was: devfs - why not ?)

From: Johannes Erdfelt (jerdfelt@sventech.com)
Date: Fri Apr 14 2000 - 15:46:38 EST


On Fri, Apr 14, 2000, Ricky Beam <jfbeam@bluetopia.net> wrote:
>
> Because the umpire is wrong. I'm calling for an "instant replay."
>
> devfs fixes some things and introduces a bunch of new problems. We need
> a clear enumeration of what's wrong with the existing device name space
> so that a clean, clear, and proper solution can be agreed upon. A little
> PLANNING is a Good Thing. Spend some time thinking about the problem before
> laying down code to make the problem, on the whole, worse. This is very
> much the "jogging juggler" problem -- the solution is to learn to juggle
> standing on the edge of a cliff.
>
> There's a microscopic problem and a macroscopic problem. On the micro
> scale, devfs deleted the static array of pointers to tables of pointers
> to functions and maps the virtual file directly to the table of functions.
> And it does so dynamically. So, in the case of a serial port, /dev/ttyS0
> becomes /dev/ttys/0. The major and minor numbers no longer matter (or
> even need to exist) and the previous static lookup tables are no more.
> It's a faster method to reach the underlying serial.o working code. And
> it uses kernel space less wastefully.

As hpa so nicely pointed out to me, inode numbers and major/minor
numbers are essentially the same thing.

However, the big difference is that you centralize the resolution of those
numbers into the devfs core, instead of requiring each and every
subsystem to determine which minor number goes to which device.
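
A minimal sketch of what that centralization could look like, assuming a
flat table keyed by device name. Nothing here is taken from the devfs
source: dev_ops, dev_register(), dev_lookup(), dev_unregister() and
MAX_DEVICES are all hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a driver's table of operations. */
struct dev_ops {
    int (*open)(void);
};

/* One entry in the central table: a name mapped straight to its ops. */
struct dev_entry {
    const char *name;            /* caller keeps this string alive */
    const struct dev_ops *ops;
};

#define MAX_DEVICES 64           /* arbitrary limit for the sketch */

static struct dev_entry registry[MAX_DEVICES];
static size_t registry_len;

/* Resolve a name to its operations, or NULL if nothing is registered. */
const struct dev_ops *dev_lookup(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < registry_len; i++)
        if (strcmp(registry[i].name, name) == 0)
            return registry[i].ops;
    return NULL;
}

/* Add a device when it appears. Returns 0, or -1 if the name is
 * already taken or the table is full. */
int dev_register(const char *name, const struct dev_ops *ops)
{
    assert(name != NULL && ops != NULL);
    if (registry_len == MAX_DEVICES || dev_lookup(name) != NULL)
        return -1;
    registry[registry_len].name = name;
    registry[registry_len].ops = ops;
    registry_len++;
    return 0;
}

/* Drop a device when it is unplugged. Returns 0, or -1 if not found. */
int dev_unregister(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < registry_len; i++) {
        if (strcmp(registry[i].name, name) == 0) {
            registry[i] = registry[--registry_len];  /* fill the hole */
            return 0;
        }
    }
    return -1;
}
```

The point of the sketch is only that a subsystem hands over a name and an
operations table once, and never has to carve up a minor number range
itself. A real implementation would need locking and a proper tree.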

> BUT, on the macro scale, it's not so clean. It doesn't scale very well.
> Previously, you'd be wasting a large number of on disk inodes. Now, you
> waste a medium number of virtual inodes in non-swappable kernel memory.
> You have the same hash overhead in directory lookups (ok, devfs is faster as
> all of it is always in memory.) Now that everything is in core, it's all
> volatile and you have to resort to userland tricks (you actually suggest
> TAR!), kernel event messaging, and overlay filesystems to get any
> persistence or user control. [Can you say "kludge"? I knew that you could.]

Devices are unfortunately becoming more and more PnP and Hot Swap. Having
a persistent filesystem-based backing store (like the current ext2 and
major/minor system) isn't useful anymore.

We need to dynamically assign permissions and ownerships to devices
anyway, so having that information saved in the filesystem buys us
nothing.

Having devfsd store these in a database as well as using arbitrary
algorithms, depending on the device, to assign permissions and ownership
is much more flexible and actually works with PnP and Hot Swap devices.
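
To illustrate the idea, here is a rough sketch of rule-based permission
assignment at the moment a device shows up. This is not devfsd's actual
configuration format or code: the perm_rule structure, the prefixes, the
modes and the uid/gid values are all invented for the example, and
first-match-by-prefix is just one of many possible policies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: devices whose name starts with `prefix` get
 * these permissions and this ownership. */
struct perm_rule {
    const char *prefix;
    unsigned mode;
    unsigned uid;
    unsigned gid;
};

/* Example rule set; every value here is made up for illustration. */
static const struct perm_rule rules[] = {
    { "usb/",  0660, 0, 100 },
    { "ttys/", 0620, 0, 5   },
};

/* Fallback for devices that no rule matches: root only. */
static const struct perm_rule default_rule = { "", 0600, 0, 0 };

/* Return the first rule whose prefix matches the device name,
 * or the default rule if none does. */
const struct perm_rule *perm_for_device(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = strlen(rules[i].prefix);
        if (strncmp(name, rules[i].prefix, n) == 0)
            return &rules[i];
    }
    return &default_rule;
}
```

Because the decision is made when the device appears, a device that was
never plugged in before still gets sensible permissions without anyone
having created a node for it on disk first.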

And the memory overhead is a non-issue. We track devices in memory
anyway, so adding another structure is a minimal cost for all of the
benefits it provides. Plus it actually allows us to use PnP and Hot
Swap devices at all.

> Correct me if I'm wrong, but the multiple indirections to the operations
> structure only occur with open() and the function addresses are attached
> to the file descriptor from that point on. So, what's the point in such an
> expensive optimization? So the directory lookup returns a pointer into kernel
> space instead of static major/minor numbers? There are far better ways to
> get rid of major/minor numbers.

I think this is what everyone is waiting for. This isn't the first time
I've heard someone say there is a better solution, and this isn't the
first time I haven't seen the solution to back it up.

> [Tell me again why we don't want major and minor numbers.]

a) There aren't enough of them and it's a pain in the ass to increase the
   number. It still hasn't happened. We still use an 8/8 split.
b) You centralize resolution of device name to device into the devfs
   core. Instead of requiring each and every subsystem to map minor
   numbers to devices, you have one piece of code which does that. The
   per-subsystem mapping becomes much more of a problem when you have a
   large major/minor space, since the convenient array of 256 pointers
   doesn't work so well anymore.
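
For concreteness, a small sketch of the 8/8 split and of the kind of
per-subsystem minor table it encourages. The helper names (make_dev,
dev_major, dev_minor, serial_bind, serial_lookup) are made up for this
example and are not the kernel's own macros or functions.

```c
#include <assert.h>
#include <stddef.h>

#define MINOR_BITS  8
#define MINOR_COUNT (1u << MINOR_BITS)   /* 256 minors per major */

/* Pack an 8-bit major and an 8-bit minor into one device number. */
unsigned make_dev(unsigned major, unsigned minor)
{
    assert(major < 256 && minor < MINOR_COUNT);
    return (major << MINOR_BITS) | minor;
}

/* Split a device number back into its two halves. */
unsigned dev_major(unsigned dev) { return (dev >> MINOR_BITS) & 0xffu; }
unsigned dev_minor(unsigned dev) { return dev & (MINOR_COUNT - 1); }

/* The "convenient array of 256 pointers": one slot per minor, owned
 * and managed by a single (here: hypothetical serial) subsystem. */
static void *serial_ports[MINOR_COUNT];

/* Attach a port object to a minor number. */
void serial_bind(unsigned minor, void *port)
{
    assert(minor < MINOR_COUNT);
    serial_ports[minor] = port;
}

/* Map a device number to its port, or NULL if the minor is unused. */
void *serial_lookup(unsigned dev)
{
    return serial_ports[dev_minor(dev)];
}
```

With 8 bits the table is 256 pointers per subsystem, which is tolerable.
Widen the minor to, say, 20 bits and the same flat array needs about a
million slots, so every subsystem would have to grow its own sparse
lookup structure instead.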

> Umm, would someone please enumerate _exactly_ what devfs brings to the table,
> what we're trying to "fix" and why?

devfs kills major/minor once and for all. It's a hindrance to Linux.

I'm open to other suggestions, as long as they solve all of the problems
people have with the existing major/minor pairs solution (the
traditional Unix solution).

However, no one has offered up an alternative that solves all of the
existing problems with major/minor pairs and works with all of the
up-and-coming subsystems.

Until then, I'm still advocating devfs since it solves the problems I am
having with USB specifically, with minimal overhead and has an
implementation which works, right now.

JE
