Re: Kernel interface changes (was Re: cdrecord problems on recent Linux versions)

Derek Atkins (warlord@MIT.EDU)
03 Feb 1999 17:51:58 -0500


Without quoting Monty, I want to detail how I've had similar problems
over time: maintaining Linux-AFS. I also have some suggestions for
ways to fix this problem later in this message.

<BACKGROUND>
For those who do not know, AFS is the Andrew File System, a
distributed file system developed at CMU, which MIT currently uses for
all Unix file service. (As a side note, CMU's Coda filesystem is a
spin-off from AFS from a decade ago -- today they look nothing alike,
but the major design is similar).

Back in 1994 I had the unenviable job of porting AFS to Linux. This
entailed working with Linus and the Linux 1.1 kernel tree to get
enough support into the core Linux kernel in order to load the AFS
module. It was a lot of fun, and I think I had an impact on some
parts of the Linux kernel, in particular parts of the VFS interface.
</BACKGROUND>

The major problem with AFS is that it is copyrighted, commercial
software. This means that the sources are only available under a strict
license agreement and cannot be distributed. MIT happens to be a
source licensee, so I was able to implement the port. However, the
_ONLY_ way to distribute AFS is via binary objects. Fine, not a
problem -- Linux has this great loadable kernel module system. No
Problem, right? I just build a module, distribute it, and everyone
can use it, right? WRONG.

Oh, it was great in the early days. Remember the 1.2 stable release?
Linux 1.2 was completely stable; the modules I built against 1.2.2
still worked against 1.2.13. Ahh, the good old days.

Next came the 1.3 kernels. I'm not going to complain about the Linux
1.3 kernels -- I understand that they are development kernels. Things
are moving, things are changing. No problem.

Then Linux 2.0 was released. Unfortunately Linux 2.0 was not nearly
as "stable" as Linux 1.2. During the tenure of the 2.0 kernels, there
were many changes during the 'patch-level' releases that caused binary
incompatibilities. I remember one change, I think it was somewhere
around Linux 2.0.7, where the incompatible change was reordering the
members of a structure! Think about it: binary incompatibilities due
to REORDERING A STRUCTURE! Why was this done, you may ask? I did. I
was told it was a cache speed improvement. Gee, thanks. Break all
binary compatibility for some potential small speed improvements? In
a supposedly stable release? Why couldn't this wait?
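
To illustrate why a pure reordering is enough to break a binary-only
module, here is a minimal sketch in C. The structures, member names, and
the module_read_ino() helper are hypothetical stand-ins, not the actual
kernel structure that changed. The point is only that a module compiled
against the old header has the old member offset baked into its object
code, so it keeps reading from the old position after the members move.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout as seen by a module built against release A. */
struct demo_inode_v1 {
    unsigned long  i_ino;
    unsigned short i_mode;
    unsigned int   i_count;
};

/* Same members, reordered in release B (say, for cache behaviour). */
struct demo_inode_v2 {
    unsigned int   i_count;
    unsigned long  i_ino;
    unsigned short i_mode;
};

/*
 * Stands in for code inside a binary-only module compiled against the
 * v1 header: the offset of i_ino is fixed at compile time and is not
 * re-evaluated when the running kernel uses a different layout.
 */
unsigned long module_read_ino(const void *obj)
{
    unsigned long ino;

    assert(obj != NULL);
    memcpy(&ino,
           (const char *)obj + offsetof(struct demo_inode_v1, i_ino),
           sizeof ino);
    return ino;
}
```

Handed a demo_inode_v1, module_read_ino() returns the real i_ino. Handed
a demo_inode_v2, it reads whatever now sits at the old offset (here,
i_count and any padding), and nothing at compile time or load time
warns anyone about it.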

Put yourself in my place. What a nightmare! I had to Recompile,
Test, Verify, and SHIP a new kernel module for practically EVERY
KERNEL RELEASE during a supposedly 'stable' release cycle. Yea, so
what? I shouldn't be shipping a binary module, right? That wasn't my
choice. Sure, I could go with another vendor, right? No, there was
no other vendor of AFS, so I was forced to use their product to access
my files (as does everyone else in the world who uses AFS, and let me
tell you there are a LOT of them!) So, what about writing my own free
replacement? Umm, well, if you've never looked at AFS sources....
Let's just say that at the time it wasn't practical (things have
changed now, five years later) -- and besides, I was already tainted
(oh right, that!)

So, what about other options? How about choosing one kernel? Well,
Slackware was using kernel X, RedHat was using kernel Y, Debian was
using Kernel Z.... And, of course, all of them were mutually
incompatible as far as my loadable kernel module went. So, which one do I
use/support?

Don't even get me started down the road of libc, genksyms, or other
user-space incompatibilities which caused other problems along the
way!

So, here I sit, frustrated with Linux and its empty promise of a
stable kernel release.

At some level I'm still a strong fan of Linux, but I must admit that
my attention is waning. I'm getting more and more frustrated that
people are willing to make 'tweaks' with no regard for the
consequences of their changes. I'm getting frustrated at kernel
developers who don't understand the term 'stable release' (most seem
to think it only means 'the kernel doesn't crash'). I'm getting
frustrated with all the support questions I receive because someone
built with patch A and configuration B which is incompatible with
patch B and configuration A.

Like it or not, Linux is now a "commercial" OS. What I mean is that
Linux has been accepted by the commercial industry, and people in that
industry demand stability. Companies are going to want to build their
applications for Linux, but they will absolutely refuse to join the
'build-of-the-week' club. They just want to build their product, test
it, ship it, and support it -- and not have to repeat the cycle for
another 6 months.

They demand stability. When Linux is in a stable release, it MUST
remain truly stable. No longer can we change interfaces at a whim.
No longer can we switch the order of structure members for 'speed
improvements.' No longer can we be lax about our release engineering.
During a stable release, the only changes that should go into the
kernel are bug fixes (e.g., if you do foo the kernel crashes) or
potentially security fixes (e.g., if you do foo, the user gains root
privs).

Other changes (speed enhancements, additional features, new support)
all should, no _must_, happen in the next development cycle and wait
for the next stable release, regardless of whose changes they are.

But the next stable release is two years away?!? Well, maybe that
should change. Maybe Linux should choose to operate with a real
Software Development Process? Maybe features should only be
integrated when they have been finished and tested? If a feature is
only half complete, so be it -- it can wait (if it were more
important, it would have been done ahead of other features, right?)
Maybe we should try to get a stable release out every 6 months instead
of every 24? Maybe we should have real code-freezes, where the only
changes that go in affect the runability of the kernel (does it
crash)? Maybe we should listen to our customers (yes, Linux has
customers!) and let Linux go.

The time has come for Linux to grow up. The time has come for Linux
to leave the playground and join the office crowd. It is certainly a
sad day for some, but I think every single person on this list wants
Linux to succeed. I know I want Linux to be a real thorn to the Evil
Empire in the Pacific Northwest. But to do that Linux must grow up;
Linux must act like a commercial OS; Linux must be STABLE.

Anyways, I've taken enough bandwidth as it is. I just want you to sit
back for a moment and think about where you want Linux to go. Do you
want it to remain a hacker-only system, or do you want Linux to be the
system-of-choice for everyone, competing with The Big Boys? Do you
want more software support for Linux, or do you think your grandmother
would be satisfied using 'ed' to compose email (ok, maybe emacs or vi
;)? Do you want more people demanding support for Linux? Do you want
to see Linux accepted by more people in more industries?

As I'm sure you are aware, people will be turned off by things that
don't work. Just think about people's frustrations with M$ Windows!
Currently, this model of changing kernel interfaces within supposedly
stable releases DOESN'T WORK. It must change, or Linux will
inevitably lose.

I hope you can see the necessity of true stability.

Thanks,

-derek

PS: If anyone is interested in Software Development Processes, I'd
be happy to supply some input.

-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/      PP-ASEL      N1NWH
       warlord@MIT.EDU                        PGP key available
