Re: devfs persistence

From: benr@us.ibm.com
Date: Thu May 04 2000 - 18:56:26 EST


Hello! I'm a little late in joining this thread as I was not originally
following it, but since it has been pointed out to me I think I might have
a few things to contribute.

>"portable"?
>You are not referring to DOS-type partition tables, are you?
>Nobody knows what they are.
>DOS, Windows, Windows NT, OS/2, Solaris and Linux each interpret
>things in a slightly different way.

Well, I don't know about Solaris, and I won't presume to speak for Linux,
but DOS, Windoze, NT, and OS/2 all interpret partition tables in the same
way. The format and structure of these partition tables are well known. (I
wrote all of the disk partitioning code used in the OS/2 Logical Volume
Management System. This code has been in use in-house for about 2 years
and in the field for over a year, and there have been no reported
compatibility issues. I guess this means I got at least part of it right!
;) )
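
For readers who have not looked at one, here is a minimal sketch of reading
a DOS-type partition table sector. It relies only on the commonly documented
layout (four 16-byte entries at byte offset 446, signature bytes 0x55 0xAA at
offset 510). The function name and the dictionary keys are made up for
illustration and are not taken from the OS/2 code.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # first partition entry starts here
ENTRY_SIZE = 16           # four entries of 16 bytes each
SIGNATURE = b"\x55\xaa"   # last two bytes of the sector


def parse_partition_table(sector):
    """Return the non-empty entries of a DOS-type partition table sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != SIGNATURE:
        raise ValueError("not a valid partition table sector")
    entries = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # Fields: boot flag, CHS start (3 bytes), type, CHS end (3 bytes),
        # starting LBA, sector count. All integers are little-endian.
        boot, _chs_start, ptype, _chs_end, lba_start, count = struct.unpack(
            "<B3sB3sII", raw
        )
        if ptype == 0:
            continue  # type 0 marks an unused slot
        entries.append({
            "index": i,
            "bootable": boot == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": count,
        })
    return entries
```

The sketch deliberately ignores the CHS fields and does not follow extended
partition chains; it only shows the fixed layout of one table.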

>[We want to have a disk label / volume label / UUID / wwn
>for disk and partitions. If we write some data on the disk
>itself then either we must use a partitioning scheme that
>has room for such things, e.g., a Linux-type partition table,
>or we must use some "unused" sectors on the disk (terrible),
>or we must write the information inside some partition
>(that has this and possibly also other information) -
>inconvenient: this is not the right "level" for such
>information, and partitions are a scarce resource.]

When implementing a Logical Volume Management System (LVMS) on OS/2, we
were faced with the same basic problem. This, in a nutshell, is what we
did to make it work. Of course, your mileage may vary! ;)

OS/2 allows each disk, partition, and volume to have a user defined name
(currently the names are limited to 20 characters due to OS/2 specific
issues). OS/2 also assigns each disk, partition, and volume a unique
numeric ID, similar to a serial number. Furthermore, OS/2 keeps, on each
disk in the system, the serial number assigned to the disk the system
booted from. This is used to resolve some issues when drives are removed
from one system and added to another. When the OS/2 LVMS is reconstructing
volumes during boot, the serial numbers assigned to the drives, partitions,
and volumes are used, thereby eliminating any dependence upon the physical
path to the device/partition. Finally, OS/2 stores the data required to
recreate a volume within the partitions that comprise the volume, not in
any centralized configuration file or database. Let me give you some
background so that all of this will hopefully become clearer.
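
To make the serial-number idea concrete, here is a rough sketch of grouping
discovered partitions into volumes by their IDs alone. The record fields and
function name are hypothetical; the real OS/2 on-disk records are not
described in this post. The point is only that the device path plays no part
in the result.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionRecord:
    device_path: str        # where the partition happened to be found this boot
    partition_serial: int   # system-assigned partition ID
    volume_serial: int      # system-assigned ID of the owning volume
    volume_name: str        # user-assigned volume name


def reconstruct_volumes(found):
    """Group partitions by volume serial number, ignoring device paths."""
    volumes = defaultdict(list)
    for rec in found:
        volumes[rec.volume_serial].append(rec)
    # Order members by partition serial so the result does not depend on
    # the order in which the devices were discovered.
    return {
        serial: sorted(members, key=lambda r: r.partition_serial)
        for serial, members in volumes.items()
    }
```

If a drive moves to a different controller, its records show up under a
different device_path, but the grouping by volume_serial stays the same.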

IBM mandates that OS/2 be able to coexist peacefully with other operating
systems on a single machine. Originally, OS/2 would attempt to mount any
partition it found which was not formatted, or which contained a filesystem
that it thought it recognized. This made it easy to share a partition
between OS/2 and DOS as both OS/2 and DOS can use the FAT filesystem. When
it was decided to create an LVMS for OS/2, the OS/2 kernel was modified so
that it would only recognize volumes created by the LVMS. Since other
operating systems would not recognize volumes created by the LVMS, it was
decided that the LVMS would need to recognize two types of volumes:
compatibility volumes and OS/2 volumes. Compatibility volumes would
consist of a single partition, and would have to be usable by other
operating systems. Compatibility volumes would also have to have all of
the basic traits of a volume, including a user assigned volume name and a
system assigned numeric ID. Furthermore, the partition belonging to the
volume would have to have a user assigned partition name and a system
assigned numeric ID. Because other operating systems must be able to use
the partition as-is, this data could not be stored inside the partition, so
it had to be stored elsewhere. OS/2 stores this data
(and more) in a Drive Letter Assignment Table (DLAT). A DLAT has four
entries, and there will be one DLAT for each partition table on the disk
(MBR or EBR). The DLAT for a specific partition table is stored in the
last sector of the track containing that partition table (the partition
table itself is in the first sector of that track), and provides the name
and numeric ID data (and more) for the partitions defined in that partition
table. Having one
DLAT per partition table ensures that, if a partition can be created, it
will have a DLAT entry. Using the last sector of the track containing the
partition table is safe as the rules of partitioning in the DOS, Windoze,
NT, OS/2 world prevent a partition from starting on anything other than a
cylinder or track boundary.
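
The placement rule above can be sketched as a small calculation. This
assumes linear sector numbering (LBA) in which sector 0 is the first sector
of the first track, so the first sector of any track has an LBA that is a
multiple of the sectors-per-track value. The function name is hypothetical.

```python
def dlat_sector(table_lba, sectors_per_track):
    """Return the LBA of the DLAT for the partition table at table_lba."""
    if sectors_per_track < 2:
        raise ValueError("a track needs room for both the table and the DLAT")
    if table_lba % sectors_per_track != 0:
        raise ValueError("partition table is expected in the first sector of a track")
    # The DLAT sits in the last sector of the same track.
    return table_lba + sectors_per_track - 1
```

With an assumed geometry of 63 sectors per track, dlat_sector(0, 63) gives
62 for the MBR. Since no partition may start before the next track boundary
(sector 63 in this case), sector 62 is never part of a partition.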

OS/2 volumes are a different beast than compatibility volumes. OS/2 does
not have the concept of volume groups. Usability studies conducted with
our customers found that volume groups were not clearly understood by the
typical user, and that users found them difficult and confusing to use. As
a result, an entirely new LVMS architecture was designed. This
architecture is partition based, and has no volume groups. The OS/2 LVMS
is based upon a subset of this architecture. (IBM is considering
standardizing on this architecture.) Anyway, getting back to OS/2 volumes:
they consist of one or more partitions. They still have DLAT
entries for each partition in the volume, but the actual LVM data
(including what partitions are part of a volume) is stored at the end of
each partition in the volume in what we call the LVM Data Area. A copy of
the volume's DLAT entry is also kept in the LVM Data Area for redundancy
purposes. The reason that the LVM Data is kept at the end of the partition
instead of the beginning of the partition is that the amount of data stored
by the OS/2 LVMS is related to the size of the partition. Since resizing
of partitions was a feature under consideration at the time, there existed
the possibility that the LVM might need to change the size of the LVM Data
Area to accommodate a change in the size of the partition. Placing the LVM
Data Area at the end of the partition allowed us to do this.
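
A small sketch of why the end of the partition is the convenient place. The
size of the LVM Data Area is passed in as a parameter because the actual
sizing rule is not given here; the function name and the numbers in the note
below are purely illustrative.

```python
def partition_layout(partition_start, partition_sectors, data_area_sectors):
    """Return (usable_start, usable_sectors, data_area_start) in sectors."""
    if not 0 < data_area_sectors < partition_sectors:
        raise ValueError("data area must be smaller than the partition")
    # The data area occupies the last sectors of the partition, so the
    # usable region always begins at the partition's first sector.
    data_area_start = partition_start + partition_sectors - data_area_sectors
    usable_sectors = partition_sectors - data_area_sectors
    return partition_start, usable_sectors, data_area_start
```

Growing a partition from 1000 to 2000 sectors while the data area grows from
8 to 16 sectors moves data_area_start, but usable_start stays where it was,
so whatever lives at the front of the partition does not have to move.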

This post has gotten quite long, and I apologize for that. I hope that I
have given you enough information so that you can see how OS/2 accomplishes
what you want. If there is any interest, I can provide more details and
specifics next week. Also, IBM has decided to release to the Linux
community its new Logical Volume Management System Architecture. IBM is
doing this in the hope that it will, in some way, be useful to the Linux
Community. Heinz Mauelshagen (Linux LVM), and those he designates, will be
receiving for review a white paper describing the IBM LVMS by the end of
next week. When Heinz is through with it, the white paper will be released
to the Linux Community at large.

Regards,

Ben Rafanello
benr@us.ibm.com

Andries Brouwer <aeb@veritas.com>@vger.rutgers.edu on 05/03/2000 06:25:24 pm

Sent by: owner-linux-kernel@vger.rutgers.edu

To: "Stephen C. Tweedie" <sct@redhat.com>
cc: Stephen Harris <sweh@spuddy.mew.co.uk>,
      linux-kernel@vger.rutgers.edu, rgooch@ras.ucalgary.ca
Subject: Re: devfs persistence

On Wed, May 03, 2000 at 10:35:18PM +0100, Stephen C. Tweedie wrote:

> We don't need to discard the
> existing, portable partitioning mechanisms in order to achieve this.

"portable"?
You are not referring to DOS-type partition tables, are you?
Nobody knows what they are.
DOS, Windows, Windows NT, OS/2, Solaris and Linux each interpret
things in a slightly different way.
It would be extremely desirable to throw them out and replace them
by something that is well-defined.

> The LVM superblock is exactly what you are after, isn't it? You
> place one of those on a DOS partition, and then ...

We are still not reaching each other.
Let me reiterate.

Stephen: There is no place to put an UUID.

Andries: Use a Linux-type partition table.

Stephen: That is not a DOS-type partition table.

Andries: Good riddance!

[We want to have a disk label / volume label / UUID / wwn
for disk and partitions. If we write some data on the disk
itself then either we must use a partitioning scheme that
has room for such things, e.g., a Linux-type partition table,
or we must use some "unused" sectors on the disk (terrible),
or we must write the information inside some partition
(that has this and possibly also other information) -
inconvenient: this is not the right "level" for such
information, and partitions are a scarce resource.]

Andries

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/




This archive was generated by hypermail 2b29 : Sun May 07 2000 - 21:00:16 EST