Re: Linux/IA-64 byte order

Wayne Schlitt (wayne@midwestcs.com)
10 Mar 1999 09:30:34 -0600


In <19990309114731.B19807@anjala.mit.edu> Arvind Sankar <arvinds@mit.edu> writes:
>
> duh? You're saying that big-endian is preferable because it is easier for
> humans to read?

Yes. Other than backwards compatibility, BE being easier to read is
the only "significant" practical difference between BE and LE. Being
easier to read is not nearly significant enough to outweigh backwards
compatibility.

> Consider a struct in some server type thing that used to have two fields
> u32 foo;
> u32 reserved;

For BE, you have to swap the reserved and foo around, and for all
practical purposes, you have the same advantages.
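
A minimal sketch of that point, assuming the quoted example (which is
cut off above) was heading toward "foo later grows to 64 bits". The
helper names (put_le32, widen_be, and so on) are made up for
illustration, and the byte order is emulated with explicit byte arrays
so the result does not depend on the host CPU:

```c
#include <assert.h>   /* only needed for checks like those in the note below */
#include <stdint.h>

/* Write a 32-bit value into 4 bytes, least significant byte first (LE). */
static void put_le32(unsigned char *b, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        b[i] = (unsigned char)(v >> (8 * i));
}

/* Write a 32-bit value into 4 bytes, most significant byte first (BE). */
static void put_be32(unsigned char *b, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        b[3 - i] = (unsigned char)(v >> (8 * i));
}

/* Read 8 bytes as one little-endian 64-bit value. */
static uint64_t get_le64(const unsigned char *b)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)b[i] << (8 * i);
    return v;
}

/* Read 8 bytes as one big-endian 64-bit value. */
static uint64_t get_be64(const unsigned char *b)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | b[i];
    return v;
}

/* LE record { foo; reserved }: foo at offset 0, zeroed reserved at 4. */
static uint64_t widen_le(uint32_t foo)
{
    unsigned char rec[8] = {0};
    put_le32(rec + 0, foo);
    return get_le64(rec);
}

/* BE record { reserved; foo }: zeroed reserved at offset 0, foo at 4. */
static uint64_t widen_be(uint32_t foo)
{
    unsigned char rec[8] = {0};
    put_be32(rec + 4, foo);
    return get_be64(rec);
}

/* BE record with the LE field order: foo ends up in the high half. */
static uint64_t widen_be_unswapped(uint32_t foo)
{
    unsigned char rec[8] = {0};
    put_be32(rec + 0, foo);
    return get_be64(rec);
}
```

With foo = 0x12345678, widen_le() and widen_be() both read back
0x12345678 as a 64-bit value, while widen_be_unswapped() reads back
0x1234567800000000. So the old record can be reinterpreted on either
byte order, as long as the reserved field sits on the correct side.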

While your particular example doesn't show any advantages of LE over
BE, it is possible to construct a case where LE does have an
advantage, but the advantage is of little practical use.

IF you have a pointer to an integer of "arbitrary" size,

and IF you have space allocated after that pointer for LE, or before
for BE,

and IF you make sure the extra space is always set to zero initially,

and IF, at run time, you can determine the size of that integer,

and IF, at run time, you want to be able to change the size of that
integer,

THEN LE can do this with SLIGHTLY less overhead than BE.

It comes down to code in some way equivalent to the following:

#if defined(LITTLE_ENDIAN)
    /* p points at the first byte, which is the least significant one;
       wider types just extend to higher addresses */
    void *p;

    switch (runtime_sizeof( p ))
    {
    case CHAR:
        *(char *)p += 5;
        break;

    case SHORT:
        *(short *)p += 5;
        break;

    case LONG:
        *(long *)p += 5;
        break;

    case LONGLONG:
        *(long long *)p += 5;
        break;
    }

#elif defined(BIG_ENDIAN)
    /* p points at the last byte, which is the least significant one;
       wider types extend to lower addresses, so back up to their
       first byte before the access */
    void *p;

    switch (runtime_sizeof( p ))
    {
    case CHAR:
        *(char *)p += 5;
        break;

    case SHORT:
        *(short *)((char *)p - (sizeof( short ) - sizeof( char ))) += 5;
        break;

    case LONG:
        *(long *)((char *)p - (sizeof( long ) - sizeof( char ))) += 5;
        break;

    case LONGLONG:
        *(long long *)((char *)p - (sizeof( long long ) - sizeof( char ))) += 5;
        break;
    }
#endif

Instead of using a switch statement, you could also use self-modifying
code and zap the opcodes of the add instructions to be the correct
type, but watch out for the case of "long long". You could also
modify "p" only through pointers to functions and change which
functions the pointers point to at run time. There are probably other
ways of doing this, but as I said above, they all have to be
equivalent to the above.

The problem is that the overhead of doing this is just horrendous.
You are better off just using the largest datatype. The BE format
merely requires one extra subtract of a constant, which isn't much
compared to the overhead.
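
To make the "one extra subtract" concrete, here is a rough,
host-independent sketch that works on a plain byte buffer instead of
casting pointers. The function names (first_byte, load_n, store_n,
add_n) and the "anchor" convention are assumptions for illustration:
the anchor is the offset of the least significant byte, which is the
first byte for LE and the last byte for BE.

```c
#include <assert.h>   /* only needed for checks like those in the note below */
#include <stddef.h>
#include <stdint.h>

/* Offset of the lowest-addressed byte of an n-byte integer whose least
 * significant byte is at buf[anchor]. */
static size_t first_byte(size_t anchor, size_t n, int big_endian)
{
    /* BE pays for one subtraction of a constant; LE pays nothing. */
    return big_endian ? anchor - (n - 1) : anchor;
}

/* Load the n-byte integer (n <= 8) anchored at buf[anchor]. */
static uint64_t load_n(const unsigned char *buf, size_t anchor, size_t n,
                       int big_endian)
{
    size_t start = first_byte(anchor, n, big_endian);
    uint64_t v = 0;
    for (size_t i = 0; i < n; i++) {
        /* BE: first byte is most significant; LE: first byte is least. */
        size_t shift = big_endian ? 8 * (n - 1 - i) : 8 * i;
        v |= (uint64_t)buf[start + i] << shift;
    }
    return v;
}

/* Store the low n bytes of v into the integer anchored at buf[anchor]. */
static void store_n(unsigned char *buf, size_t anchor, size_t n,
                    int big_endian, uint64_t v)
{
    size_t start = first_byte(anchor, n, big_endian);
    for (size_t i = 0; i < n; i++) {
        size_t shift = big_endian ? 8 * (n - 1 - i) : 8 * i;
        buf[start + i] = (unsigned char)(v >> shift);
    }
}

/* Add delta to the integer at anchor, treating it as n bytes wide. */
static void add_n(unsigned char *buf, size_t anchor, size_t n,
                  int big_endian, uint64_t delta)
{
    store_n(buf, anchor, n, big_endian,
            load_n(buf, anchor, n, big_endian) + delta);
}
```

Starting from a zeroed 8-byte buffer holding the one-byte value 250 at
the anchor (offset 0 for LE, offset 7 for BE), add_n(buf, anchor, 2,
order, 10) leaves 260 in both cases: the carry goes into buf[1] for LE
and into buf[6] for BE. The width dispatch and the load/store loop
dominate the cost either way, and first_byte() is the only place the
two byte orders differ in work done.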

Linus is right (surprise, surprise), the only compelling reason to
choose between BE and LE is backwards compatibility. The only other
reason that is even on the radar is that BE is "easier to read in hex
dumps" and it is so far out there that few people care.

If I could wave a magic wand and convert all the world's computers (and
data) to either LE or BE, but I had no idea which one it would be, I
would certainly wave that wand. It would make life much easier. If I
could *choose* which outcome it would be, I would choose BE, but right
now the world seems to be going to LE and I don't care.

-wayne

-- 
Wayne Schlitt can not assert the truth of all statements in this
article and still be consistent.
