Re: get/put unaligned helpers

From: Maciej W. Rozycki
Date: Thu Feb 12 2009 - 09:40:13 EST


On Thu, 12 Feb 2009, Boaz Harrosh wrote:

> I was under the impression they need to be aligned because otherwise that means
> something is done wrong. Because the aligned version on lots of CPUs is one instruction
> where unaligned access is better, or must, be emulated (byte accessed).
>
> Assembly wise the two accesses are different and sometimes the compiler has no way to
> know, where the programmer can know for sure.
>
> But I like to be educated any day, please explain what to use when.

The compiler is always told by the programmer what alignment to expect --
the language, together with the platform ABI, defines the alignment of
each data type it may use. For example, for integer types the alignment
is usually equal to the size of the type. Casting a pointer and
subsequently accessing the data pointed at leads to undefined behaviour if
the cast increases the required alignment -- you'll get a warning from GCC
if you ask it for this class of warnings (-Wcast-align). If you want to
access this data anyway, you have to tell the compiler the data is not
correctly aligned.
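
As a rough illustration of the alignment requirement described above, here
is a minimal C sketch that checks whether an address meets the alignment
the compiler assumes for a 32-bit integer. The helper name is_aligned_u32
is made up for this example, and alignof comes from C11's <stdalign.h>;
with older compilers GCC's __alignof__ would have to be used instead.

```c
#include <stdalign.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from any existing header: returns
 * nonzero if p satisfies the alignment the compiler expects for a
 * uint32_t on this platform.
 */
static inline int is_aligned_u32(const void *p)
{
	return ((uintptr_t)p % alignof(uint32_t)) == 0;
}

/*
 * A cast such as
 *
 *	const uint32_t *w = (const uint32_t *)(buf + 1);
 *
 * raises the required alignment from that of char to that of uint32_t.
 * Dereferencing w is undefined unless is_aligned_u32(buf + 1) holds.
 */
```

On a strict-alignment target a check like this only tells you that the
plain cast is unsafe; it does not make the access itself any safer.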

You can use packed structs (or unions) to ask GCC to emit code sequences
suitable for accessing unaligned data entities of various sizes. For
example, on a MIPS processor, if a 32-bit-wide entity is read via a member
of a packed type, a sequence consisting of an LWL (load word left) and an
LWR (load word right) instruction will be generated to perform two
complementary bus read cycles, with byte enables set appropriately, to
fetch the two parts of an entity spanning a 32-bit alignment boundary.
Normally a single LW (load word) instruction would be used to fetch the
entity, but that instruction would trap if the address used to access it
was not aligned. On platforms where the hardware is capable of doing
unaligned accesses with no special treatment, the use of packed types
does not imply a change in the emitted code.
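
The packed-type approach described above can be sketched roughly as
follows. The names una_u32, load_u32_unaligned and store_u32_unaligned are
invented for this example and are not meant to be the kernel's actual
helpers; the may_alias attribute is my own addition so that the access
does not fall foul of GCC's strict aliasing rules when the underlying
object is a byte buffer.

```c
#include <stdint.h>

/*
 * Hypothetical wrapper type.  The packed attribute tells GCC that the
 * member may live at any byte address, so it has to emit an access
 * sequence that is safe for unaligned data (e.g. LWL/LWR on MIPS).
 * may_alias is an assumption of this sketch, see the note above.
 */
struct una_u32 {
	uint32_t x;
} __attribute__((packed, may_alias));

/* Hypothetical helper: read a 32-bit value from any address. */
static inline uint32_t load_u32_unaligned(const void *p)
{
	const struct una_u32 *ptr = (const struct una_u32 *)p;

	return ptr->x;
}

/* Hypothetical helper: write a 32-bit value to any address. */
static inline void store_u32_unaligned(uint32_t val, void *p)
{
	struct una_u32 *ptr = (struct una_u32 *)p;

	ptr->x = val;
}
```

A caller would then write store_u32_unaligned(val, buf + 1) rather than
casting buf + 1 to a uint32_t pointer, and GCC remains free to use a plain
load or store where the hardware copes with unaligned addresses.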

Of course letting GCC sort out unaligned accesses itself has a lot of
advantages. For example, other instructions may get scheduled between the
LWL and LWR mentioned above as GCC sees fit. The two have no special
scheduling requirements with respect to each other (the reverse is
actually the case -- care has been taken since the beginning of the
architecture so that making them adjacent to each other is permitted and
incurs no performance penalty), but the instructions which surround them
may have such requirements, and reordering them may improve performance.
Such accesses may also get merged or eliminated, like any others.

Linux seems to put a lot of infrastructure around unaligned accesses, but
in reality it is just a couple of lines of common code; I suppose it is
there more to make people aware of the issue (all the world is not x86)
than to invent something new. Also, Linux is a single program, so it
makes sense not to scatter duplicates around and keep local copies of the
same code for each platform/subsystem/driver/etc. You'd do the same with
any other program, but promoting the couple of lines to a system header
sounds like overkill to me.

Maciej