Struct alignment conflict between different compilers?

From: Olaf Titz (olaf@bigred.inka.de)
Date: Wed Mar 15 2000 - 08:30:48 EST


I'm tracking down a weird problem where CIPE leads the kernel to emit
corrupted Ethernet packets. Unfortunately tcpdump doesn't display
what's really on the wire in a completely raw format, but it looks
like some buffer or pointer has been shifted by 2 bytes: instead of

dddd dddd dddd ssss ssss ssss elel 4500 ilil idid fofo ....
--------------Eth----------------- --IP-----------------

the card seem to actually send
???? ???? ???? ssss ssss elel ???? ilil idid fofo ....

This happens only with data packets sent by the CIPE output routine,
which assembles an IP packet in an skbuff and hands them to ip_send()
just like the standard tunnel drivers do. Control packets (sent via
UDP sendmsg()) are unaffected.

I don't really understand how the data path finally gets to the card
driver, however it looks like somewhere in this process something
confuses the magical numbers "14 bytes" and "16 bytes" which are both
used in determining the hardware headers.

It also looks like this happens not always, but it could depend on
whether the hardware header cache(?) is involved.

This has never happened to me before. Now it happens with all kernels
2.2.12 through .15pre7, and with both known solid (except for a
routing cache leak) CIPE 1.3.0 and my latest development version.

The only suspicious point left is the compilation environment: the
kernel was compiled with gcc 2.95.2 on a K6 box for 486 target and the
CIPE module was compiled later on the same box with gcc 2.7.2.3, and
in addition the CIPE compilation flags by default miss both "-m486"
and "-DCPU=486". Could there by any chance be a structure alignment
conflict?

I'll try again to compile everything with the same compiler and
options if possible. What's really needed is a method to get _all_
compiler flags for compiling external modules identical to the kernel.
The bttv 0.7 method:
DIR=`pwd`; (cd $(KERNEL_LOCATION); $(MAKE) SUBDIRS=$$DIR modules)
does not completely work either because (a) it leaves stuff in the
kernel build and include directories and (b) it does not allow to
specify flags like -I/usr/local/include which would override the
kernel include path (this is needed for bttv with an externally
compiled i2c module - hopefully at least _this_ mess gets sorted out
with 2.4.)

Olaf

PS. Standard rant here: the whole mess had not been necessary if the
glibc folks didn't always break everything with every new release.
Right now it is impossible to compile for libc 6.0 on a libc 6.1
system, which is the reason I had to use a different compiler (which
compiles for libc 5; at least that still works). If only those people
would realize that it is not always possible to upgrade an entire site
of different machines at once and the need of co-existence of
different version is a fact of life.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Mar 15 2000 - 21:00:30 EST