RE: [PATCH] C undefined behavior fix

From: Bernard Dautrevaux (
Date: Thu Jan 03 2002 - 05:35:36 EST

> -----Original Message-----
> From: Paul Mackerras []
> Sent: Thursday, January 03, 2002 4:12 AM
> To: Joe Buck
> Cc:;;;
> Subject: Re: [PATCH] C undefined behavior fix
> Joe Buck writes:
> > There is already such a project under development: see
> >
> >
> >
> > This is a modification to gcc that implements pointers as triples.
> > While there is a performance penalty for doing this, it can
> completely
> > eliminate the problem of exploitable buffer overflows.
> However, programs
> > that violate the rules of ISO C by generating out-of-range
> pointers will
> > fail.
> What will it do if I cast a pointer to unsigned long? Or if I cast an
> unsigned long to a pointer? The kernel does both of these things, and
> in a lot of places.
> Part of my beef with what gcc-3 is doing is that I take a pointer,
> cast it to unsigned long, do something to it, cast it back to a
> pointer, and gcc _still_ thinks it's knows what I am doing. It
> doesn't.

So say it to him: the problem is that the C language gives a quite precise
meaning to the RELOC macro code: When writing RELOC("some string") you say
to the C compiler: look, I want a pointer *in the string* at offset
0xC0000000. The GCC does exactly that. Afterwards it use this for optimizing
strcpy, but you fooled it before and it may be fooled for other cases (e.g.
if you do strlen(RELOC("xxx")) or RELOC("0123456789ABCDEF")[x] in a
"print_hexadecimal" function...).

There is ways in GCC to be sure the compiler forgets what it knows about a
value and use it as you provide it; one of the cleaner is using assembly
language. Note that someone proposed the use of a no-operation asm
instruction that just pretend to change the value returned. If the asm
instruction is furthermore declared "volatile" (I don't remember if it was)
then GCC guarantees that the instruction will not be optimzed away in any
case and so, as the kernel *knows* that RELOC("some string") is effectively
yielding the proper bit-level representation for a valid pointer to the
string th eprogrammer wants, GCC will trust the code and all will be OK.

And all this is just fixing ONE line in the kernel. Is that too much of a
job to do witout discussing it at will in several dozen messages?

Just my .02 euros


Bernard Dautrevaux
Microprocess Ingenierie
97 bis, rue de Colombes
Tel: +33 (0) 1 47 68 80 80
Fax: +33 (0) 1 47 88 97 85
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

This archive was generated by hypermail 2b29 : Mon Jan 07 2002 - 21:00:20 EST