Re: [patch 27/95] scanf: fix type range overflow

From: Alexey Dobriyan
Date: Thu Sep 10 2015 - 07:28:40 EST


On Wed, Sep 09, 2015 at 04:52:29PM -0700, Linus Torvalds wrote:
> On Wed, Sep 9, 2015 at 3:36 PM, <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > From: Alexey Dobriyan <adobriyan@xxxxxxxxx>
> > Subject: scanf: fix type range overflow
>
> This is another patch that claims to fix a bug, but looks
> fundamentally broken unless I'm misreading something.
>
> Like it or not, people write code like
>
> unsigned int x;
>
> and then may well initialize it with "-1" or with 0xffffffff depending
> on how they feel and what helper macros etc they used.
>
> And I see no reason to believe that the same wouldn't be true of
> sscanf(). In fact, I'm pretty sure it happens exactly for things like
> initializing bitmasks etc.
>
> But you seem to want to make this an error, because your seem to
> verify that the underlying numerical value it fits in whatever
> (signed/unsigned) type that is being converted.
>
> Maybe I misread the code, but that's what it looks like.

It does exactly that (which I think is saner default).

If you have very strict rules as default, someone who wants to implement
more relaxed parsing can easily do that. In case of parse_integer()
simple take next wider type. If it is "unsigned long long", well, look
for sign "-". It is certainly doable (and made explicit).

But if relaxed parsing is the foundation someone who wants very strict
parsing can't do that and forced to reimplement from scratch.
If all you have is strict_strtoull() what do you do? You scrap it.

> And it's wrong. People may use "%d" for "unsigned int" exactly because
> they want the "-1" to work, but "4294967294" should work too.
>
> > Fun fact:
> >
> > uint8_t val;
> > sscanf("256", "%hhu", &val);
> >
> > will return 1 and make val=0 (clearly bogus).
>
> Fun fact: overflows are not at all always clearly bogus.

> Come to think of it, I think that the parse_integer() interface may
> have had the same bug.
>
> Too much "I know better than you" is actually a bug.

Maybe, but what to do with the example?
Should kernel write 0 when there are exactly no "0" digits
present in the string? It does that currently. We can't even ask
userspace for advice because glibc will return ERANGE for "unsigned int"
and higher and silently overflow for "unsigned char" and "unsigned short".

Overall, I don't think it is a problem at all.
kstrto*() and kstrto*_from_user() functions were merged 4.5 years ago and
all of them derive valid input range from type of the argument. There were
no problems so far.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/