Re: UBSAN: array-index-out-of-bounds in alg_bind

From: Dmitry Vyukov
Date: Sat Oct 17 2020 - 10:42:16 EST


On Sat, Oct 17, 2020 at 1:02 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=92ead4eb8e26a26d465e
> > > > [...]
> > > > Reported-by: syzbot+92ead4eb8e26a26d465e@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > > [...]
> > > > UBSAN: array-index-out-of-bounds in crypto/af_alg.c:166:2
> > > > index 91 is out of range for type '__u8 [64]'
> > >
> > > This seems to be an "as intended", if very odd, false positive (the actual
> > > memory area is backed by the on-stack _K_SS_MAXSIZE-sized sockaddr_storage
> > > "address" variable in __sys_bind). But yes, af_alg's salg_name member
> > > size here doesn't make sense.
> >
> > As Vegard noted elsewhere, compilers can start making assumptions
> > based on the absence of UB and compile code in surprising ways as a
> > result, leading to very serious and very real bugs.
> >
> > One example would be a compiler generating a jump table for common sizes
> > during PGO and leaving size > 64 as a wild jump.
> >
> > Another example would be a compiler assuming that copy size <= 64.
> > Then if there is another copy into a 64-byte buffer with a proper size
> > check, the compiler can now drop that size check (since it now knows
> > size <= 64) and we get a real stack smash (for a copy that does have a
> > proper size check before!).
>
> FWIW, the kernel currently still has a bunch of places that use
> C89-style length-1 arrays (which were in the past used to work around
> C89's lack of proper flexible arrays). Gustavo A. R. Silva has a bunch
> of patches pending to change those places now, but those are not
> marked for stable backporting; so in all currently released kernels,
> we'll probably keep having length-1 arrays at the ends of C structs
> that are used as if they were flexible arrays. (Unless someone makes
> the case that these patches are not just cleanups but actually fix
> some sort of real bug, and therefore need to be backported.)
>
> The code in this example looks just like one of those C89-style
> length-1 arrays to me (except that the length isn't 1).
>
> Of course I do agree that this should be cleaned up, and that having
> bogus array lengths in the source code is a bad idea.
>
> > And we do want compilers to be that smart today. Because of all levels
> > of abstractions/macros/inlining we actually have lots of
> > redundant/nonsensical code in the end after all inlining and
> > expansions, and we do want compilers to infer things, remove redundant
> > checks, etc so that we can have both nice abstract source code and
> > efficient machine code at the same time.
>
> I guess that kinda leads to the question: Do we just need to fix the
> kernel code here (which is comparatively easy), or do you think that
> this is a sufficiently big problem that we need to go and somehow
> change the actual UAPI headers here (e.g. by deprecating the existing
> UAPI struct and making a new one with a different name)?

Good question. What I wrote is not based on any concrete
miscompilation at hand. I just meant that there is more involved
here than may appear at first glance.
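
To make the second quoted example a bit more tangible, here is a
minimal sketch. It is purely hypothetical: struct named and
terminate_then_copy() are made up for illustration and are not kernel
code, and this is not a claim that any current compiler performs the
transformation. The unchecked index is UB for len > 64, so a compiler
would be allowed to assume len <= 64 and treat the later size check as
dead code:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct with a nominally 64-byte trailing array. */
struct named {
	unsigned char name[64];
};

/*
 * Hypothetical helper. Returns 0 on success, -1 on a rejected length.
 * 'out' must point to at least 64 bytes.
 */
int terminate_then_copy(struct named *in, size_t len, unsigned char *out)
{
	if (len == 0)
		return -1;

	/*
	 * Indexing past name[63] is undefined behaviour, so from here on
	 * the compiler may reason as if len <= 64 always holds.
	 */
	in->name[len - 1] = '\0';

	/*
	 * This check looks like it protects the copy below, but under the
	 * assumption above it is provably false and may be removed.
	 */
	if (len > sizeof(in->name))
		return -1;

	memcpy(out, in->name, len);
	return 0;
}
```

For len <= 64 the function behaves as written. The concern is only
about callers that pass a larger len on the expectation that the
explicit check will reject it.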

Re proactively fixing the UAPI: if somebody is up to doing it now, I
would say it's a good and correct change. Otherwise, delaying the fix
is also a reasonable strategy, because (1) there are probably more
such cases, and (2) any work on enabling more optimizations, global
optimizations, etc. is only feasible if there is a tool that helps to
identify all the places that need to be fixed. So whoever ends up
fixing this, and whenever, one case more or less probably does not
matter much. It's a different story if there is already a tool or
compiler warning that traps on some code, and that code hinders
deployment of the tool.
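
As a rough sketch of the "fix the code" option, assuming the trailing
member is redeclared as a C99 flexible array while the header layout
stays the same: all names below (struct addr_fixed, struct addr_flex,
union addr_storage, terminate_name()) are hypothetical, and the field
layout is only loosely modeled on sockaddr_alg.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical existing layout: the trailing array has a nominal size. */
struct addr_fixed {
	unsigned short family;
	unsigned char  type[14];
	unsigned int   feat;
	unsigned int   mask;
	unsigned char  name[64];	/* callers may pass more than 64 bytes */
};

/* Same header, but the trailing member is a flexible array. */
struct addr_flex {
	unsigned short family;
	unsigned char  type[14];
	unsigned int   feat;
	unsigned int   mask;
	unsigned char  name[];
};

/* The two views must agree on where the name starts. */
static_assert(offsetof(struct addr_fixed, name) ==
	      offsetof(struct addr_flex, name),
	      "name offset must not change");

/* Hypothetical 128-byte backing store, in the role of sockaddr_storage. */
union addr_storage {
	struct addr_flex hdr;
	unsigned char    bytes[128];
};

/*
 * NUL-terminate the name at the last byte the caller supplied.
 * The caller must ensure addr_len does not exceed the backing storage.
 * Returns 0 on success, -1 if addr_len leaves no room for a name byte.
 */
int terminate_name(struct addr_flex *sa, size_t addr_len)
{
	const size_t hdr = offsetof(struct addr_flex, name);

	if (addr_len < hdr + 1)
		return -1;

	/* No declared bound on name[], so the index is not out of range. */
	sa->name[addr_len - hdr - 1] = '\0';
	return 0;
}
```

With a 92-byte name the terminator lands at index 91, the index from
the report, but the array type no longer claims a 64-byte bound.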