Re: [PATCH] libbpf hashmap: Fix undefined behavior in hash_bits

From: Ian Rogers
Date: Thu Oct 29 2020 - 15:37:40 EST


On Thu, Oct 29, 2020 at 10:45 AM Song Liu <songliubraving@xxxxxx> wrote:
>
> > On Oct 29, 2020, at 9:09 AM, Ian Rogers <irogers@xxxxxxxxxx> wrote:
> >
> > If bits is 0, the case when the map is empty, then the >> is the size of
> > the register which is undefined behavior - on x86 it is the same as a
> > shift by 0. Fix by handling the 0 case explicitly when running with
> > address sanitizer.
> >
> > A variant of this patch was posted previously as:
> > https://lore.kernel.org/lkml/20200508063954.256593-1-irogers@xxxxxxxxxx/
> >
> > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > ---
> > tools/lib/bpf/hashmap.h | 14 ++++++++++++++
> > 1 file changed, 14 insertions(+)
> >
> > diff --git a/tools/lib/bpf/hashmap.h b/tools/lib/bpf/hashmap.h
> > index d9b385fe808c..27d0556527d3 100644
> > --- a/tools/lib/bpf/hashmap.h
> > +++ b/tools/lib/bpf/hashmap.h
> > @@ -12,9 +12,23 @@
> > #include <stddef.h>
> > #include <limits.h>
> >
> > +#ifdef __has_feature
> > +#define HAVE_FEATURE(f) __has_feature(f)
> > +#else
> > +#define HAVE_FEATURE(f) 0
> > +#endif
> > +
> > static inline size_t hash_bits(size_t h, int bits)
> > {
> > /* shuffle bits and return requested number of upper bits */
> > +#if defined(ADDRESS_SANITIZER) || HAVE_FEATURE(address_sanitizer)
>
> I am not very familiar with these features. Is address sanitizer same
> as undefined behavior sanitizer (mentioned in previous version)?

My preference would be to special case bits == 0 without the feature
guards as per the original change, this is the most correct. There is
some feature support for detecting ubsan:
https://github.com/google/sanitizers/issues/765
In my case I see this with address sanitizer and older versions of
clang don't expose ubsan as a feature.

> > + /*
> > + * If the requested bits == 0 avoid undefined behavior from a
> > + * greater-than bit width shift right (aka invalid-shift-exponent).
> > + */
> > + if (bits == 0)
> > + return -1;
>
> Shall we return 0 or -1 (0xffffffff) here?

The value isn't used and so doesn't matter. -1 seemed less likely to
silently succeed.

> Also, we have HASHMAP_MIN_CAP_BITS == 2. Shall we just make sure we
> never feed bits == 0 into hash_bits()?

I think that'd be a different change. I'd be happy to see it.

Thanks,
Ian

> Thanks,
> Song
>
>
> > +#endif
> > #if (__SIZEOF_SIZE_T__ == __SIZEOF_LONG_LONG__)
> > /* LP64 case */
> > return (h * 11400714819323198485llu) >> (__SIZEOF_LONG_LONG__ * 8 - bits);
> > --
> > 2.29.1.341.ge80a0c044ae-goog
> >
>