Re: [PATCH bpf-next v2 2/3] bpf: btf: add btf print functionality

From: Jakub Kicinski
Date: Tue Jul 03 2018 - 18:23:42 EST


On Tue, 3 Jul 2018 22:46:00 +0100, Okash Khawaja wrote:
> On Mon, Jul 02, 2018 at 10:06:59PM -0700, Jakub Kicinski wrote:
> > On Mon, 2 Jul 2018 11:39:15 -0700, Okash Khawaja wrote:
> > > +#define BITS_PER_BYTE_MASK (BITS_PER_BYTE - 1)
> > > +#define BITS_PER_BYTE_MASKED(bits) ((bits) & BITS_PER_BYTE_MASK)
> >
> > Perhaps it's just me but BIT_OFFSET or BIT_COUNT as a name of this macro
> > would make it more obvious to parse in the code below.
> I don't mind either. However, these macro names are also used inside the
> kernel for the same purpose. For the sake of consistency, I'd recommend we
> keep them :)

Ugh, okay :)
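
For anyone following the arithmetic later in the patch, here is a minimal sketch of what these helpers compute. Only BITS_PER_BYTE_MASK and BITS_PER_BYTE_MASKED are quoted in this thread. The definitions of BITS_PER_BYTE, BITS_ROUNDDOWN_BYTES and BITS_ROUNDUP_BYTES below are assumptions, and struct bit_span and locate_bits() are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define BITS_PER_BYTE 8 /* assumed, not quoted in the thread */
#define BITS_PER_BYTE_MASK (BITS_PER_BYTE - 1)
#define BITS_PER_BYTE_MASKED(bits) ((bits) & BITS_PER_BYTE_MASK)
/* Assumed definitions: whole bytes below / needed to cover a bit count. */
#define BITS_ROUNDDOWN_BYTES(bits) ((bits) >> 3)
#define BITS_ROUNDUP_BYTES(bits) \
	(BITS_ROUNDDOWN_BYTES(bits) + !!BITS_PER_BYTE_MASKED(bits))

/* Hypothetical container for the three values the patch derives. */
struct bit_span {
	uint32_t byte_skip;     /* whole bytes to advance the data pointer */
	uint32_t bit_offset;    /* leftover bit offset within the first byte */
	uint32_t bytes_to_copy; /* bytes that cover offset + field width */
};

/* Mirrors the first few statements of btf_dumper_int_bits(). */
static struct bit_span locate_bits(uint32_t total_bits_offset, uint32_t bits)
{
	struct bit_span s;

	s.byte_skip = BITS_ROUNDDOWN_BYTES(total_bits_offset);
	s.bit_offset = BITS_PER_BYTE_MASKED(total_bits_offset);
	s.bytes_to_copy = BITS_ROUNDUP_BYTES(bits + s.bit_offset);
	return s;
}
```

For a 5-bit field starting at bit 11, locate_bits(11, 5) skips 1 byte, leaves a bit offset of 3, and copies 1 byte.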

> > > + } print_num;
> > > +
> > > + total_bits_offset = bit_offset + BTF_INT_OFFSET(int_type);
> > > + data += BITS_ROUNDDOWN_BYTES(total_bits_offset);
> > > + bit_offset = BITS_PER_BYTE_MASKED(total_bits_offset);
> > > + bits_to_copy = bits + bit_offset;
> > > + bytes_to_copy = BITS_ROUNDUP_BYTES(bits_to_copy);
> > > +
> > > + print_num.u64_num = 0;
> > > + memcpy(&print_num.u64_num, data, bytes_to_copy);
> >
> > This scheme is unlikely to work on big endian machines...
> Can you give an example how?

On BE:

Input: [0x01, 0x82]
Bit length: 15
Bytes to copy: 2
bit_offset: 0
upper_bits: 7

print_num.u64_num = 0;
# [0, 0, 0, 0, 0, 0, 0, 0]

memcpy(&print_num.u64_num, data, bytes_to_copy);
# [0x01, 0x82, 0, 0, 0, 0, 0, 0]

mask = (1 << upper_bits) - 1;
# mask = 0x7f

print_num.u8_nums[bytes_to_copy - 1] &= mask;
# [0x01, 0x02, 0, 0, 0, 0, 0, 0]

printf("0x%llx", print_num.u64_num);
# 0x0102000000000000 AKA 72620543991349248
# expected:
# 0x0102 AKA 258

Am I missing something?
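
The walk-through above can be reproduced on any host with a small sketch that runs the same copy-and-mask steps on a plain byte array and then interprets those 8 bytes in each byte order explicitly. The helper names (read_as_be, read_as_le, copy_and_mask) are hypothetical and not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value a big-endian CPU would see when loading these 8 bytes as a u64. */
static uint64_t read_as_be(const uint8_t mem[8])
{
	uint64_t v = 0;

	for (size_t i = 0; i < 8; i++)
		v = (v << 8) | mem[i];
	return v;
}

/* Value a little-endian CPU would see for the same 8 bytes. */
static uint64_t read_as_le(const uint8_t mem[8])
{
	uint64_t v = 0;

	for (size_t i = 8; i-- > 0;)
		v = (v << 8) | mem[i];
	return v;
}

/* The zero, memcpy and mask steps from the patch, on a byte array. */
static void copy_and_mask(uint8_t mem[8], const uint8_t *data,
			  size_t bytes_to_copy, unsigned int upper_bits)
{
	memset(mem, 0, 8);
	memcpy(mem, data, bytes_to_copy);
	if (upper_bits)
		mem[bytes_to_copy - 1] &= (uint8_t)((1u << upper_bits) - 1);
}
```

With data = [0x01, 0x82], bytes_to_copy = 2 and upper_bits = 7, read_as_be() gives 0x0102000000000000 and read_as_le() gives 0x0201. The copied bytes land in the most significant end of the u64 when read big-endian, so the later right shift and print operate on the wrong end of the value.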

> > > + upper_bits = BITS_PER_BYTE_MASKED(bits_to_copy);
> > > + if (upper_bits) {
> > > + uint8_t mask = (1 << upper_bits) - 1;
> > > +
> > > + print_num.u8_nums[bytes_to_copy - 1] &= mask;
> > > + }
> > > +
> > > + print_num.u64_num >>= bit_offset;
> > > +
> > > + if (is_plain_text)
> > > + jsonw_printf(jw, "0x%llx", print_num.u64_num);
> > > + else
> > > + jsonw_printf(jw, "%llu", print_num.u64_num);
> > > +}
> > > +
> > > +static int btf_dumper_int(const struct btf_type *t, uint8_t bit_offset,
> > > + const void *data, json_writer_t *jw,
> > > + bool is_plain_text)
> > > +{
> > > + uint32_t *int_type = (uint32_t *)(t + 1);
> > > + uint32_t bits = BTF_INT_BITS(*int_type);
> > > + int ret = 0;
> > > +
> > > + /* if this is bit field */
> > > + if (bit_offset || BTF_INT_OFFSET(*int_type) ||
> > > + BITS_PER_BYTE_MASKED(bits)) {
> > > + btf_dumper_int_bits(*int_type, bit_offset, data, jw,
> > > + is_plain_text);
> > > + return ret;
> > > + }
> > > +
> > > + switch (BTF_INT_ENCODING(*int_type)) {
> > > + case 0:
> > > + if (BTF_INT_BITS(*int_type) == 64)
> > > + jsonw_printf(jw, "%lu", *((uint64_t *)data));
> > > + else if (BTF_INT_BITS(*int_type) == 32)
> > > + jsonw_printf(jw, "%u", *((uint32_t *)data));
> > > + else if (BTF_INT_BITS(*int_type) == 16)
> > > + jsonw_printf(jw, "%hu", *((uint16_t *)data));
> > > + else if (BTF_INT_BITS(*int_type) == 8)
> > > + jsonw_printf(jw, "%hhu", *((uint8_t *)data));
> > > + else
> > > + btf_dumper_int_bits(*int_type, bit_offset, data, jw,
> > > + is_plain_text);
> > > + break;
> > > + case BTF_INT_SIGNED:
> > > + if (BTF_INT_BITS(*int_type) == 64)
> > > + jsonw_printf(jw, "%ld", *((int64_t *)data));
> > > + else if (BTF_INT_BITS(*int_type) == 32)
> > > + jsonw_printf(jw, "%d", *((int32_t *)data));
> > > + else if (BTF_INT_BITS(*int_type) == 16)
> >
> > Please drop the double space. Both for 16 where it makes no sense and
> > for 8 where it's marginally useful but not really.
> >
> > > + jsonw_printf(jw, "%hd", *((int16_t *)data));
> > > + else if (BTF_INT_BITS(*int_type) == 8)
> > > + jsonw_printf(jw, "%hhd", *((int8_t *)data));
> > > + else
> > > + btf_dumper_int_bits(*int_type, bit_offset, data, jw,
> > > + is_plain_text);
> > > + break;
> > > + case BTF_INT_CHAR:
> > > + if (*((char *)data) == '\0')
> > > + jsonw_null(jw);
> >
> > Mm.. I don't think 0 char is equivalent to null.
> Yes, thanks. Will fix.
>
> >
> > > + else if (isprint(*((char *)data)))
> > > + jsonw_printf(jw, "\"%c\"", *((char *)data));
> >
> > This looks very suspicious. So if I see a "6" for a char field it's
> > either a 6 ('\u0006') or a 54 ('6')...
> It will always be 54. Maybe I missed your point. Could you explain why
> it would be other than 54?

Ah, I think I missed that %c is in quotes...

> > > + else
> > > + if (is_plain_text)
> > > + jsonw_printf(jw, "%hhx", *((char *)data));

This seems to be missing a "0x" prefix?

> > > + else
> > > + jsonw_printf(jw, "%hhd", *((char *)data));
> >
> > ... I think you need to always print a string, and express it as
> > \u00%02hhx for non-printable.
> Okay that makes sense

Yeah, IDK, char can be used as a byte as well as a string. In eBPF
it may actually be more likely to just be used as a raw byte buffer...
Either way, I think it would be nice to keep it consistent. At least for
the JSON output, could we print either always ints or always characters?
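
To make the "always characters" option concrete, here is a minimal sketch that formats every char as a JSON string and falls back to a \u00XX escape for anything that is not plain printable ASCII. format_json_char() is a hypothetical helper, not something in the patch or in json_writer. Escaping the quote and backslash characters the same way is my assumption about what valid JSON output would need here.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write c into out as a quoted JSON string.
 * Printable ASCII (except '"' and '\\') is emitted literally; every
 * other byte value becomes a \u00XX escape. Returns what snprintf()
 * returns, i.e. the length of the formatted text.
 */
static int format_json_char(char *out, size_t len, char c)
{
	unsigned char uc = (unsigned char)c;

	if (uc < 0x80 && isprint(uc) && uc != '"' && uc != '\\')
		return snprintf(out, len, "\"%c\"", uc);

	return snprintf(out, len, "\"\\u00%02hhx\"", uc);
}
```

With this, a byte of value 6 prints as "\u0006" and the character '6' prints as "6", so the two cases stay distinguishable and the field always has the same JSON type. Bytes of 0x80 and above map to the code points U+0080..U+00FF, which may or may not be what a raw byte buffer wants; that trade-off is part of the open question above.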