Re: ERROR: INT DW_ATE_unsigned_1 Error emitting BTF type

From: Yonghong Song
Date: Sat Feb 06 2021 - 14:20:03 EST




On 2/6/21 10:10 AM, Sedat Dilek wrote:
On Sat, Feb 6, 2021 at 6:53 PM Yonghong Song <yhs@xxxxxx> wrote:



On 2/6/21 8:24 AM, Mark Wielaard wrote:
Hi,

On Sat, Feb 06, 2021 at 12:26:44AM -0800, Yonghong Song wrote:
With the above vmlinux, the issue appears to be handling
DW_ATE_signed_1, DW_ATE_unsigned_{1,24,40}.

The following patch should fix the issue:

That doesn't really make sense to me. Why is the compiler emitting a
DW_TAG_base_type that needs to be interpreted according to the
DW_AT_name attribute?

If the issue is that the size of the base type cannot be expressed in
bytes then the DWARF spec provides the following option:

If the value of an object of the given type does not fully occupy
the storage described by a byte size attribute, the base type
entry may also have a DW_AT_bit_size and a DW_AT_data_bit_offset
attribute, both of whose values are integer constant values (see
Section 2.19 on page 55). The bit size attribute describes the
actual size in bits used to represent values of the given
type. The data bit offset attribute is the offset in bits from the
beginning of the containing storage to the beginning of the
value. Bits that are part of the offset are padding. If this
attribute is omitted a default data bit offset of zero is assumed.

Would it be possible to use that encoding of those special types? If

I agree with you. I do not like comparing names either. Unfortunately,
there is not enough information in the dwarf to find out the actual size.
The following is the dwarf dump with vmlinux (Sedat provided) for
DW_ATE_unsigned_1.

0x000e97e9: DW_TAG_base_type
DW_AT_name ("DW_ATE_unsigned_1")
DW_AT_encoding (DW_ATE_unsigned)
DW_AT_byte_size (0x00)

There is no DW_AT_bit_size or DW_AT_bit_offset on this base type.
AFAIK, these two attributes typically appear on struct/union members
together with DW_AT_byte_size.

Maybe compilers (clang in this case) can emit DW_AT_bit_size = 1
and DW_AT_bit_offset = 0/7 (depending on big/little endian). In
that case, we can just test for DW_AT_bit_size and read it, and it
should work.
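
As a rough illustration of that idea, here is a minimal sketch of how a
consumer might pick the size of a base type if compilers did emit
DW_AT_bit_size. The struct and the function name are hypothetical and
only model the attributes discussed above; this is not pahole or libdw
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a DW_TAG_base_type DIE. */
struct base_type_die {
	uint32_t byte_size;    /* DW_AT_byte_size */
	bool     has_bit_size; /* true if DW_AT_bit_size is present */
	uint32_t bit_size;     /* DW_AT_bit_size, valid only if has_bit_size */
};

/*
 * Prefer an explicit DW_AT_bit_size when present; otherwise fall back
 * to the storage size derived from DW_AT_byte_size.
 */
static uint32_t effective_bit_size(const struct base_type_die *die)
{
	assert(die != NULL);
	if (die->has_bit_size)
		return die->bit_size;
	return die->byte_size * 8;
}
```

With the dump above (byte size 0x00, no bit size) this yields 0, which
is why the name is currently the only hint about the real width.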

But I think BTF does not need this (DW_ATE_unsigned_1) for now.
I checked the dwarf dump and it is mostly used for arithmetic operations
encoded in the dump (in this case, e.g., shift by 1 bit):

0x000015cf: DW_TAG_base_type
DW_AT_name ("DW_ATE_unsigned_1")
DW_AT_encoding (DW_ATE_unsigned)
DW_AT_byte_size (0x00)

0x00010ed9: DW_TAG_formal_parameter
DW_AT_location (DW_OP_lit0, DW_OP_not,
DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1", DW_OP_convert
(0x000015d4) "DW_ATE_unsigned_8", DW_OP_stack_value)
DW_AT_abstract_origin (0x00013984 "branch")
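
To make the location expression above concrete, here is a small sketch
that evaluates it by hand. It assumes a 64-bit generic stack value and
models DW_OP_convert to an unsigned type as plain truncation to the
target width; both function names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model DW_OP_convert to an unsigned base type that is `bits` wide by
 * keeping only the low `bits` bits of the value (an assumption made
 * for this sketch).
 */
static uint64_t convert_unsigned(uint64_t value, unsigned bits)
{
	assert(bits >= 1 && bits <= 64);
	if (bits == 64)
		return value;
	return value & ((UINT64_C(1) << bits) - 1);
}

/* Evaluate: DW_OP_lit0, DW_OP_not, convert to 1 bit, convert to 8 bits. */
static uint64_t eval_branch_location(void)
{
	uint64_t v = 0;             /* DW_OP_lit0 */
	v = ~v;                     /* DW_OP_not: all bits set */
	v = convert_unsigned(v, 1); /* DW_OP_convert "DW_ATE_unsigned_1" */
	v = convert_unsigned(v, 8); /* DW_OP_convert "DW_ATE_unsigned_8" */
	return v;                   /* DW_OP_stack_value */
}
```

Under these assumptions the expression simply describes the constant 1
for "branch"; the 1-bit type only exists as an intermediate step.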

Looking at the clang frontend, only the following types are encoded with
the unsigned dwarf type:

case BuiltinType::UShort:
case BuiltinType::UInt:
case BuiltinType::UInt128:
case BuiltinType::ULong:
case BuiltinType::WChar_U:
case BuiltinType::ULongLong:
Encoding = llvm::dwarf::DW_ATE_unsigned;
break;


not, can we try to come up with some extension that doesn't require
consumers to match magic names?


You want me to upload mlx5_core.ko?

I just sent out a patch. You are cc'ed. I also attached it to this email.
Yes, it would be great if you can upload mlx5_core.ko so I can
double check this DW_ATE_unsigned_160, which is really unusual.


When looking with llvm-dwarfdump for DW_ATE_unsigned_160:

0x00d65616: DW_TAG_base_type
DW_AT_name ("DW_ATE_unsigned_160")
DW_AT_encoding (DW_ATE_unsigned)
DW_AT_byte_size (0x14)

If you need further information, please let me know.

Thanks.

- Sedat -

From 239c797090abbdc5253d0ff1e9e657c5006fbbee Mon Sep 17 00:00:00 2001
From: Yonghong Song <yhs@xxxxxx>
Date: Sat, 6 Feb 2021 10:21:45 -0800
Subject: [PATCH dwarves] btf_encoder: sanitize non-regular int base type

clang with dwarf5 may generate non-regular int base type,
i.e., not a signed/unsigned char/short/int/longlong/__int128.
Such base types are often used to describe
how an actual parameter or variable is generated. For example,

0x000015cf: DW_TAG_base_type
DW_AT_name ("DW_ATE_unsigned_1")
DW_AT_encoding (DW_ATE_unsigned)
DW_AT_byte_size (0x00)

0x00010ed9: DW_TAG_formal_parameter
DW_AT_location (DW_OP_lit0,
DW_OP_not,
DW_OP_convert (0x000015cf) "DW_ATE_unsigned_1",
DW_OP_convert (0x000015d4) "DW_ATE_unsigned_8",
DW_OP_stack_value)
DW_AT_abstract_origin (0x00013984 "branch")

What it does is take a literal "0", apply a "not" operation, and then convert
the result to a one-bit unsigned int and then to an 8-bit unsigned int.

Another example,

0x000e97e4: DW_TAG_base_type
DW_AT_name ("DW_ATE_unsigned_24")
DW_AT_encoding (DW_ATE_unsigned)
DW_AT_byte_size (0x03)

0x000f88f8: DW_TAG_variable
DW_AT_location (indexed (0x3c) loclist = 0x00008fb0:
[0xffffffff82808812, 0xffffffff82808817):
DW_OP_breg0 RAX+0,
DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8",
DW_OP_stack_value,
DW_OP_piece 0x1,
DW_OP_breg0 RAX+0,
DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
DW_OP_lit8,
DW_OP_shr,
DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24",
DW_OP_stack_value,
DW_OP_piece 0x3
......

At one point, a right shift by 8 happens and the result is converted to
32-bit unsigned int and then to 24-bit unsigned int.

BTF does not need any of this DW_OP_* information, and such non-regular int
types will cause libbpf to emit errors.
Let us sanitize them to generate BTF acceptable to libbpf and the kernel.

Cc: Sedat Dilek <sedat.dilek@xxxxxxxxx>
Signed-off-by: Yonghong Song <yhs@xxxxxx>
---
libbtf.c | 39 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/libbtf.c b/libbtf.c
index 9f76283..93fe185 100644
--- a/libbtf.c
+++ b/libbtf.c
@@ -373,6 +373,7 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt,
struct btf *btf = btfe->btf;
const struct btf_type *t;
uint8_t encoding = 0;
+ uint16_t byte_sz;
int32_t id;

if (bt->is_signed) {
@@ -384,7 +385,43 @@ int32_t btf_elf__add_base_type(struct btf_elf *btfe, const struct base_type *bt,
return -1;
}

- id = btf__add_int(btf, name, BITS_ROUNDUP_BYTES(bt->bit_size), encoding);
+ /* dwarf5 may emit DW_ATE_[un]signed_{num} base types where
+ * {num} is not power of 2 and may exceed 128. Such attributes
+ * are mostly used to record operation for an actual parameter
+ * or variable.
+ * For example,
+ * DW_AT_location (indexed (0x3c) loclist = 0x00008fb0:
+ * [0xffffffff82808812, 0xffffffff82808817):
+ * DW_OP_breg0 RAX+0,
+ * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
+ * DW_OP_convert (0x000e97df) "DW_ATE_unsigned_8",
+ * DW_OP_stack_value,
+ * DW_OP_piece 0x1,
+ * DW_OP_breg0 RAX+0,
+ * DW_OP_convert (0x000e97d5) "DW_ATE_unsigned_64",
+ * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
+ * DW_OP_lit8,
+ * DW_OP_shr,
+ * DW_OP_convert (0x000e97da) "DW_ATE_unsigned_32",
+ * DW_OP_convert (0x000e97e4) "DW_ATE_unsigned_24",
+ * DW_OP_stack_value, DW_OP_piece 0x3
+ * DW_AT_name ("ebx")
+ * DW_AT_decl_file ("/linux/arch/x86/events/intel/core.c")
+ *
+ * In the above example, at some point, one unsigned_32 value
+ * is right shifted by 8 and the result is converted to unsigned_32
+ * and then unsigned_24.
+ *
+ * BTF does not need such DW_OP_* information so let us sanitize
+ * these non-regular int types to avoid libbpf/kernel complaints.
+ */
+ byte_sz = BITS_ROUNDUP_BYTES(bt->bit_size);
+ if (!byte_sz || (byte_sz & (byte_sz - 1))) {
+ name = "sanitized_int";
+ byte_sz = 4;
+ }
+
+ id = btf__add_int(btf, name, byte_sz, encoding);
if (id < 0) {
btf_elf__log_err(btfe, BTF_KIND_INT, name, true, "Error emitting BTF type");
} else {
--
2.24.1
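
For readers who want to try the size check from the patch in isolation,
here is a minimal standalone sketch. ROUNDUP_BITS_TO_BYTES is an assumed
stand-in for pahole's BITS_ROUNDUP_BYTES, and sanitized_byte_size is a
hypothetical helper that does not exist in libbtf.c; only the condition
and the "sanitized_int" / 4-byte fallback are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for pahole's BITS_ROUNDUP_BYTES(): round bits up to bytes. */
#define ROUNDUP_BITS_TO_BYTES(bits) (((bits) + 7) / 8)

/*
 * Return the byte size to hand to btf__add_int(). A size of zero or a
 * size that is not a power of two is replaced by a 4-byte
 * "sanitized_int", mirroring the check in the patch.
 */
static uint16_t sanitized_byte_size(uint32_t bit_size, const char **name)
{
	uint16_t byte_sz = (uint16_t)ROUNDUP_BITS_TO_BYTES(bit_size);

	assert(name != NULL);
	/* x & (x - 1) is zero only when x is a power of two (or zero). */
	if (!byte_sz || (byte_sz & (byte_sz - 1))) {
		*name = "sanitized_int";
		byte_sz = 4;
	}
	return byte_sz;
}
```

With this sketch, bit sizes of 0, 24, 40 and 160 all come back as a
4-byte "sanitized_int", while 8, 16, 32, 64 and 128 keep their name and
size.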