Re: [PATCH v2 2/7] linkage: add SYM_{ENTRY,START,END}_AT()

From: Mark Rutland
Date: Fri Feb 11 2022 - 06:32:41 EST


On Thu, Feb 10, 2022 at 05:20:10PM -0800, Nick Desaulniers wrote:
> On Thu, Feb 10, 2022 at 6:52 AM Mark Rutland <mark.rutland@xxxxxxx> wrote:
> >
> > Both GCC and clang are happy to treat labels as constant expressions:
> >
> > | [mark@lakrids:~/asm-test]% cat test-label.S
> > | .text
> > |
> > | start:
> > | nop
> > | end:
> > |
> > | .if (end - start) == 0
> > | .err
> > | .endif
> > |
> > | [mark@lakrids:~/asm-test]% usekorg 11.1.0 aarch64-linux-gcc -c test-label.S
> > | [mark@lakrids:~/asm-test]% usellvm 13.0.0 clang --target=aarch64-linux -c test-label.S
> >
> > ... but only GCC is happy to treat symbol definitions as constants:
> >
> > | [mark@lakrids:~/asm-test]% cat test-symbol.S
> > | .text
> > |
> > | .set start, .;
> > | nop
> > | .set end, .;
> > |
> > | .if (end - start) == 0
> > | .err
> > | .endif
> > |
> > | [mark@lakrids:~/asm-test]% usekorg 11.1.0 aarch64-linux-gcc -c test-symbol.S
> > | [mark@lakrids:~/asm-test]% usellvm 13.0.0 clang --target=aarch64-linux -c test-symbol.S
> > | test-symbol.S:7:6: error: expected absolute expression
> > | .if (end - start) == 0
> > | ^
> > | test-symbol.S:8:2: error: .err encountered
> > | .err
> > | ^
> >
> > This is obviously a behavioural difference, but I'm not sure whether it's
> > intentional, or just an artifact of the differing implementation of GNU as and
> > LLVM's integrated assembler. Nich, Nathan, any thoughts on that?
> >
> > Does clang have any mechanism other than labels to define location constants
> > that can be used as absolute expressions? e.g. is there any mechanism to alias
> > a label which results in the alias also being a constant?
>
> s/constant/absolute/

Sorry, yes. I wasn't clear on the terminology when I wrote this, and I realise
now what I said was a bit confused.

IIUC the symbols themselves are relocatable (rather than absolute) whether
they're labels or created via `.set`, since the base of the section hasn't been
set yet. The difference between the two *should* be absolute (since they're
both relocatable relative to the same base), and LLVM can figure that out for
two labels, but not when either is created via `.set`, so it seems some
information is tracked differently for labels and othey symbols?

I note LLVM *can* treat the result of a `.set` as absolute, eg. if I do:

.set sym_offset (label_end - label_start)
.if sym_offset == 0
.err
.endif

... LLVM treats `sym_offset` as an absolute value.

> Nothing off the top of my head comes to mind as a substitute that will
> work as expected today.
>
> I've filed https://github.com/llvm/llvm-project/issues/53728 to
> discuss more with folks that understand our AsmParser better.

Thanks!

> From what I can tell, our AsmParser is falling down because the
> operands of the binary expression themselves are not considered
> absolute.

IIUC even in the label case the operands aren't absolute, but rather that
they're relocatable relative to the same base, and hence the expression can be
evaluate to be absolute (since the base gets cancelled out, removing the
relocation).

So is there something that gets tracked differently for labels and other
symbols?

> I doubt we will be able to handle the general case of symbol
> differencing; the symbol reference operands could refer to symbols in
> different sections that haven't been laid out yet (or whose order
> could be shuffled by the linker). But if they're in the same section
> (or "fragment" of a section), we might be able to special case this.

Sure, I think the only case that needs to work (or can work conceptually) is
when they're in the same section (of fragment thereof), and that's all the GNU
assembler supports.

> For the expression
>
> > .if (qwerty_fiqin_end - qwerty_fiqin_start) > (0x200 - 0x1c)
>
> can you use local labels (`.L` prefix) rather than symbolic
> references? or is there a risk of them not being unique per TU?

For the problem in this patch I might be able to do something of that shape,
but I'll need to factor the SYM_*() helpers differently so that I can use
labels for the primary definition.

I can't do that for the aliases, since I don't know the set of aliases until
after the primary definition is created. Since there's no label aliasing
mechanism, I must use symbol references. However, I think I can propagate the
size by calculating alongside the primary definition, e.g.

primary_start:
nop
primary_end:
.set primary_size, (primary_end - primary_start)

.set alias_start, primary_start
.set alias_end, primary_end
.set alias_size, primary_size

... so I'll have a go at that.

However, that still means I can't treat aliases as entirely equivalent to the
primary definition, e.g.

.if (alias_end - alias_start) == 0
.err
.endif

... and there might be more things of that sort of shape, so it'd be good if
LLVM could handle this for symbol references (in the same section or fragment
thereof), so that we have the chance to use that in future.

Thanks,
Mark.