Note there are targets (even some old x86 variants) that required the profiling calls to occur after the prologue. Unfortunately, nobody documented *why* that was the case. Sigh.
Calling the profiler immediately at the entry point is clearly the more
sane option. It means the ABI is well-defined, stable, and independent
of what the actual function contents are. It means that ABI isn't the
normal C ABI (the __fentry__ function would have to preserve all
registers), but that's fine...