Re: [RFC PATCH v3 00/22] arm64: livepatch: Use ORC for dynamic frame pointer validation

From: Nick Desaulniers
Date: Thu Apr 13 2023 - 13:04:19 EST


On Thu, Mar 23, 2023 at 05:17:14PM +0000, Mark Rutland wrote:
> Hi Madhavan,
>
> At a high-level, I think this still falls afoul of our desire to not reverse
> engineer control flow from the binary, and so I do not think this is the right
> approach. I've expanded a bit on that below.
>
> I do think it would be nice to have *some* of the objtool changes, as I do
> think we will want to use objtool for some things in future (e.g. some
> build-time binary patching such as table sorting).
>
> > Problem
> > =======
> >
> > Objtool is complex and highly architecture-dependent. There are a lot of
> > different checks in objtool that all of the code in the kernel must pass
> > before livepatch can be enabled. If a check fails, it must be corrected
> > before we can proceed. Sometimes, the kernel code needs to be fixed.
> > Sometimes, it is a compiler bug that needs to be fixed. The challenge is
> > also to prove that all the work is complete for an architecture.
> >
> > As such, it presents a great challenge to enable livepatch for an
> > architecture.
>
> There's a more fundamental issue here in that objtool has to reverse-engineer
> control flow, and so even if the kernel code and compiler code generation are
> *perfect*, it's possible that objtool won't recognise the structure of the
> generated code, and won't be able to reverse-engineer the correct control flow.
>
> We've seen issues where objtool didn't understand jump tables, so support for
> that got disabled on x86. A key objection from the arm64 side is that we don't
> want to disable compiler code generation strategies like this. Further, as
> compilers evolve, their code generation strategies will change, and it's likely
> there will be other cases that crop up. This is inherently fragile.
>
> The key objection from the arm64 side is that we don't want to
> reverse-engineer details from the binary, as this is complex, fragile, and
> unstable. This is why we've previously suggested that we should work with
> compiler folk to get what we need.

> This still requires reverse-engineering the forward-edge control flow in order
> to compute those offsets, so the same objections apply with this approach. I do
> not think this is the right approach.
>
> I would *strongly* prefer that we work with compiler folk to get the
> information that we need.

IDK if it's relevant here, but I did see a commit go by to LLVM that
seemed to include such info in a custom ELF section (for the purposes of
improving fuzzing, IIUC). Maybe such an encoding scheme could be tested
to see if it's reliable or usable?
- https://github.com/llvm/llvm-project/commit/3e52c0926c22575d918e7ca8369522b986635cd3
- https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-control-flow
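
For anyone who wants to poke at it, here is a rough sketch of how a consumer
might walk that table. It assumes the row layout described in the
SanitizerCoverage docs linked above (basic block address, then a
null-terminated successor list, then a null-terminated callee list, with -1
standing in for an indirect call). The names cf_stats and cf_table_walk are
made up for illustration; they are not part of LLVM or the kernel, and I have
not checked whether this data is precise enough for unwinder validation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed row layout (taken from the SanitizerCoverage docs, not verified
 * against real compiler output here):
 *   bb_addr, succ_0, ..., succ_n, 0, callee_0, ..., callee_m, 0
 * An indirect call is assumed to be recorded as (uintptr_t)-1.
 */
struct cf_stats {		/* hypothetical helper type */
	size_t blocks;
	size_t edges;
	size_t direct_calls;
	size_t indirect_calls;
};

/*
 * Walk the table in [beg, end) and count what it describes.
 * Returns 0 on success, -1 if a row is cut off before its terminator.
 */
static int cf_table_walk(const uintptr_t *beg, const uintptr_t *end,
			 struct cf_stats *st)
{
	const uintptr_t *p = beg;

	assert(beg <= end);
	*st = (struct cf_stats){ 0 };

	while (p < end) {
		p++;			/* skip the basic block address */
		st->blocks++;

		/* Successor list, terminated by 0. */
		while (p < end && *p != 0) {
			st->edges++;
			p++;
		}
		if (p == end)
			return -1;	/* missing successor terminator */
		p++;

		/* Callee list, terminated by 0. */
		while (p < end && *p != 0) {
			if (*p == (uintptr_t)-1)
				st->indirect_calls++;
			else
				st->direct_calls++;
			p++;
		}
		if (p == end)
			return -1;	/* missing callee terminator */
		p++;
	}

	return 0;
}
```

Per the docs, the table bounds would be handed to a user-defined
__sanitizer_cov_cfs_init(cfs_beg, cfs_end) callback, so those two pointers
could be passed straight to a walker like this one.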

>
> [...]
>
> > FWIW, I have also compared the CFI I am generating with DWARF
> > information that the compiler generates. The CFIs match
> > 100% for Clang. In the case of gcc, the comparison fails
> > in 1.7% of the cases. I have analyzed those cases and found
> > the DWARF information generated by gcc is incorrect. The
> > ORC generated by my Objtool is correct.
>
>
> Have you reported this to the GCC folk, and can you give any examples?
> I'm sure they would be interested in fixing this, regardless of whether we end
> up using it.

Yeah, at least a bug report is good. "See something, say something."