Re: [RFC PATCH v4 0/2] arm64: Stack trace reliability checks in the unwinder

From: Madhavan T. Venkataraman
Date: Fri May 21 2021 - 13:32:56 EST




On 5/21/21 12:18 PM, Mark Brown wrote:
> On Sat, May 15, 2021 at 11:00:16PM -0500, madvenka@xxxxxxxxxxxxxxxxxxx wrote:
>
>> Special cases
>> =============
>>
>> Some special cases need to be mentioned:
>
> I think it'd be good if more of this cover letter, especially sections
> like this which cover the tricky bits, ended up in the code somehow -
> it's recorded here and will be in the list archive but that's not the
> most discoverable place so increases the maintenance burden. It'd be
> great to be able to compare the code directly with the reliable
> stacktrace requirements document and see everything getting ticked off,
> actually going all the way there might be too much and lose the code in
> the comments but I think we can get closer to it than we are. Given
> that a lot of this stuff rests on the denylist perhaps some comments
> just before it's called would be a good place to start?
>

I will add more comments in the code to make it clear.

>> - EL1 interrupt and exception handlers end up in sym_code_ranges[].
>> So, all EL1 interrupt and exception stack traces will be considered
>> unreliable. This is the correct behavior as interrupts and exceptions
>
> This stuff about exceptions and preemption is a big one, rejecting any
> exceptions makes a whole host of things easier (eg, Mark Rutland raised
> interactions between non-AAPCS code and PLTs as being an issue but if
> we're able to reliably reject stacks featuring any kind of preemption
> anyway that should sidestep the issue).
>

Yes. I will include this in the code comments.
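
To make the rule concrete, here is a rough userspace sketch of "any frame
inside a listed range makes the whole trace unreliable". It is not the
patch code: struct code_range, pc_in_ranges() and trace_is_reliable() are
names invented for this illustration, and the addresses in any caller are
arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of sym_code_ranges[]. */
struct code_range {
	unsigned long start;	/* first address covered */
	unsigned long end;	/* one past the last address covered */
};

/* Return true if pc falls inside any of the n listed ranges. */
static bool pc_in_ranges(const struct code_range *r, size_t n,
			 unsigned long pc)
{
	for (size_t i = 0; i < n; i++) {
		if (pc >= r[i].start && pc < r[i].end)
			return true;
	}
	return false;
}

/*
 * A trace is reported as reliable only if no frame's pc lies in a
 * listed range. A single hit (for example, a frame in an exception
 * handler) is enough to reject the whole trace.
 */
static bool trace_is_reliable(const unsigned long *pcs, size_t nr_pcs,
			      const struct code_range *r, size_t n)
{
	for (size_t i = 0; i < nr_pcs; i++) {
		if (pc_in_ranges(r, n, pcs[i]))
			return false;
	}
	return true;
}
```

The point of the sketch is only that the check is all-or-nothing per
trace, which is what lets preempted or interrupted stacks be rejected
without reasoning about what the interrupted code was doing.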

>> Performance
>> ===========
>
>> Currently, unwinder_blacklisted() does a linear search through
>> sym_code_functions[]. If reviewers prefer, I could sort the
>> sym_code_functions[] array and perform a binary search for better
>> performance. There are about 80 entries in the array.
>
> If people are trying to live patch a very busy/big system then this
> could be an issue, equally there's probably more people focused on
> getting boot times as fast as possible than live patching. Deferring
> the initialisation to first use would help boot times with or without
> sorting, without numbers I don't actually know that sorting is worth the
> effort or needs doing immediately - obvious correctness is also a
> benefit! My instinct is that for now it's probably OK leaving it as a
> linear scan and then revisiting if it's not adequately performant, but
> I'd defer to actual users there.
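
For reference, the two lookup strategies being compared can be sketched
in plain userspace C as below. This is only an illustration under
assumptions: struct code_range and all function names are made up here
and are not the names used in the patch, and the sorted variant assumes
that the ranges never overlap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry of sym_code_functions[]. */
struct code_range {
	unsigned long start;	/* first address of the function */
	unsigned long end;	/* one past the last address */
};

/* Linear scan: no ordering requirement, O(n) per lookup. */
static bool pc_in_ranges_linear(const struct code_range *r, size_t n,
				unsigned long pc)
{
	for (size_t i = 0; i < n; i++) {
		if (pc >= r[i].start && pc < r[i].end)
			return true;
	}
	return false;
}

/* qsort() comparator: order ranges by start address. */
static int cmp_range(const void *a, const void *b)
{
	const struct code_range *x = a, *y = b;

	if (x->start < y->start)
		return -1;
	return x->start > y->start;
}

/* bsearch() comparator: the key is a pc, the element is a range. */
static int cmp_pc(const void *key, const void *elem)
{
	unsigned long pc = *(const unsigned long *)key;
	const struct code_range *r = elem;

	if (pc < r->start)
		return -1;
	return pc >= r->end;
}

/* Sort once, then check the non-overlap assumption the search relies on. */
static void sort_ranges(struct code_range *r, size_t n)
{
	qsort(r, n, sizeof(*r), cmp_range);
	for (size_t i = 1; i < n; i++)
		assert(r[i - 1].end <= r[i].start);
}

/* Binary search: requires sort_ranges() first, O(log n) per lookup. */
static bool pc_in_ranges_sorted(const struct code_range *r, size_t n,
				unsigned long pc)
{
	return bsearch(&pc, r, n, sizeof(*r), cmp_pc) != NULL;
}
```

With about 80 entries, the linear version does at most 80 range checks
per lookup and the sorted version about 7, at the cost of a one-time
sort and the extra ordering invariant to keep correct.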

I have followed the example of the kprobe deny list. I place the section
in initdata so that it can be freed once boot completes. This means that I
need to copy the information out before then, in an early_initcall().

If the initialization must be performed on first use, I probably have to
move SYM_CODE_FUNCTIONS from initdata to some other place where it will
be retained.

If you prefer this, I could do it this way.
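
To illustrate what first-use initialization would look like, here is a
simplified userspace sketch. Everything in it is an assumption made for
the example: it supposes the section records only function start
addresses, that sizes are resolved at init time by a hypothetical
lookup_size() helper, and that callers are single-threaded. Real kernel
code would need proper synchronization around the first-use check.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_RANGES 128

struct code_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Hypothetical stand-in for the section contents. For first-use
 * initialization this data must stay resident; it cannot live in a
 * section that is discarded after boot.
 */
static const unsigned long section_starts[] = { 0x1000, 0x2000 };

static struct code_range ranges[MAX_RANGES];
static size_t nr_ranges;
static bool ranges_ready;
static unsigned int init_count;	/* demonstration only: counts init runs */

/* Hypothetical size lookup; a fixed size keeps the sketch simple. */
static unsigned long lookup_size(unsigned long start)
{
	(void)start;
	return 0x20;
}

/* Build the range table from the retained section data. */
static void ranges_init(void)
{
	size_t n = sizeof(section_starts) / sizeof(section_starts[0]);

	if (n > MAX_RANGES)
		n = MAX_RANGES;
	for (size_t i = 0; i < n; i++) {
		ranges[i].start = section_starts[i];
		ranges[i].end = section_starts[i] + lookup_size(section_starts[i]);
	}
	nr_ranges = n;
	init_count++;
}

/* Lookup that initializes the table on first use (not thread-safe). */
static bool pc_is_sym_code(unsigned long pc)
{
	if (!ranges_ready) {
		ranges_init();
		ranges_ready = true;
	}
	for (size_t i = 0; i < nr_ranges; i++) {
		if (pc >= ranges[i].start && pc < ranges[i].end)
			return true;
	}
	return false;
}
```

The cost moves from boot to the first lookup, and the source data has to
remain in memory for as long as that first lookup might still happen.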

Thanks!

Madhavan