Re: Getting empty callchain from perf_callchain_kernel()

From: Josh Poimboeuf
Date: Thu May 23 2019 - 11:27:19 EST


On Thu, May 23, 2019 at 10:50:24PM +0800, Kairui Song wrote:
> > > Hi Josh, this still won't fix the problem.
> > >
> > > Problem is not (or not only) with ___bpf_prog_run, what actually went
> > > wrong is with the JITed bpf code.
> >
> > There seem to be a bunch of issues. My patch at least fixes the failing
> > selftest reported by Alexei for ORC.
> >
> > How can I recreate your issue?
>
> Hmm, I used bcc's example to attach bpf to trace point, and with that
> fix stack trace is still invalid.
>
> CMD I used with bcc:
> python3 ./tools/stackcount.py t:sched:sched_fork

I've had problems in the past getting bcc to build, so I was hoping it
was reproducible with a standalone selftest.

> And I just had another try applying your patch, self test is also failing.

Is it the same selftest reported by Alexei?

test_stacktrace_map:FAIL:compare_map_keys stackid_hmap vs. stackmap err -1 errno 2

> I'm applying on my local master branch, a few days older than
> upstream, I can update and try again, am I missing anything?

The above patch had some issues, so with some configs you might see an
objtool warning for ___bpf_prog_run(), in which case the patch doesn't
fix the test_stacktrace_map selftest.

Here's the latest version which should fix it in all cases (based on
tip/master):

https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/commit/?h=bpf-orc-fix

> > > For frame pointer unwinder, it seems the JITed bpf code will have a
> > > shifted "BP" register? (arch/x86/net/bpf_jit_comp.c:217), so if we can
> > > unshift it properly then it will work.
> >
> > Yeah, that looks like a frame pointer bug in emit_prologue().
> >
> > > I tried below code, and problem is fixed (only for frame pointer
> > > unwinder though). Need to find a better way to detect and do any
> > > similar trick for bpf part, if this is a feasible way to fix it:
> > >
> > > diff --git a/arch/x86/kernel/unwind_frame.c b/arch/x86/kernel/unwind_frame.c
> > > index 9b9fd4826e7a..2c0fa2aaa7e4 100644
> > > --- a/arch/x86/kernel/unwind_frame.c
> > > +++ b/arch/x86/kernel/unwind_frame.c
> > > @@ -330,8 +330,17 @@ bool unwind_next_frame(struct unwind_state *state)
> > > }
> > >
> > > /* Move to the next frame if it's safe: */
> > > - if (!update_stack_state(state, next_bp))
> > > - goto bad_address;
> > > + if (!update_stack_state(state, next_bp)) {
> > > + // Try again with shifted BP
> > > + state->bp += 5; // see AUX_STACK_SPACE
> > > + next_bp = (unsigned long
> > > *)READ_ONCE_TASK_STACK(state->task, *state->bp);
> > > + // Clean and refetch stack info, it's marked as error outed
> > > + state->stack_mask = 0;
> > > + get_stack_info(next_bp, state->task,
> > > &state->stack_info, &state->stack_mask);
> > > + if (!update_stack_state(state, next_bp)) {
> > > + goto bad_address;
> > > + }
> > > + }
> > >
> > > return true;
> >
> > Nack.
> >
> > > For ORC unwinder, I think the unwinder can't find any info about the
> > > JITed part. Maybe if can let it just skip the JITed part and go to
> > > kernel context, then should be good enough.
> >
> > If it's starting from a fake pt_regs then that's going to be a
> > challenge.
> >
> > Will the JIT code always have the same stack layout? If so then we
> > could hard code that knowledge in ORC. Or even better, create a generic
> > interface for ORC to query the creator of the generated code about the
> > stack layout.
>
> I think yes.
>
> Not sure why we have the BP shift yet, if the prolog code could be
> tweaked to work with frame pointer unwinder it will be good to have.
> But still not for ORC.
>
> Will it be a good idea to have a region reserved for the JITed code?
> Currently it shares the region with "module mapping space". If let it
> have a separate region, when the unwinder meet any code in that region
> it will know it's JITed code and then can do something special about
> it.
>
> This should make it much easier for both frame pointer and ORC unwinder to work.

There's no need to put special cases in the FP unwinder when we can
instead just fix the frame pointer usage in the JIT code.

For ORC, I'm thinking we may be able to just require that all generated
code (BPF and others) always use frame pointers. Then when ORC doesn't
recognize a code address, it could try using the frame pointer as a
fallback.

Once I get a reproducer I can do the patches for all that.

--
Josh