Re: [patch 15/14] x86/dumpstack/64: Speedup in_exception_stack()

From: Thomas Gleixner
Date: Tue Apr 02 2019 - 11:49:03 EST


On Tue, 2 Apr 2019, Josh Poimboeuf wrote:
> On Tue, Apr 02, 2019 at 12:19:46PM +0200, Thomas Gleixner wrote:
> > +/*
> > + * Array of exception stack page descriptors. If the stack is larger than
> > + * PAGE_SIZE, all pages covering a particular stack will have the same
> > + * info.
> > + */
> > +static const struct estack_pages estack_pages[ESTACK_PAGES] ____cacheline_aligned = {
> > + [CONDRANGE(DF)] = ESTACK_PAGE(DOUBLEFAULT_IST, DF),
> > + [CONDRANGE(NMI)] = ESTACK_PAGE(NMI_IST, NMI),
> > + [PAGERANGE(DB)] = ESTACK_PAGE(DEBUG_IST, DB),
> > + [CONDRANGE(MCE)] = ESTACK_PAGE(MCE_IST, MCE),
>
> It would be nice if the *_IST macro naming aligned with the struct
> cea_exception_stacks field naming. Then you could just do, e.g.
> ESTACKPAGE(DF).

Yes, lemme fix that up.

> Also it's a bit unfortunate that some of the stack size knowledge is
> hard-coded here, i.e. #DB always being > 1 page and non-#DB being
> sometimes 1 page.

The problem is that there is no way to make this macro maze conditional on
sizeof(). But my macro foo is rusty.
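
One possible way around that, as a minimal sketch only: a GNU C range
designator [first ... last] also accepts first == last, so a single macro
could derive the page range from offsetof() and sizeof() for both the
one-page and the multi-page case. Everything named demo_* or DEMO_* below
is made up for illustration (including the layout of struct demo_stacks);
it is not the real cea_exception_stacks and not what the patch does.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE	4096UL
#define DEMO_PFN_DOWN(x)	((x) / DEMO_PAGE_SIZE)

/* Hypothetical layout: a guard page in front of each stack */
struct demo_stacks {
	char df_guard[DEMO_PAGE_SIZE];
	char df_stack[DEMO_PAGE_SIZE];		/* one page   */
	char db_guard[DEMO_PAGE_SIZE];
	char db_stack[2 * DEMO_PAGE_SIZE];	/* two pages  */
};

/* Per-page descriptor; size == 0 marks a guard page */
struct demo_page {
	unsigned int offs;
	unsigned int size;
	unsigned int type;
};

#define DEMO_OFFS(st)	offsetof(struct demo_stacks, st##_stack)
#define DEMO_SIZE(st)	sizeof(((struct demo_stacks *)0)->st##_stack)
#define DEMO_PAGES	(sizeof(struct demo_stacks) / DEMO_PAGE_SIZE)

/*
 * Range from the first to the last page of a stack. For a one-page
 * stack both bounds are equal, which the GNU extension accepts.
 */
#define DEMO_RANGE(st, t)						\
	[DEMO_PFN_DOWN(DEMO_OFFS(st)) ...				\
	 DEMO_PFN_DOWN(DEMO_OFFS(st) + DEMO_SIZE(st) - 1)] = {		\
		.offs = DEMO_OFFS(st),					\
		.size = DEMO_SIZE(st),					\
		.type = (t), }

/* Entries not named by a designator stay zero, i.e. guard pages */
static const struct demo_page demo_pages[DEMO_PAGES] = {
	DEMO_RANGE(df, 1),
	DEMO_RANGE(db, 2),
};
```

With that, the stack size knowledge would live only in the struct
definition. Whether it reads any better than CONDRANGE/PAGERANGE is a
separate question.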

> > + begin = (unsigned long)__this_cpu_read(cea_exception_stacks);
> > + end = begin + sizeof(struct cea_exception_stacks);
> > + /* Bail if @stack is outside the exception stack area. */
> > + if (stk <= begin || stk >= end)
> > + return false;
>
> This check is the most important piece. Exception stack dumps are quite
> rare, so this ensures an early exit in most cases regardless of whether
> there's a loop below.
>
> > +
> > + /* Calc page offset from start of exception stacks */
> > + k = (stk - begin) >> PAGE_SHIFT;
> > + /* Lookup the page descriptor */
> > + ep = &estack_pages[k];
> > + /* Guard page? */
> > + if (unlikely(!ep->size))
> > + return false;
> > +
> > + begin += (unsigned long)ep->offs;
> > + end = begin + (unsigned long)ep->size;
> > + regs = (struct pt_regs *)end - 1;
> > +
> > + info->type = ep->type;
> > + info->begin = (unsigned long *)begin;
> > + info->end = (unsigned long *)end;
> > + info->next_sp = (unsigned long *)regs->sp;
> > + return true;
>
> With the above "(stk <= begin || stk >= end)" check, removing the loop
> becomes not all that important since exception stack dumps are quite
> rare and not performance sensitive. With all the macros this code
> becomes a little more obtuse, so I'm not sure whether removal of the
> loop is a net positive.

What about perf? It's NMI context and probably starts from there. Peter?
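
For reference, the lookup boils down to the following. This is a
userspace sketch under assumptions, not the patch itself: the demo_*
names, the five-page layout and the type values are invented, and the
pt_regs / next_sp handling is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PAGE_SHIFT	12
#define DEMO_PAGE_SIZE	(1UL << DEMO_PAGE_SHIFT)
#define DEMO_NR_PAGES	5

/* Per-page descriptor; size == 0 marks a guard page */
struct demo_page {
	unsigned int offs;
	unsigned int size;
	unsigned int type;
};

struct demo_info {
	unsigned long begin;
	unsigned long end;
	unsigned int type;
};

/* Hypothetical layout: guard, 1-page stack, guard, 2-page stack */
static const struct demo_page demo_pages[DEMO_NR_PAGES] = {
	[1] = { .offs = 1 * DEMO_PAGE_SIZE, .size = 1 * DEMO_PAGE_SIZE, .type = 1 },
	[3] = { .offs = 3 * DEMO_PAGE_SIZE, .size = 2 * DEMO_PAGE_SIZE, .type = 2 },
	[4] = { .offs = 3 * DEMO_PAGE_SIZE, .size = 2 * DEMO_PAGE_SIZE, .type = 2 },
};

/* @base: start of the stack area, @stk: address to classify */
static bool demo_lookup(unsigned long base, unsigned long stk,
			struct demo_info *info)
{
	unsigned long end = base + DEMO_NR_PAGES * DEMO_PAGE_SIZE;
	const struct demo_page *ep;

	/* Early exit: the common case is an address outside the area */
	if (stk < base || stk >= end)
		return false;

	/* Page index relative to the area start selects the descriptor */
	ep = &demo_pages[(stk - base) >> DEMO_PAGE_SHIFT];

	/* Guard page? */
	if (!ep->size)
		return false;

	info->begin = base + ep->offs;
	info->end = info->begin + ep->size;
	info->type = ep->type;
	return true;
}
```

The cost is one range check plus one table read no matter how many
stacks exist, whereas the loop compares against each stack in turn.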

Thanks,

tglx