Re: [PATCH] proc: faster /proc/*/status

From: Alexey Dobriyan
Date: Sun Aug 07 2016 - 04:53:32 EST


On Sat, Aug 06, 2016 at 08:16:27PM -0700, Andi Kleen wrote:
> Alexey Dobriyan <adobriyan@xxxxxxxxx> writes:
> > -
> > + seq_printf(m, "State:\t%s", get_task_state(p));
> > +
> > + seq_puts(m, "\nTgid:\t");
>
> The only different should be the format string.
>
> Scanning the format string really shouldn't be that expensive?!?

Surprise, it is (see my reply to Al).

What seq_put_decimal_ull() did is the equivalent of

seq << "foo";
seq << bar;
seq << '\n';

No precision, no widths, no padding, no upper- or lowercasing.
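
To make the comparison concrete, here is a rough userspace sketch of what
output without a format string amounts to. This is not the kernel code:
struct buf, buf_puts(), buf_put_decimal_ull() and demo() are made-up names
for illustration, and the fixed 128-byte buffer is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a seq_file buffer. */
struct buf {
	char data[128];
	size_t len;
};

/* Append a NUL-terminated string as is; nothing is parsed. */
static void buf_puts(struct buf *b, const char *s)
{
	size_t room = sizeof(b->data) - 1 - b->len;
	size_t n = strlen(s);

	if (n > room)
		n = room;	/* silently truncate on overflow */
	memcpy(b->data + b->len, s, n);
	b->len += n;
	b->data[b->len] = '\0';
}

/* Append a number in plain decimal: no width, precision or padding. */
static void buf_put_decimal_ull(struct buf *b, unsigned long long v)
{
	char tmp[21];	/* 20 digits cover 2^64 - 1, plus the NUL */
	char *p = tmp + sizeof(tmp) - 1;

	*p = '\0';
	do {
		/* Digits are produced least significant first. */
		*--p = (char)('0' + v % 10);
		v /= 10;
	} while (v != 0);
	buf_puts(b, p);
}

/* Usage: build one status-like line from three appends. */
void demo(struct buf *b)
{
	b->len = 0;
	b->data[0] = '\0';
	buf_puts(b, "Tgid:\t");
	buf_put_decimal_ull(b, 1234);
	buf_puts(b, "\n");
	assert(strcmp(b->data, "Tgid:\t1234\n") == 0);
}
```

Every call does a fixed amount of work per character emitted; there is no
format string to walk and no printf_spec to fill in.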

> It would be better if you could find out why that is slow and optimize
> it. Then you would benefit every seq_printf user, not just this
> special case.
>
> Perhaps it could benefit from some of the bit masking tricks to
> scan the string with wider tests than a word.

And then what? Parsing the format string will still be there.

This is the first line of the profile of the first function (format_decode):

       │ static noinline_for_stack
       │ int format_decode(const char *fmt, struct printf_spec *spec)
       │ {
 10.38 │ push %rbp <===
  1.07 │ mov %rsp,%rbp
  1.09 │ push %r12
  4.51 │ mov %rsi,%r12
  1.40 │ push %rbx
  1.86 │ mov %rdi,%rbx
       │ sub $0x8,%rsp

It is so bloated that gcc has to be asked not to screw up the stack
size (hence noinline_for_stack).