Re: [RFC mm][PATCH 2/5] percpu cached mm counter

From: Ingo Molnar
Date: Thu Dec 10 2009 - 02:55:17 EST



* KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:

> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
>
> Now, mm's counter information is updated by atomic_long_xxx()
> functions if USE_SPLIT_PTLOCKS is defined. This causes cache-miss when
> page faults happens simultaneously in prural cpus. (Almost all
> process-shared objects is...)
>
> Considering accounting per-mm page usage more, one of problems is cost
> of this counter.

I'd really like these kinds of stats available via the tool you used to
develop this patchset:

> After:
> Performance counter stats for './multi-fault 2' (5 runs):
>
> 46997471 page-faults ( +- 0.720% )
> 1004100076 cache-references ( +- 0.734% )
> 180959964 cache-misses ( +- 0.374% )
> 29263437363580464 bus-cycles ( +- 0.002% )
>
> 60.003315683 seconds time elapsed ( +- 0.004% )
>
> cachemiss/page faults is reduced from 4.55 miss/faults to be 3.85miss/faults

I.e. why not expose these stats via perf events and counts as well,
beyond the current (rather minimal) set of MM stats perf supports
currently?

That way we'd get a _lot_ of interesting per task mm stats available via
perf stat (and maybe they can be profiled as well via perf record), and
we could perhaps avoid uglies like having to hack hooks into sched.c:

> + /*
> + * sync/invaldidate per-cpu cached mm related information
> + * before taling rq->lock. (see include/linux/mm.h)

(minor typo: s/taling/taking )

> + */
> + sync_mm_counters_atomic();
>
> spin_lock_irq(&rq->lock);
> update_rq_clock(rq);

It's not a simple task i guess since this per mm counting business has
grown its own variant which takes time to rearchitect, plus i'm sure
there's performance issues to solve if such a model is exposed via perf,
but users and developers would be _very_ well served by such
capabilities:

- clean, syscall based API available to monitor tasks, workloads and
CPUs. (or the whole system)

- sampling (profiling)

- tracing, post-process scripting via Perl plugins

etc.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/