Re: [PATCH v3 05/46] perf/x86/intel/cmt: add per-package locks

From: Thomas Gleixner
Date: Thu Nov 10 2016 - 16:26:29 EST


On Sat, 29 Oct 2016, David Carrillo-Cisneros wrote:
> Lockdep needs lock_class_key's to be statically initialized and/or use
> nesting, but nesting is currently hard-coded for up to 8 levels and it's
> fragile to depend on lockdep internals.
> To circumvent this problem, statically define CMT_MAX_NR_PKGS number of
> lock_class_key's.

That's a proper circumvention. Define a random number on your own.

> Additional details in code's comments.

This is

- pointless. Either there are comments or not. I can see that from
the patch.

- wrong. Because for this crucial detail of it (the lockdep
restriction) there is no single comment in the code.

> Signed-off-by: David Carrillo-Cisneros <davidcc@xxxxxxxxxx>
> ---
> arch/x86/events/intel/cmt.c | 22 ++++++++++++++++++++++
> arch/x86/events/intel/cmt.h | 8 ++++++++
> 2 files changed, 30 insertions(+)
>
> diff --git a/arch/x86/events/intel/cmt.c b/arch/x86/events/intel/cmt.c
> index 267a9ec..f12a06b 100644
> --- a/arch/x86/events/intel/cmt.c
> +++ b/arch/x86/events/intel/cmt.c
> @@ -7,6 +7,14 @@
> #include "cmt.h"
> #include "../perf_event.h"
>
> +/* Increase as needed as Intel CPUs grow. */

As CPUs grow? CMT is a per L3 cache domain, i.e. package, which is
equivivalent to socket today..

> +#define CMT_MAX_NR_PKGS 8

This already breaks on these shiny new 16 socket machines companies are
working on. Not to talk about the 4096 core monstrosities which are shipped
today, which have definitely more than 16 sockets already.

And now the obvious question: How are you going to fix this?

Just by increasing the number? See below.

> +#ifdef CONFIG_LOCKDEP
> +static struct lock_class_key mutex_keys[CMT_MAX_NR_PKGS];
> +static struct lock_class_key lock_keys[CMT_MAX_NR_PKGS];
> +#endif
> +
> static DEFINE_MUTEX(cmt_mutex);

> diff --git a/arch/x86/events/intel/cmt.h b/arch/x86/events/intel/cmt.h
> index 8c16797..55416db 100644
> --- a/arch/x86/events/intel/cmt.h
> +++ b/arch/x86/events/intel/cmt.h
> @@ -11,11 +11,16 @@
> * Rules:
> * - cmt_mutex: Hold for CMT init/terminate, event init/terminate,
> * cgroup start/stop.
> + * - Hold pkg->mutex and pkg->lock in _all_ active packages to traverse or
> + * change the monr hierarchy.

So if you hold both locks in 8 packages then you already have 16 locks held
nested. Now double that with 16 packages and you are slowly approaching the
maximum lock chain depth lockdep can cope with. Now think SGI and guess
how that works.

I certainly can understand why you want to split the locks, but my gut
feeling tells me that while it's probably not a big issue on a 2/4 socket
machines it will go down the drain rather fast on anything real big.

Thanks,

tglx