Re: [PATCH v3 03/22] perf dso: Make lock error check and add BUG_ONs

From: Arnaldo Carvalho de Melo
Date: Fri Feb 11 2022 - 14:21:18 EST


Em Fri, Feb 11, 2022 at 09:43:19AM -0800, Ian Rogers escreveu:
> On Fri, Feb 11, 2022 at 9:13 AM Arnaldo Carvalho de Melo
> <acme@xxxxxxxxxx> wrote:
> >
> > Em Fri, Feb 11, 2022 at 02:33:56AM -0800, Ian Rogers escreveu:
> > > Make the pthread mutex on dso use the error check type. This allows
> > > deadlock checking via the return type. Assert the returned value from
> > > mutex lock is always 0.
> >
> > I think this is too blunt/pervasive source code wise, perhaps we should
> > wrap this like its done with rwsem in tools/perf/util/rwsem.h to get
> > away from pthreads primitives and make the source code look more like
> > a kernel one and then, taking advantage of the so far ideologic
> > needless indirection, add this BUG_ON if we build with "DEBUG=1" or
> > something, wdyt?
>

> My concern with semaphores is that they are a concurrency primitive

I'm not suggesting we switch over to semaphores, just to use the same
technique of wrapping pthread_mutex_t with some other API that then
allows us to add these BUG_ON() calls without polluting the source code
in many places.

- Arnaldo

> that has more flexibility and power than a mutex. I like a mutex as it
> is quite obvious what is going on and that is good from a tooling
> point of view. A deadlock with two mutexes is easy to understand. On a
> semaphore, were we using it like a condition variable? There's more to
> figure out. I also like the idea of compiling the perf command with
> emscripten, we could then generate say perf annotate output in your
> web browser. Emscripten has implementations of standard posix
> libraries including pthreads, but we may need to have two approaches
> in the perf code if we want to compile with emscripten and use
> semaphores when targeting linux.
>
> Where this change comes from is that I worried that extending the
> locked regions to cover the race that'd been found would then expose
> the kind of recursive deadlock that pthread mutexes all too willingly
> allow. With this code we at least see the bug and don't just hang. I
> don't think we need the change to the mutexes for this change, but we
> do need to extend the regions to fix the data race.
>
> Let me know how you prefer it and I can roll it into a v4 version.
>
> Thanks,
> Ian
>
> > - Arnaldo
> >
> > > Signed-off-by: Ian Rogers <irogers@xxxxxxxxxx>
> > > ---
> > > tools/perf/util/dso.c | 12 +++++++++---
> > > tools/perf/util/symbol.c | 2 +-
> > > 2 files changed, 10 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
> > > index 9cc8a1772b4b..6beccffeef7b 100644
> > > --- a/tools/perf/util/dso.c
> > > +++ b/tools/perf/util/dso.c
> > > @@ -784,7 +784,7 @@ dso_cache__free(struct dso *dso)
> > > struct rb_root *root = &dso->data.cache;
> > > struct rb_node *next = rb_first(root);
> > >
> > > - pthread_mutex_lock(&dso->lock);
> > > + BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
> > > while (next) {
> > > struct dso_cache *cache;
> > >
> > > @@ -830,7 +830,7 @@ dso_cache__insert(struct dso *dso, struct dso_cache *new)
> > > struct dso_cache *cache;
> > > u64 offset = new->offset;
> > >
> > > - pthread_mutex_lock(&dso->lock);
> > > + BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
> > > while (*p != NULL) {
> > > u64 end;
> > >
> > > @@ -1259,6 +1259,8 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
> > > struct dso *dso = calloc(1, sizeof(*dso) + strlen(name) + 1);
> > >
> > > if (dso != NULL) {
> > > + pthread_mutexattr_t lock_attr;
> > > +
> > > strcpy(dso->name, name);
> > > if (id)
> > > dso->id = *id;
> > > @@ -1286,8 +1288,12 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
> > > dso->root = NULL;
> > > INIT_LIST_HEAD(&dso->node);
> > > INIT_LIST_HEAD(&dso->data.open_entry);
> > > - pthread_mutex_init(&dso->lock, NULL);
> > > + pthread_mutexattr_init(&lock_attr);
> > > + pthread_mutexattr_settype(&lock_attr, PTHREAD_MUTEX_ERRORCHECK);
> > > + pthread_mutex_init(&dso->lock, &lock_attr);
> > > + pthread_mutexattr_destroy(&lock_attr);
> > > refcount_set(&dso->refcnt, 1);
> > > +
> > > }
> > >
> > > return dso;
> > > diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> > > index b2ed3140a1fa..43f47532696f 100644
> > > --- a/tools/perf/util/symbol.c
> > > +++ b/tools/perf/util/symbol.c
> > > @@ -1783,7 +1783,7 @@ int dso__load(struct dso *dso, struct map *map)
> > > }
> > >
> > > nsinfo__mountns_enter(dso->nsinfo, &nsc);
> > > - pthread_mutex_lock(&dso->lock);
> > > + BUG_ON(pthread_mutex_lock(&dso->lock) != 0);
> > >
> > > /* check again under the dso->lock */
> > > if (dso__loaded(dso)) {
> > > --
> > > 2.35.1.265.g69c8d7142f-goog
> >
> > --
> >
> > - Arnaldo

--

- Arnaldo