Re: combinatorial explosion in lockdep

From: David Miller
Date: Fri Aug 01 2008 - 05:32:23 EST


From: Ingo Molnar <mingo@xxxxxxx>
Date: Fri, 1 Aug 2008 11:22:19 +0200

>
> * Ingo Molnar <mingo@xxxxxxx> wrote:
>
> >
> > * David Miller <davem@xxxxxxxxxxxxx> wrote:
> >
> > > lockdep: Fix combinatorial explosion in lock subgraph traversal.
> >
> > applied to tip/core/locking - thanks David. I guess we need to test
> > this a bit, the patch is far from simple :-)
>
> small build fallout fix below.

Thanks.

BTW, until something like Peter's attempt is working, we need to also
scale some of the lockdep limits by NR_CPUS. The formula I came up
with that worked with my 32-cpu, 64-cpu and 128-cpu machines was:

#define __LOCKDEP_NR_CPU_SCALE \
((NR_CPUS <= 16) ? 0 : ilog2(NR_CPUS) - 4)

#define MAX_LOCKDEP_ENTRIES (8192UL << __LOCKDEP_NR_CPU_SCALE)

#define MAX_LOCKDEP_CHAINS_BITS (16 + __LOCKDEP_NR_CPU_SCALE)
#define MAX_LOCKDEP_CHAINS (1UL << MAX_LOCKDEP_CHAINS_BITS)

This is going to explode for NR_CPUS=4096, but it is the only way
to get a working lockdep currently, due to the runqueue lock classes.
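
To make the growth concrete, here is a rough userspace sketch of the
same arithmetic. It is not kernel code: ilog2_floor() is a stand-in
for the kernel's ilog2(), and the lockdep_scale() and max_lockdep_*()
helper names are made up for illustration only.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's ilog2(): floor(log2(n)), n > 0. */
static unsigned int ilog2_floor(unsigned long n)
{
	unsigned int log = 0;

	assert(n > 0);
	while (n >>= 1)
		log++;
	return log;
}

/* Mirrors __LOCKDEP_NR_CPU_SCALE: no scaling up to 16 cpus. */
static unsigned int lockdep_scale(unsigned long nr_cpus)
{
	return (nr_cpus <= 16) ? 0 : ilog2_floor(nr_cpus) - 4;
}

/* Mirrors MAX_LOCKDEP_ENTRIES. */
static unsigned long max_lockdep_entries(unsigned long nr_cpus)
{
	return 8192UL << lockdep_scale(nr_cpus);
}

/* Mirrors MAX_LOCKDEP_CHAINS_BITS. */
static unsigned int max_lockdep_chains_bits(unsigned long nr_cpus)
{
	return 16 + lockdep_scale(nr_cpus);
}

/* Mirrors MAX_LOCKDEP_CHAINS. */
static unsigned long max_lockdep_chains(unsigned long nr_cpus)
{
	return 1UL << max_lockdep_chains_bits(nr_cpus);
}

/*
 * NR_CPUS  scale  MAX_LOCKDEP_ENTRIES  MAX_LOCKDEP_CHAINS
 *      16      0                 8192               65536
 *      32      1                16384              131072
 *      64      2                32768              262144
 *     128      3                65536              524288
 *    4096      8              2097152            16777216
 */
```

Both tables double each time NR_CPUS doubles past 16, so NR_CPUS=4096
ends up with 256 times the default number of entries and chains.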

Also, when these limits are reached, we get the same printk
wakeup deadlock problem I hit with Peter's patch.

I think a non-trivial number of people hit that printk deadlock bug,
but just didn't report it because the machine essentially hard hangs
silently. At best you'd see the:

========================================

initial line from lockdep, but often even that doesn't make it to the
console.