Re: [PATCH 3/3] lockdep: prevent chain_key collisions

From: Ingo Molnar
Date: Wed Feb 17 2016 - 03:39:00 EST



* Alfredo Alvarez Fernandez <alfredoalvarezfernandez@xxxxxxxxx> wrote:

> From: Alfredo Alvarez Fernandez <alfredoalvarezfernandez@xxxxxxxxx>
>
> The chain_key hashing macro iterate_chain_key(key1, key2) does not
> generate a new different value if both key1 and key2 are 0. In that
> case the generated value is again 0. This can lead to collisions which
> can result in lockdep not detecting deadlocks or circular
> dependencies.
>
> Avoid the problem by using class_idx (1-based) instead of class id
> (0-based) as an input for the hashing macro 'key2' in
> iterate_chain_key(key1, key2).
>
> The use of class id created collisions in cases like the following:

Nice find!

> 1.- Consider an initial state in which no class has been acquired yet.
> Under these circumstances an AA deadlock will not be detected by
> lockdep:
>
> lock [key1,key2]->new key (key1=old chain_key, key2=id)
> --------------------------
> A [0,0]->0
> A [0,0]->0 (collision)
>
> The newly generated chain_key collides with the one used before and as
> a result the check for a deadlock is skipped
>
> A simple test using liblockdep and a pthread mutex confirms the
> problem: (omitting stack traces)
>
> new class 0xe15038: 0x7ffc64950f20
> acquire class [0xe15038] 0x7ffc64950f20
> acquire class [0xe15038] 0x7ffc64950f20
> hash chain already cached, key: 0000000000000000 tail class:
> [0xe15038] 0x7ffc64950f20
>
> 2.- Consider an ABBA in 2 different tasks and no class yet acquired.
>
> T1 [key1,key2]->new key T2[key1,key2]->new key
> -- --
> A [0,0]->0
>
> B [0,1]->1
>
> B [0,1]->1 (collision)
>
> A
>
> In this case the collision prevents lockdep from creating the new
> dependency A->B. This in turn results in lockdep not detecting the
> circular dependency when T2 acquires A.

So this is pretty fatal result - and it was not found for years!

Could you please also do a follow up fix, to treat hash collisions as hard errors
and shut up lockdep an report the bug?

Thanks,

Ingo