Re: Livelock with the shmctl04 test program from linux test project

From: Manfred Spraul
Date: Thu Oct 28 2004 - 12:27:46 EST


Alexander Nyberg wrote:

Sorry for late reply, but I just can't understand why & how this happens, been trying to grasp
the IPC/SHM part but I'm missing something. One processor gets locked up and never released.


Ok - that's a deadlock.

I did:

printk("taking lock\n");
spin_lock(&info->lock);
printk("lock taken\n");

and it never prints out "lock taken", so I know where it locks up. Now the fun part,
spinlock debugging doesn't catch it,

That's not surprising: the full debug code is only active for uniprocessor kernels. On SMP, only a simple check for uninitialized spinlocks is performed.
Btw, I'd use
printk("thread %d, struct %p: taking lock\n", current->pid, info);
Then you are certain that you are not fooled by multiple concurrent operations.
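
To illustrate why the tagged message helps, here is a rough userspace sketch, not kernel code. The helper name format_lock_event and its parameters are made up for this example; in the kernel the thread ID would come from current->pid and the line would go through printk. Each line carries the thread and the object address, so a "taking lock" from one thread cannot be confused with a "lock taken" from another thread or for a different structure.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format one trace line that identifies both the
 * calling thread and the structure whose lock is being touched.
 * Returns the value of snprintf() (number of characters that would
 * have been written, excluding the terminating NUL).
 */
static int format_lock_event(char *buf, size_t len, long tid,
                             const void *obj, const char *event)
{
    return snprintf(buf, len, "thread %ld, struct %p: %s\n",
                    tid, (void *)obj, event);
}
```

Calling it once before and once after the lock acquisition, with the same tid and obj, gives pairs of lines that can be matched up in the log even when several operations run concurrently.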

but I did a simple patch to show who is holding a lock
at the current time, and it appears no one has taken the lock. I really don't get this.



I must think about it. Who's printed as the last owner that released the lock? Perhaps there is a race with segment destruction: The structures are protected by RCU.

Could you enable debug spinlocks and slab debugging? I would have expected an error message from spinlock debugging due to bad magic.
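
A minimal userspace sketch of the two checks discussed here, a magic value and last-owner bookkeeping. This is not the kernel's spinlock debugging code: the struct, the function names and both constants are assumptions chosen for illustration, and the "destroy" step only poisons the magic field as a stand-in for the object being freed and overwritten.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

#define DBG_LOCK_MAGIC  0x600DC0DEu  /* hypothetical "initialized" marker */
#define DBG_LOCK_POISON 0x6B6B6B6Bu  /* hypothetical "freed" fill pattern */

struct dbg_lock {
    unsigned int magic;
    atomic_flag  held;
    long owner;          /* current holder, -1 if none */
    long last_releaser;  /* last thread that unlocked, -1 if none */
};

static void dbg_lock_init(struct dbg_lock *l)
{
    l->magic = DBG_LOCK_MAGIC;
    atomic_flag_clear(&l->held);
    l->owner = -1;
    l->last_releaser = -1;
}

/* Returns 1 if the lock looks initialized and alive, 0 on bad magic. */
static int dbg_lock_valid(const struct dbg_lock *l)
{
    return l->magic == DBG_LOCK_MAGIC;
}

/* Single attempt: 1 on success, 0 if already held, -1 on bad magic. */
static int dbg_lock_try(struct dbg_lock *l, long tid)
{
    if (!dbg_lock_valid(l)) {
        fprintf(stderr, "thread %ld, lock %p: bad magic %#x\n",
                tid, (void *)l, l->magic);
        return -1;
    }
    if (atomic_flag_test_and_set(&l->held))
        return 0;
    l->owner = tid;
    return 1;
}

/* Record who released the lock before actually dropping it. */
static void dbg_lock_release(struct dbg_lock *l, long tid)
{
    l->last_releaser = tid;
    l->owner = -1;
    atomic_flag_clear(&l->held);
}

/* Stand-in for freeing the enclosing object: poison the magic field. */
static void dbg_lock_destroy(struct dbg_lock *l)
{
    l->magic = DBG_LOCK_POISON;
}
```

With this kind of bookkeeping, a thread that spins on a lock whose owner is "nobody" while the magic is bad would point at the structure having been destroyed underneath it, which is the kind of race suspected above.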

--
Manfred