Re: next-20090310: ext4 hangs

From: Jan Kara
Date: Wed Mar 25 2009 - 12:16:19 EST


On Wed 25-03-09 18:29:10, Alexander Beregalov wrote:
> 2009/3/25 Jan Kara <jack@xxxxxxx>:
> > On Wed 25-03-09 18:18:43, Alexander Beregalov wrote:
> >> 2009/3/25 Jan Kara <jack@xxxxxxx>:
> >> >> > So, I think I need to try it on 2.6.29-rc7 again.
> >> >>   I've looked into this. Obviously, what's happening is that we delete
> >> >> an inode and jbd2_journal_release_jbd_inode() finds the inode is just under
> >> >> writeout in transaction commit and thus it waits. But it never gets woken
> >> >> up and because it has a handle from the transaction, everyone eventually
> >> >> blocks waiting for the transaction to finish.
> >> >>   But I don't really see how that can happen. The code is really
> >> >> straightforward and everything happens under j_list_lock... Strange.
> >> >  BTW: Is the system SMP?
> >> No, it is a UP system.
> >  Even stranger. And do you have CONFIG_PREEMPT set?
> >
> >> The bug exists even in 2.6.29, I posted it with a new topic.
> >  OK, I've sort-of expected this.
>
> CONFIG_PREEMPT_RCU=y
> CONFIG_PREEMPT_RCU_TRACE=y
> # CONFIG_PREEMPT_NONE is not set
> # CONFIG_PREEMPT_VOLUNTARY is not set
> CONFIG_PREEMPT=y
> CONFIG_DEBUG_PREEMPT=y
> # CONFIG_PREEMPT_TRACER is not set
>
> config is attached.
Thanks for the data. I still don't see how the wakeup can get lost. The
process cannot even be preempted while we are in the section protected by
j_list_lock... Can you send me a disassembly of the functions
jbd2_journal_release_jbd_inode() and journal_submit_data_buffers() so that
I can see whether the compiler has reordered something unexpectedly?
Thanks.
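
To illustrate why a lost wakeup looks impossible here, the following is a
minimal userspace sketch of the wait/wake pattern discussed above. It is not
the jbd2 code: a pthread mutex and condition variable stand in for
j_list_lock and the kernel wait queue, and every name in it (list_lock,
commit_running, release_inode, finish_commit, run_demo) is hypothetical.
The point is only that the flag is tested and cleared under the same lock,
so the wakeup should not be able to slip in between the check and the sleep.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins: list_lock plays the role of j_list_lock,
 * commit_running the role of the "inode is under writeout" flag. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t commit_done = PTHREAD_COND_INITIALIZER;
static bool commit_running;
static bool inode_released;

/* Releasing side: wait until the commit is done with the inode.
 * pthread_cond_wait() drops the lock and sleeps atomically, so a
 * wakeup sent after the flag check cannot be missed. */
static void *release_inode(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&list_lock);
    while (commit_running)
        pthread_cond_wait(&commit_done, &list_lock);
    inode_released = true;
    pthread_mutex_unlock(&list_lock);
    return NULL;
}

/* Commit side: clear the flag and wake any waiter, under the same lock. */
static void *finish_commit(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&list_lock);
    commit_running = false;
    pthread_cond_broadcast(&commit_done);
    pthread_mutex_unlock(&list_lock);
    return NULL;
}

/* Runs both sides once; returns 1 if the waiter was woken and finished. */
int run_demo(void)
{
    pthread_t waiter, committer;
    int rc;

    commit_running = true;
    inode_released = false;

    if (pthread_create(&waiter, NULL, release_inode, NULL) != 0)
        return 0;
    if (pthread_create(&committer, NULL, finish_commit, NULL) != 0) {
        /* Could not start the commit thread: finish the commit here
         * so the waiter is not left sleeping forever. */
        finish_commit(NULL);
    } else {
        rc = pthread_join(committer, NULL);
        assert(rc == 0);
    }
    rc = pthread_join(waiter, NULL);
    assert(rc == 0);
    return inode_released ? 1 : 0;
}
```

Calling run_demo() from a small main() and building with -pthread should
return 1 regardless of which thread runs first. If the real code follows the
same discipline, a hang would point at something outside this pattern, which
is why the disassembly is of interest.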

Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR