Re: -mm: xfs lockdep warning

From: Tejun Heo
Date: Thu Oct 07 2010 - 11:41:56 EST


Hello,

On 10/07/2010 05:15 PM, Torsten Kaiser wrote:
> :-(
> The second try with the patched 2.6.36-rc6 got stuck again.

:-(

>> repeat the same test w/ 2.6.35 and see whether there's any noticeable
>> difference?
>
> I just tested vanilla 2.6.35, with the same userspace and testcase as
> the 2.6.36-rc6 kernel.
> Behavior was the same as with the first try of the patched 2.6.36-rc6:
> The testcase started to build up CPU load and memory pressure, then
> the mouse got stuck. After ~1 min the system recovered and KDE became
> usable again. (The compile finished OK.)
>
> vmstat 60 from 2.6.35:
>  2  0      0 2365848 1060  742072    0     0  205     7 6467 12252 26  5 62  8
> 17  0      0  837936 1060 1079520    0     0 1025    24 2210 13483 69 20  8  2
> 18  0      0  694296 1060 1127728    0     0  231    14  626  1883 74 26  0  0
> 20  0      0  228540  124  998440    0     0   48    12  604  1820 72 28  0  0
> 23  1     24  195456   52  657216    0     0  339    11  563  1669 84 16  0  0
>  0 42 937144   14004    0  164352    0 15617  290 15633  827  1254 24  9  1 66
> 21  1 861184  776104    0  149908  998  3852 2790  3902 1137  2454 45 15  0 40
> 21  0 798544 1131964    0  190004  212     0  530    15  599  1469 87 13  0  0
> 12  1 779488 1536772    0  225312  195     0  398    12  534  1087 88 12  0  0
> 11  0 620620 2201408    0  273976 1338     0 1638    40  851  2281 75 19  3  3
> 12  1 588196 1572564    0  344232  521     0 1326   152  787  1888 86 14  0  0
>  8  0 510876 2286000    0  311464  127     0  442   398  758  1558 88 12  0  0
>
>> Some level of stuttering is expected if the system is hit
>> with sudden huge spike of memory pressure but let's see if it has
>> regressed somehow.
>
> Yes, if a swapstorm occurs, I can live with a (short) "lockup". For
> example compiling openoffice on tmpfs is a case where I have seen
> similar short periods of the mouse getting stuck. But with 2.6.36-rc5
> it was the first time that the system did not seem to recover, so I
> think it should count as a regression compared to 2.6.35.
>
> As the second try with the patched 2.6.36-rc6 got stuck again, could
> it be that your patch is incomplete?
> From drivers/md/dm-crypt.c:
> cc->io_queue = create_singlethread_workqueue("kcryptd_io");
> cc->crypt_queue = create_singlethread_workqueue("kcryptd");
>
> Do these workqueues also need WQ_MEM_RECLAIM?

No, singlethread wq's are currently mapped to unbound wq's and
automatically HIGHPRI.
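
To illustrate the point above, here is a minimal userspace sketch of how
the flags end up being derived. It is a model, not kernel code: the
MODEL_* names, their bit values, and both helper functions are
hypothetical, and the idea that the legacy create_*() wrappers also
request a rescuer is an assumption, not something stated in this thread.

```c
/* Userspace model only; bit values are illustrative, not the kernel's. */
enum {
	MODEL_WQ_UNBOUND     = 1 << 0,
	MODEL_WQ_HIGHPRI     = 1 << 1,
	MODEL_WQ_MEM_RECLAIM = 1 << 2,	/* assumption: implies a rescuer */
};

/*
 * Hypothetical helper: flags that actually take effect for a workqueue.
 * Models "unbound wq's are automatically HIGHPRI".
 */
static unsigned int model_effective_flags(unsigned int flags)
{
	if (flags & MODEL_WQ_UNBOUND)
		flags |= MODEL_WQ_HIGHPRI;
	return flags;
}

/*
 * Hypothetical stand-in for create_singlethread_workqueue(): models
 * "singlethread wq's are mapped to unbound wq's". That the wrapper also
 * requests a rescuer is an assumption.
 */
static unsigned int model_singlethread_flags(void)
{
	return model_effective_flags(MODEL_WQ_UNBOUND | MODEL_WQ_MEM_RECLAIM);
}
```

Under this model, model_singlethread_flags() has the unbound, HIGHPRI
and reclaim bits all set, which is why kcryptd_io and kcryptd would need
no extra flag in dm-crypt.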

> I have attached the SysRq-M output from the patched -rc6. The symptoms
> were the same as with the unpatched kernel. The system got stuck, the
> hung_task_timeout triggered for a number of programs in sync_page() or
> do_lookup() and one kworker/1:1:496 was probably the nail in the
> coffin.
>
> Do you want me to do more tests with 2.6.35 or the patched .36-rc6? Or
> any other patch I should try?

kworker 496 being stuck should be okay. Workers are allowed to be stuck
there and that's exactly why we have the rescuers. Can you please capture
the output of sysrq-t w/ the patch applied, after the hung task timeout
has triggered? I'd really like to know what the rescuers are doing.

Thanks.

--
tejun