Re: INFO: task hung in __sb_start_write

From: Tetsuo Handa
Date: Thu Jun 14 2018 - 06:35:21 EST


On 2018/06/11 16:39, Dmitry Vyukov wrote:
> On Mon, Jun 11, 2018 at 9:30 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> On Sun, Jun 10, 2018 at 11:47:56PM +0900, Tetsuo Handa wrote:
>>
>>> This looks quite strange that nobody is holding percpu_rw_semaphore for
>>> write but everybody is stuck trying to hold it for read. (Since there
>>> is no "X locks held by ..." line without followup "#0:" line, there is
>>> no possibility that somebody is in TASK_RUNNING state while holding
>>> percpu_rw_semaphore for write.)
>>>
>>> I feel that either API has a bug or API usage is wrong.
>>> Any idea for debugging this?
>>
>> Look at percpu_rwsem_release() and usage. The whole fs freezer thing is
>> magic.
>
> Do you mean that we froze fs? We tried to never-ever issue
> ioctl(FIFREEZE) during fuzzing. Are there other ways to do this?
>

Dmitry, can you try this patch? If you can get

[ 48.080875] ================================================
[ 48.083648] WARNING: lock held when returning to user space!
[ 48.086384] 4.17.0+ #588 Tainted: G T
[ 48.088890] ------------------------------------------------
[ 48.091447] a.out/1243 is leaving the kernel with locks still held!
[ 48.093487] 3 locks held by a.out/1243:
[ 48.094964] #0: 00000000148ae74c (sb_writers#8){++++}, at: percpu_down_write+0x1d/0x110
[ 48.097622] #1: 000000001c9e7d4d (sb_pagefaults){++++}, at: percpu_down_write+0x1d/0x110
[ 48.100432] #2: 000000003c3d2e71 (sb_internal){++++}, at: percpu_down_write+0x1d/0x110

with this patch applied, it means there is a path that returns to userspace
with these locks still held. If you also get possible-deadlock warning
messages, that would be great.
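
For reference, the explicit way to freeze a filesystem from userspace is the
ioctl(FIFREEZE) path mentioned above. The following is only a minimal sketch
of that path, not the a.out from the log and not a claim about how the hang
is triggered; the helper name fs_freeze_ctl() is made up for illustration.
As far as I understand, the freeze taken this way stays in effect after the
ioctl returns to userspace, until a matching FITHAW is issued. It needs
CAP_SYS_ADMIN, so run it only against a scratch filesystem.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* FIFREEZE, FITHAW */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper (name is illustrative only): freeze the filesystem
 * containing `path` when do_freeze is non-zero, otherwise thaw it.
 * Returns 0 on success, -1 on failure with errno from open() or ioctl().
 */
int fs_freeze_ctl(const char *path, int do_freeze)
{
	int fd, ret, saved_errno;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* The freeze persists after this call returns to userspace. */
	ret = ioctl(fd, do_freeze ? FIFREEZE : FITHAW, 0);

	/* Preserve the ioctl's errno across close(). */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return ret < 0 ? -1 : 0;
}
```

Calling fs_freeze_ctl("/mnt/scratch", 1) and later fs_freeze_ctl("/mnt/scratch", 0)
would freeze and thaw a filesystem mounted at /mnt/scratch (an assumed mount
point); writers to that filesystem block in between.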

If you cannot reproduce it with this patch, I think we need a git tree in
which to carry this patch for testing. But linux-next.git has not yet been
re-added to the list of trees to test, and linux.git is not suitable for a
temporary debug patch...