Re: [PATCH 1/2] kernfs: dont take i_lock on inode attr read

From: Imran Khan
Date: Thu Jul 27 2023 - 01:38:59 EST


Hello again Ian,
I take back my previous comment :).

On 27/7/2023 2:30 pm, Imran Khan wrote:
> Hello Ian,
> Sorry for late reply. I was about to reply this week.
>
> On 27/7/2023 10:38 am, Ian Kent wrote:
>> On 20/7/23 10:03, Ian Kent wrote:
>>> On Wed, 2023-07-19 at 12:23 +0800, Ian Kent wrote:
>
> [...]
>>> I do see a problem with recent changes.
>>>
>>> I'll send this off to Greg after I've done some testing (primarily just
>>> compile and function).
>>>
>>> Here's a patch which describes what I found.
>>>
>>> Comments are, of course, welcome, ;)
>>
>> Anders I was hoping you would check if/what lockdep trace
>>
>> you get with this patch.
>>
>>
>> Imran, I was hoping you would comment on my change as it
>>
>> relates to the kernfs_iattr_rwsem changes.
>>
>>
>> Ian
>>
>>>
>>> kernfs: fix missing kernfs_iattr_rwsem locking
>>>
>>> From: Ian Kent <raven@xxxxxxxxxx>
>>>
>>> When the kernfs_iattr_rwsem was introduced a case was missed.
>>>
>>> The update of the kernfs directory node child count was also protected
>>> by the kernfs_rwsem and needs to be included in the change so that the
>>> child count (and so the inode n_link attribute) does not change while
>>> holding the rwsem for read.
>>>
>
> kernfs direcytory node's child count changes in kernfs_(un)link_sibling and
> these are getting invoked while adding (kernfs_add_one),
> removing(__kernfs_remove) or moving (kernfs_rename_ns)a node. Each of these
> operations proceed under kernfs_rwsem and I see each invocation of
> kernfs_link/unlink_sibling during the above mentioned operations is happening
> under kernfs_rwsem.
> So the child count should still be protected by kernfs_rwsem and we should not
> need to acquire kernfs_iattr_rwsem in kernfs_link/unlink_sibling.
>

kernfs_refresh_inode can still race against kernfs_link/unlink_siblings. So your
change looks good to me.
My tests are not showing any issues either. ( I tested on 4.14 and 5.4 kernels
as well).

Fee free to add my RB.

Reviewed-by: Imran Khan <imran.f.khan@xxxxxxxxxx>

> Kindly let me know your thoughts. I would still like to see new lockdep traces
> with this change.
>
> Thanks,
> Imran
>
>>> Fixes: 9caf696142 (kernfs: Introduce separate rwsem to protect inode
>>> attributes)
>>>
>>> Signed-off-by: Ian Kent <raven@xxxxxxxxxx>
>>> Cc: Anders Roxell <anders.roxell@xxxxxxxxxx>
>>> Cc: Imran Khan <imran.f.khan@xxxxxxxxxx>
>>> Cc: Arnd Bergmann <arnd@xxxxxxxx>
>>> Cc: Minchan Kim <minchan@xxxxxxxxxx>
>>> Cc: Eric Sandeen <sandeen@xxxxxxxxxxx>
>>> ---
>>>   fs/kernfs/dir.c |    4 ++++
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
>>> index 45b6919903e6..6e84bb69602e 100644
>>> --- a/fs/kernfs/dir.c
>>> +++ b/fs/kernfs/dir.c
>>> @@ -383,9 +383,11 @@ static int kernfs_link_sibling(struct kernfs_node
>>> *kn)
>>>       rb_insert_color(&kn->rb, &kn->parent->dir.children);
>>>         /* successfully added, account subdir number */
>>> +    down_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
>>>       if (kernfs_type(kn) == KERNFS_DIR)
>>>           kn->parent->dir.subdirs++;
>>>       kernfs_inc_rev(kn->parent);
>>> +    up_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
>>>         return 0;
>>>   }
>>> @@ -408,9 +410,11 @@ static bool kernfs_unlink_sibling(struct
>>> kernfs_node *kn)
>>>       if (RB_EMPTY_NODE(&kn->rb))
>>>           return false;
>>>   +    down_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
>>>       if (kernfs_type(kn) == KERNFS_DIR)
>>>           kn->parent->dir.subdirs--;
>>>       kernfs_inc_rev(kn->parent);
>>> +    up_write(&kernfs_root(kn)->kernfs_iattr_rwsem);
>>>         rb_erase(&kn->rb, &kn->parent->dir.children);
>>>       RB_CLEAR_NODE(&kn->rb);
>>>