Re: [PATCH 3/7] sysfs: Keep an nlink count on sysfs directories.

From: Eric W. Biederman
Date: Mon Jan 11 2010 - 20:03:06 EST


Tejun Heo <tj@xxxxxxxxxx> writes:

> Hello,
>
> On 01/12/2010 05:21 AM, Eric W. Biederman wrote:
>> On large directories sysfs_count_nlinks can be a significant
>> bottleneck, so keep a count in sysfs_dirent.
>
> I was about to suggest changing s_flags to ushort too. Hmmm... adding
> a new field to sysfs_dirent somewhat worries me but this doesn't add
> to the size of the structure. How significant a bottleneck are we
> talking about?

It showed up in measurements of sysfs taken before my last round of
changes, and those changes make us refresh the inode, and therefore
call sysfs_count_nlink, more often.

I am surprised no one has complained about 2.6.33-rcN yet and reported
a performance regression.

Ultimately not having a cached nlink count transforms what should
be constant time operations to operations that run in time O(N).
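To make the cost difference concrete, here is a minimal sketch, not the
actual sysfs code. The struct and function names (dirent_node,
count_nlink_walk, count_nlink_cached, add_child) are hypothetical
stand-ins, and the layout is heavily simplified.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a sysfs directory entry. */
struct dirent_node {
    struct dirent_node *sibling;   /* next entry in the same directory */
    struct dirent_node *children;  /* first child, if this is a directory */
    int is_dir;                    /* nonzero if this entry is a directory */
    unsigned int cached_subdirs;   /* kept up to date when children are added */
};

/* O(N): walk every child each time nlink is needed. */
static unsigned int count_nlink_walk(const struct dirent_node *dir)
{
    unsigned int nlink = 2;        /* "." plus the entry in the parent */
    const struct dirent_node *c;

    for (c = dir->children; c; c = c->sibling)
        if (c->is_dir)
            nlink++;               /* each subdirectory's ".." links back */
    return nlink;
}

/* O(1): just read the counter maintained at insertion time. */
static unsigned int count_nlink_cached(const struct dirent_node *dir)
{
    return 2 + dir->cached_subdirs;
}

/* Link a child into a directory and keep the cached count in sync. */
static void add_child(struct dirent_node *dir, struct dirent_node *child)
{
    child->sibling = dir->children;
    dir->children = child;
    if (child->is_dir)
        dir->cached_subdirs++;
}
```

Both functions return the same value; the difference is that the first
one touches every entry in the directory on each inode refresh.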

>> If we exceed the maximum number of directory entries we can store,
>> return an nlink of 1. An nlink of 1 matches what reiserfs does in this
>> case, and it lets find and similar utilities know that the directory
>> nlink cannot be used for optimization purposes.
>
> Hmmm... what's the limit on reiserfs? Is it 64k too?

The reiserfs limit is a bit short of a 32-bit number. Ext[234]
all have a 16-bit nlink field, and they fail the operation
when you attempt to increment nlink past their limit.

In this case the comparison with reiserfs is to show that at some
point throwing up our hands and not counting and just returning nlink
1 is something userspace can occasionally expect to see. It is
common enough that find has handled this idiom for years.
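For reference, the userspace side of that idiom looks roughly like the
sketch below. This is not find's actual code; subdirs_from_nlink is a
hypothetical helper that only illustrates the convention that a
directory's nlink is 2 plus its number of subdirectories, and that
anything below 2 means "don't trust this".

```c
/*
 * Given a directory's st_nlink, return how many subdirectories it
 * implies, or -1 if the count cannot be used and the caller has to
 * stat every entry instead.
 */
static long subdirs_from_nlink(unsigned long nlink)
{
    if (nlink < 2)
        return -1;             /* e.g. nlink == 1: filesystem is not counting */
    return (long)(nlink - 2);  /* subtract "." and the parent's entry */
}
```

A tree walker can stop stat'ing entries once it has seen that many
subdirectories, and falls back to the slow path when it gets -1.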

Since we can handle this without increasing the size of sysfs_dirent,
I figure we should have a good quality of implementation for the common
case and return something userspace can deal with for the extreme cases.
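One way to picture the overflow handling is a saturating 16-bit
counter. This is only a sketch under my own assumptions, not the code
from the patch: SUBDIRS_OVERFLOW and the three helpers are hypothetical
names, and I am assuming the top value of the unsigned short is
reserved as a sticky "stopped counting" marker.

```c
#include <limits.h>

/* Assumed sentinel: the largest unsigned short means "gave up counting". */
#define SUBDIRS_OVERFLOW USHRT_MAX

/* Count a new subdirectory; once the sentinel is reached it sticks. */
static void subdir_added(unsigned short *subdirs)
{
    if (*subdirs != SUBDIRS_OVERFLOW)
        (*subdirs)++;
}

/* Drop a subdirectory; a saturated count is no longer exact, so leave it. */
static void subdir_removed(unsigned short *subdirs)
{
    if (*subdirs != SUBDIRS_OVERFLOW && *subdirs > 0)
        (*subdirs)--;
}

/* nlink as reported to userspace: exact when known, 1 when saturated. */
static unsigned int reported_nlink(unsigned short subdirs)
{
    if (subdirs == SUBDIRS_OVERFLOW)
        return 1;
    return 2u + subdirs;
}
```

The common case gets an exact nlink from a field that fits in two
bytes, and the extreme case degrades to the value find already copes
with.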

Eric