Re: [PATCH 1/4] fs/dcache: Limit numbers of negative dentries

From: Waiman Long
Date: Mon Jul 17 2017 - 14:31:12 EST


On 07/17/2017 01:49 PM, Matthew Wilcox wrote:
> On Mon, Jul 17, 2017 at 09:39:30AM -0400, Waiman Long wrote:
>> The number of positive dentries is limited by the number of files
>> in the filesystems. The number of negative dentries, however,
>> has no limit other than the total amount of memory available in
>> the system. So a rogue application that generates a lot of negative
>> dentries can potentially exhaust most of the memory available in the
>> system impacting performance on other running applications.
>>
>> To prevent this from happening, the dcache code is now updated to limit
>> the amount of the negative dentries in the LRU lists that can be kept
>> as a percentage of total available system memory. The default is 5%
>> and can be changed by specifying the "neg_dentry_pc=" kernel command
>> line option.
> I see the problem, but rather than restricting the number of negative
> dentries to be a fraction of the total amount of memory in the machine,
> wouldn't it make more sense to limit the number of negative dentries to be
> some multiple of the number of positive dentries currently in the system?

The number of positive dentries changes rapidly, so we can't use a
__read_mostly variable for the limit. That may have some performance
impact. I chose a fixed number for simplicity and performance. I can
compromise on simplicity, but not on performance. I am open to
adjusting the free pool count in some way as long as the performance
impact is negligible.
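
To illustrate the trade-off, here is a minimal user-space sketch, not code
from the patch. All names (neg_limit, neg_limit_init, and so on) and the
per-dentry size are hypothetical. The fixed limit is computed once and only
read afterwards, while the ratio-based limit has to read a counter that
changes all the time on every check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a cap computed once at init and read-only afterwards,
 * which is the kind of value that could be marked __read_mostly. */
struct neg_limit {
    unsigned long max_negative;
};

/* Derive the cap from a percentage of total memory. dentry_size is an
 * illustrative assumption, not the real size of struct dentry. */
static void neg_limit_init(struct neg_limit *l, unsigned long total_mem_bytes,
                           unsigned int pc, unsigned long dentry_size)
{
    assert(pc <= 100 && dentry_size > 0);
    l->max_negative = total_mem_bytes / 100 * pc / dentry_size;
}

/* Fixed limit: compares against a value that never changes after init. */
static bool neg_over_fixed_limit(const struct neg_limit *l,
                                 unsigned long nr_negative)
{
    return nr_negative > l->max_negative;
}

/* Ratio limit: needs the current positive count on every check, so the
 * limit itself is a frequently written value. */
static bool neg_over_ratio_limit(unsigned long nr_negative,
                                 unsigned long nr_positive,
                                 unsigned int multiple)
{
    return nr_negative > nr_positive * multiple;
}
```

With 1000000 bytes of memory, a 5% limit and an assumed 200-byte dentry,
neg_limit_init() yields a cap of 250 negative dentries.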

> Or make negative dentries more easily prunable. For example, we could
> allocate them from a separate slab and use the existing reclaim mechanism
> to just throw them away. Since they can't be pinned by an inode, they're
> much easier to get rid of than positive dentries. Might make changing
> a dentry from positive to negative or vice versa a bit more expensive ...

I don't quite understand what you mean by having two separate slabs. The
current reclaim mechanism is through scanning the LRU lists.

I had been thinking about having a separate LRU list for negative
dentries. Given the complexity of the current per-node/per-memcg LRU
list, maintaining 2 separate LRU lists in each super_block may be
error-prone.

It is true that positive dentries will also be pruned in the process. By
the time automatic pruning happens, there should be a lot of negative
dentries in the LRU lists already. We can skip over positive dentries in
the scanning, but we have to either allow scanning more entries in each
pass, prolonging the interruption, or do no pruning at all if the LRU
lists are front-loaded with a bunch of positive dentries.
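
The scan-budget problem can be shown with a small user-space model. This is
a sketch under assumptions, not the kernel's LRU walker: the LRU is a plain
array ordered oldest first, and model_dentry and prune_negative() are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry on an LRU list. */
struct model_dentry {
    bool negative;  /* true if the entry has no inode */
    bool pruned;    /* true once this model has reclaimed it */
};

/*
 * Walk the LRU from the oldest entry, skipping positive dentries.
 * At most scan_budget entries are examined and at most want entries
 * are pruned. Skipped entries still consume scan budget.
 */
static size_t prune_negative(struct model_dentry *lru, size_t len,
                             size_t scan_budget, size_t want)
{
    size_t pruned = 0;

    assert(lru != NULL || len == 0);
    for (size_t i = 0; i < len && i < scan_budget && pruned < want; i++) {
        if (!lru[i].negative || lru[i].pruned)
            continue;   /* positive or already gone: skip it */
        lru[i].pruned = true;
        pruned++;
    }
    return pruned;
}
```

If the first three entries are positive and the budget is three, the pass
prunes nothing. Raising the budget to cover the whole list finds the
negative entries, at the cost of a longer scan.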

BTW, you remind me that I should have accounted for the
positive-to-negative dentry transitions, which are missing in the
current patch.
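
A rough idea of what that accounting would involve, as a user-space sketch
with hypothetical names (neg_counter, model_make_negative(),
model_make_positive()) rather than the actual dcache code: the
negative-dentry count has to be adjusted at both transitions, not only when
a negative dentry is first created.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a dentry: positive if it has an inode. */
struct model_state {
    bool has_inode;
};

/* Hypothetical global count of negative dentries. */
struct neg_counter {
    long nr_negative;
};

/* Positive -> negative, e.g. the file is removed but the dentry is kept. */
static void model_make_negative(struct neg_counter *c, struct model_state *d)
{
    assert(d->has_inode);
    d->has_inode = false;
    c->nr_negative++;
}

/* Negative -> positive, e.g. a file is created under the cached name. */
static void model_make_positive(struct neg_counter *c, struct model_state *d)
{
    assert(!d->has_inode);
    d->has_inode = true;
    c->nr_negative--;
}
```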

Cheers,
Longman