Re: [PATCH v5 3/3] z3fold: add shrinker

From: Dan Streetman
Date: Tue Oct 18 2016 - 11:30:05 EST


On Tue, Oct 18, 2016 at 10:51 AM, Vitaly Wool <vitalywool@xxxxxxxxx> wrote:
> On Tue, Oct 18, 2016 at 4:27 PM, Dan Streetman <ddstreet@xxxxxxxx> wrote:
>> On Mon, Oct 17, 2016 at 10:45 PM, Vitaly Wool <vitalywool@xxxxxxxxx> wrote:
>>> Hi Dan,
>>>
>>> On Tue, Oct 18, 2016 at 4:06 AM, Dan Streetman <ddstreet@xxxxxxxx> wrote:
>>>> On Sat, Oct 15, 2016 at 8:05 AM, Vitaly Wool <vitalywool@xxxxxxxxx> wrote:
>>>>> This patch implements a shrinker for z3fold. This shrinker
>>>>> implementation does not free up any pages directly, but it allows
>>>>> for a denser placement of compressed objects, which results in
>>>>> fewer pages actually consumed and therefore a higher compression ratio.
>>>>>
>>>>> This update removes z3fold page compaction from the freeing path
>>>>> since we can rely on shrinker to do the job. Also, a new flag
>>>>> UNDER_COMPACTION is introduced to protect against two threads
>>>>> trying to compact the same page.
>>>>
>>>> I'm completely unconvinced that this should be a shrinker. The
>>>> alloc/free paths are much, much better suited to compacting a page
>>>> than a shrinker that must scan through all the unbuddied pages. Why
>>>> not just improve compaction for the alloc/free paths?
>>>
>>> Basically the main reason is performance, I want to avoid compaction on hot
>>> paths as much as possible. This patchset brings both performance and
>>> compression ratio gain, I'm not sure how to achieve that with improving
>>> compaction on alloc/free paths.
>>
>> It seems like a tradeoff of slight improvement in hot paths, for
>> significant decrease in performance by adding a shrinker, which will
>> do a lot of unnecessary scanning. The alloc/free/unmap functions are
>> working directly with the page at exactly the point where compaction
>> is needed - when adding or removing a bud from the page.
>
> I can see that sometimes there are substantial amounts of pages that
> are non-compactable synchronously due to the MIDDLE_CHUNK_MAPPED
> bit set. Picking up those seems to be a good job for a shrinker, and those
> end up in the beginning of respective unbuddied lists, so the shrinker is set
> to find them. I can slightly optimize that by introducing a
> COMPACT_DEFERRED flag or something like that to make shrinker find
> those pages faster, would that make sense to you?

Why not just compact the page in z3fold_unmap()?
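
To illustrate what I mean, here is a rough userspace sketch. It is a toy
model only, not the actual mm/z3fold.c code: every toy_* name is made up
for illustration, one byte stands in for one chunk, and locking is left
out entirely. The free path cannot compact while the middle bud is
mapped, so the unmap path simply retries once the flag is cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: one byte stands in for one chunk. All toy_* names are
 * hypothetical and do not exist in mm/z3fold.c. */
enum { TOY_NCHUNKS = 64, TOY_HDR_CHUNKS = 1 };

struct toy_page {
	unsigned char data[TOY_NCHUNKS];
	unsigned int first_chunks;   /* size of the first bud, 0 if free */
	unsigned int middle_chunks;  /* size of the middle bud, 0 if free */
	unsigned int last_chunks;    /* size of the last bud, 0 if free */
	unsigned int start_middle;   /* chunk index where the middle bud starts */
	bool middle_mapped;          /* stands in for MIDDLE_CHUNK_MAPPED */
};

/* Move a lone, unmapped middle bud to the front. Returns true if it moved. */
static bool toy_compact(struct toy_page *p)
{
	if (p->middle_mapped || p->middle_chunks == 0 ||
	    p->first_chunks != 0 || p->last_chunks != 0)
		return false;

	assert(p->start_middle + p->middle_chunks <= TOY_NCHUNKS);
	memmove(&p->data[TOY_HDR_CHUNKS], &p->data[p->start_middle],
		p->middle_chunks);
	p->first_chunks = p->middle_chunks;
	p->middle_chunks = 0;
	p->start_middle = 0;
	return true;
}

/* Map path: marks the middle bud as in use so it must not be moved. */
static void toy_map_middle(struct toy_page *p)
{
	p->middle_mapped = true;
}

/* Free path: drops the first bud, then tries to compact right away. */
static bool toy_free_first(struct toy_page *p)
{
	p->first_chunks = 0;
	return toy_compact(p);
}

/* Unmap path: clears the flag, then retries the compaction that the
 * free path had to skip. */
static bool toy_unmap_middle(struct toy_page *p)
{
	p->middle_mapped = false;
	return toy_compact(p);
}
```

With something along these lines, the page gets compacted at the moment
it becomes compactable, and nothing has to scan the unbuddied lists to
find it later.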

>
>> Sorry if I missed it in earlier emails, but have you done any
>> performance measurements comparing with/without the shrinker? The
>> compression ratio gains may be possible with only the
>> z3fold_compact_page() improvements, and performance may be stable (or
>> better) with only a per-z3fold-page lock, instead of adding the
>> shrinker...?
>
> I'm running some tests with per-page locks now, but according to the
> previous measurements the shrinker version always wins on multi-core
> platforms.

But that comparison is without taking the spinlock in map/unmap, right?

>
>> If a shrinker really is needed, it seems like it would be better
>> suited to coalescing separate z3fold pages via migration, like
>> zsmalloc does (although that's a significant amount of work).
>
> I really don't want to go that way to keep z3fold applicable to an MMU-less
> system.
>
> ~vitaly