Re: [PATCH RFC 1/4] lib/radix: add universal radix_tree_fill_range

From: Konstantin Khlebnikov
Date: Sat Sep 03 2016 - 00:50:59 EST


On Fri, Sep 2, 2016 at 8:59 PM, Matthew Wilcox <mawilcox@xxxxxxxxxxxxx> wrote:
> I have a rewrite of the iterators; would you like to take a look?
>
> http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/heads/idr-2016-09-02
>
> There's five distinct sets of changes in that tree:
>
> 1. Test suite enhancements (first 8 patches)
> 2. Split/Join (patches 9-11)
> 3. Misc cleanups (patches 12-16)
> 4. Iterator rewrite (patches 17-19)

Have you compared performance?
There is simple benchmark in my patchset.

> 5. IDR rewrite (patches 20-25)

Why? I don't see reason for that.

>
> I could rebase the cleanups & iterator rewrite on top of Linus' tree if we don't want to get the split/join functionality into 4.9.
>
> -----Original Message-----
> From: Konstantin Khlebnikov [mailto:koct9i@xxxxxxxxx]
> Sent: Thursday, September 1, 2016 2:12 AM
> To: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>; Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH RFC 1/4] lib/radix: add universal radix_tree_fill_range
>
> On Wed, Aug 31, 2016 at 1:53 AM, Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx> wrote:
>> On Tue, Aug 30, 2016 at 03:21:24PM -0700, Dan Williams wrote:
>>> On Tue, Aug 30, 2016 at 3:03 PM, Ross Zwisler
>>> <ross.zwisler@xxxxxxxxxxxxxxx> wrote:
>>> > On Tue, Aug 30, 2016 at 02:56:17PM -0700, Dan Williams wrote:
>>> >> On Mon, Aug 29, 2016 at 11:52 AM, Matthew Wilcox <mawilcox@xxxxxxxxxxxxx> wrote:
>>> >> > It may be protected by the mapping lock in the current code, but I would it expect it to become an RCU lookup + lock eventually. No mapping lock, just like the page cache.
>>> >> >
>>> >> > Even if we can work around it, why do we want to? What's the compelling reason to change from the current radix tree representation of order-N entries to an arbitrary range? There are no in-kernel users right now; is there a performance reason to change? We don't usually change an API in anticipation of future users appearing, particularly when the API makes it harder for the existing users to use it.
>>> >>
>>> >> I'd use a fill range api for the radix backing get_dev_pagemap()
>>> >> and potentially another use in device-dax. It centralizes the
>>> >> common routine of breaking down a range into its constituent
>>> >> power-of-2 ranges.
>>> >
>>> > Does your usage not work with the current sibling & canonical entry model?
>>>
>>> It does, but I find myself writing code to walk a range and determine
>>> the order of each entry as I insert them. I can see other users
>>> needing the same sort of insert helper and the aspect I like of
>>> Konstantin's proposed change is that the functionality is part of the
>>> core implementation rather than left to be duplicated in each user.
>>
>> Perhaps the answer is to have them both? Matthew's multi-order radix
>> functionality with siblings for those of us that really *want* a single
>> canonical entry that we can look up, use tags on, etc. And Konstantin's
>> method where we insert a bunch of duplicate entries that don't have
>> sibling pointers? Is there a reason why they can't coexist?
>
> I'm not all against "sibling" entries, I just don't want to mess them into iterator and common lookup routines. This is redundant.
>
> Actually it's very easy to integrate similar "sibling" entries into my filling function.
>
> That will be yet another flag which tells to assign given entry only to the first slot and fill following tail with reference to that first slot. Just a pointer with both lower bits set to distinguish it from exceptional and internal pointers.
>
> I think it's better to call them "indirect" entries because this will work for arbitrary ranges too where they are not siblings at all and may be located in several nodes.