RE: [PATCH RFC 1/4] lib/radix: add universal radix_tree_fill_range

From: Matthew Wilcox
Date: Fri Sep 02 2016 - 14:14:43 EST


I have a rewrite of the iterators; would you like to take a look?

http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/heads/idr-2016-09-02

There's five distinct sets of changes in that tree:

1. Test suite enhancements (first 8 patches)
2. Split/Join (patches 9-11)
3. Misc cleanups (patches 12-16)
4. Iterator rewrite (patches 17-19)
5. IDR rewrite (patches 20-25)

I could rebase the cleanups & iterator rewrite on top of Linus' tree if we don't want to get the split/join functionality into 4.9.

-----Original Message-----
From: Konstantin Khlebnikov [mailto:koct9i@xxxxxxxxx]
Sent: Thursday, September 1, 2016 2:12 AM
To: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>; Matthew Wilcox <mawilcox@xxxxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
Subject: Re: [PATCH RFC 1/4] lib/radix: add universal radix_tree_fill_range

On Wed, Aug 31, 2016 at 1:53 AM, Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx> wrote:
> On Tue, Aug 30, 2016 at 03:21:24PM -0700, Dan Williams wrote:
>> On Tue, Aug 30, 2016 at 3:03 PM, Ross Zwisler
>> <ross.zwisler@xxxxxxxxxxxxxxx> wrote:
>> > On Tue, Aug 30, 2016 at 02:56:17PM -0700, Dan Williams wrote:
>> >> On Mon, Aug 29, 2016 at 11:52 AM, Matthew Wilcox <mawilcox@xxxxxxxxxxxxx> wrote:
>> >> > It may be protected by the mapping lock in the current code, but I would it expect it to become an RCU lookup + lock eventually. No mapping lock, just like the page cache.
>> >> >
>> >> > Even if we can work around it, why do we want to? What's the compelling reason to change from the current radix tree representation of order-N entries to an arbitrary range? There are no in-kernel users right now; is there a performance reason to change? We don't usually change an API in anticipation of future users appearing, particularly when the API makes it harder for the existing users to use it.
>> >>
>> >> I'd use a fill range api for the radix backing get_dev_pagemap()
>> >> and potentially another use in device-dax. It centralizes the
>> >> common routine of breaking down a range into its constituent
>> >> power-of-2 ranges.
>> >
>> > Does your usage not work with the current sibling & canonical entry model?
>>
>> It does, but I find myself writing code to walk a range and determine
>> the order of each entry as I insert them. I can see other users
>> needing the same sort of insert helper and the aspect I like of
>> Konstantin's proposed change is that the functionality is part of the
>> core implementation rather than left to be duplicated in each user.
>
> Perhaps the answer is to have them both? Matthew's multi-order radix
> functionality with siblings for those of us that really *want* a single
> canonical entry that we can look up, use tags on, etc. And Konstantin's
> method where we insert a bunch of duplicate entries that don't have
> sibling pointers? Is there a reason why they can't coexist?

I'm not all against "sibling" entries, I just don't want to mess them into iterator and common lookup routines. This is redundant.

Actually it's very easy to integrate similar "sibling" entries into my filling function.

That will be yet another flag which tells to assign given entry only to the first slot and fill following tail with reference to that first slot. Just a pointer with both lower bits set to distinguish it from exceptional and internal pointers.

I think it's better to call them "indirect" entries because this will work for arbitrary ranges too where they are not siblings at all and may be located in several nodes.