Re: [PATCH] radix_tree: take radix_tree_path off stack

From: Hugh Dickins
Date: Mon Dec 19 2011 - 15:13:55 EST


On Mon, 19 Dec 2011, Nai Xia wrote:
> On Mon, Dec 19, 2011 at 4:20 PM, nai.xia <nai.xia@xxxxxxxxx> wrote:
> >
> > Can rcu_head in someway unionized with radix_tree_node->height
> > and radix_tree_node->count? count is always referenced under lock
> > and only the first node's height is referenced during lookup.
> > Seems like if we atomically set root->rnode to NULL, before
> > freeing the last node, we can ensure a valid read of the
> > radix_tree_node->height when lookup by following it with
> > a root->rnode == NULL test.
> >
> > I am not very sure of course, just a naive feeling.

I think you are right about radix_tree_node->count (only used under
lock, of no interest when lockless), but I'm not so sure about
radix_tree_node->height.

If you're right, then radix_tree_node->height shouldn't be needed
at all, we could work from radix_tree_root->height throughout.
And many places do so, but lockless (rcu locked) ones are relying
on radix_tree_node->height.

Caution tells me that it's intimately a part of the way we can
safely look up locklessly while the tree is being expanded or shrunk:
notice how a node never changes its height. But perhaps it could be
reduced to a flag bit, saying whether node is a leafnode or not.

But I wouldn't risk making any such change without spending time to
think it through, time I must spend on other tasks now. Please play
with it yourself if you've a mind to.

>
> And besides, I think maybe there were another few ways if
> we really care about the stack usage of radix_tree_path,
> e.g.
> 1. We can make radix_tree_path.offset compact to u8
> which is enough to index inside a node.

Well, I could have halved the usage with a patch just recalculating
the offset when ascending; but I'd rather get rid of the whole array.

>
> 2. We can use dynamic array on stack instead of
> RADIX_TREE_MAX_PATH, I think for most cases
> this may save half of the space

tag_if_tagged() was using a dynamic array on stack: that came as a
surprise to me, we usually forbid that in kernel. It becomes hard
to estimate stack usage, so may cause nasty surprises in rare cases.

>
> 3. Take benefit of radix_tree_path array already
> traveled down to clear the tags instead of calling
> a radix_tree_tag_clear with full array.

There must be a more efficient way of doing that, yes; but it's such a
very rare path that it's not worth extra code, I just added the comment
"This way of doing it would be inefficient, but seldom is any set".

>
> I am not speaking of the negatives of your patch
> , just some alternatives for your reference.
>
> And forgive my possible selfishness, I just created a home
> made radix tree traveler based on radix_tree_path array to
> simulate recursive calls, not ready to its vanishing...

I did remove
struct radix_tree_path {
struct radix_tree_node *node;
int offset;
};
but it's easy enough for you to add back if you have good use for it.

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/