About me and about btree_blue

From: liuwf
Date: Wed May 03 2023 - 15:00:10 EST



With several days of upgrading since btree_blue's first RFC posted in kernel
mailling list, currently btree_blue can be observed 40% or more faster than
the original lib/btree in random inserts/deletes, and 10x (1000%) faster than
rbtree when traversing a 1M keys in a tree.

Basically, btree_blue under linux run 2xx ms or even 1xx ms per 1M operations,
this rate may be prudently comparing with Google's btree, the later is single-
thread btree(fixme) at a rate level of 1xx ~ 2xx ms per 1M operations also.

I'm agreed wtih Linus's comments on my RFC, multi-thread frendly trees are
more important than signle-thread one in high-perf required cases. Besides,
I'm also afraid that btree_blue is still needed to be faster when used in some
cases - even if those cases may not be easily replaced by multi-thread trees.

Imaging one is waiting to check-out at the gate after selected his goods,
there are a dozen of cashing channels in the mall - that's fine, and, even
though there is no one else queued in any of those cashing channels except
himself, a question still exists for him: how much time is used to service him
singly ? normally we are seviced within 3 or 5 minutes, if the time is 15
minutes or more our shopping experience may be different - even if there are
a dozen of cashing channels.

The perf rise in my patchs are not based on a group of stamp-collectors. In
today's linux kernel community, it is a challenge for almost any twenty percent
advances, because any present codes we can see (btree, rbtree. etc.) are
excellent enough. In fact, the only reason for me to choose the name
btree_blue is that I'm quite concerned my patch's perf if it is slower than
original one after I added several features to it, I don't think anyone will
be happy to accept or give a review for a mediocre patch, so I decide to give
the temporary name to it rather than call it lib/btree's patch before post it
out. But I still think it is good for me to call it as btree's patches even if
it's relatively fast and have more features, because that presents a faith -
every people offer his idea to make things better - the Linux spirit.

Two months before I posted a patch for btrfs after I decided to give some in-
depth studies to the file system again, which is one of my favourit.That patch
targets to one of two code-paths in btrfs free space sub-system - a less normal
code-path, only run in a diffiuclt condition when the level of free space is
lower. I am glad that I used a simple trick to rise the path's perf more than
200% from fstest result, but the patch has not been reviewed too. After that
patch I found there are several places in another code-path - the mainline
code-path of btrfs free space sub-system - may be given some improvments.

In fact, a known developer of btrfs has also noticed problems and offered a set
of patchs to fix them several years ago. If I'm right, that set of patchs were
not applied finally, and this is a bit beyond me. I searched all btrfs mailling
list and found no info about their discussions on the thing, I guess they
discussed in other occasions and dicided they had no needs for that set of
patchs. But I still have some puzzle for the issue, and I think personally it
maybe better if another data structure is used in those cases. so I try to get
a btree-like thing to estimate the probability of applying in btrfs. I viewed
lib/btree and felt only one tiny pity for it - there is lack of a linear
travesal in it(fixme). The good thing is that, Joern Engel's btree is elegant
and fast, so I decided to add several features based on it. As my said in
previous patchs I found it is a challenge to keep the same perf and effective
with the original one when features added, so I have to do several optimizings
on it - that the btree's patch come from.

In next days I think I need some time to add several more APIs to the patchs
first and refine it, any opions is welcome.