Re: [PATCH v4 1/2] rcu/tree: Add basic support for kfree_rcu() batching

From: Uladzislau Rezki
Date: Tue Oct 01 2019 - 07:27:15 EST


> > Hello, Joel.
> >
> > First of all, thank you for improving it. I also noticed high pressure
> > on the RCU machinery while running some vmalloc tests when a kfree_rcu()
> > flood occurred. Therefore I got rid of kfree_rcu() there.
>
> Replying a bit late due to overseas conference travel and vacation.
>
> When you say 'high pressure', do you mean memory pressure or just system
> load?
>
>
> Memory pressure slightly increases with the kfree_rcu() rework with the
> benefit of much fewer grace periods.
>
I meant system load, because of the high number of cycles spent in the
kfree_rcu() symbol under stress. I do not have the numbers at hand, though,
because it was quite a long time ago. As for memory usage, I understand that.

> > I have just a small question related to workloads and performance evaluation.
> > Are you aware of any specific workloads which benefit from it for example
> > mobile area, etc? I am asking because i think about backporting of it and
> > reuse it on our kernel.
>
> I am not aware of a mobile usecase that benefits but there are server
> workloads that make system more stable in the face of a kfree_rcu() flood.
>
OK, I got it. I wanted to test it to find out how it could affect mobile
workloads.

>
> For the KVA allocator work, I see it is quite similar to the way binder
> allocates blocks. See function: binder_alloc_new_buf_locked(). Is there
> any chance to reuse any code? For one thing, binder also has an rbtree for
> allocated blocks for fast lookup of allocated blocks. Does the KVA allocator
> not have the need for that?
>
Well, there is a difference. Actually, the free blocks are not sorted by
their size as in the binder layer, if I understand the code correctly.

Instead, I keep them (the free blocks) sorted by start address in ascending
order and maintain an augmented value (the biggest free size in the left or
right sub-tree) for each node. That allows navigating toward the lowest
address and the block that definitely fits. As a result our allocations
become sequential, which is important.
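
To illustrate the idea, here is a minimal sketch of such a lookup. It is
not the actual vmalloc code: it uses a plain unbalanced binary search tree
instead of the kernel rbtree, it ignores alignment and address-range
constraints, and all names (struct free_block, insert_block(),
find_lowest_fit()) are hypothetical ones chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-block node; names are illustrative only. */
struct free_block {
	unsigned long start;            /* key: tree is ordered by start address */
	unsigned long size;             /* size of this free block */
	unsigned long subtree_max_size; /* augment: biggest size in this sub-tree */
	struct free_block *left, *right;
};

static unsigned long subtree_max(const struct free_block *n)
{
	return n ? n->subtree_max_size : 0;
}

static unsigned long max3(unsigned long a, unsigned long b, unsigned long c)
{
	unsigned long m = a > b ? a : b;

	return m > c ? m : c;
}

/*
 * Plain BST insert keyed by start address (no rebalancing, unlike an
 * rbtree). The augmented value is recomputed on the way back up.
 */
static struct free_block *insert_block(struct free_block *root,
				       struct free_block *n)
{
	if (!root) {
		assert(n->size > 0);
		n->left = n->right = NULL;
		n->subtree_max_size = n->size;
		return n;
	}

	if (n->start < root->start)
		root->left = insert_block(root->left, n);
	else
		root->right = insert_block(root->right, n);

	root->subtree_max_size = max3(root->size,
				      subtree_max(root->left),
				      subtree_max(root->right));
	return root;
}

/*
 * Return the lowest-address block with size >= the request, or NULL.
 * The augmented value tells us whether a sub-tree contains a fit, so we
 * only descend into a sub-tree that is guaranteed to have one.
 */
static struct free_block *find_lowest_fit(struct free_block *node,
					  unsigned long size)
{
	while (node) {
		if (subtree_max(node->left) >= size)
			node = node->left;   /* a lower-address fit exists */
		else if (node->size >= size)
			return node;         /* nothing lower fits, this one does */
		else if (subtree_max(node->right) >= size)
			node = node->right;  /* only higher addresses can fit */
		else
			return NULL;
	}
	return NULL;
}
```

Because the left sub-tree is always tried first, the lookup returns the
lowest suitable address in one descent, which is what makes the resulting
allocations sequential.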

>
> And, nice LPC presentation! I was there ;)
>
Thanks :)

--
Vlad Rezki