Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking

From: Xiaqing (A)
Date: Wed Oct 21 2020 - 23:47:59 EST




On 2020/10/22 10:45, Roman Gushchin wrote:

On Thu, Oct 22, 2020 at 09:54:53AM +0800, Xiaqing (A) wrote:

On 2020/10/17 6:52, Roman Gushchin wrote:

This small patchset makes cma_release() non-blocking and simplifies
the code in hugetlbfs, where previously we had to temporarily drop
hugetlb_lock around the cma_release() call.

It should help Zi Yan with his work on 1 GB THPs: splitting a gigantic
THP under memory pressure requires a cma_release() call. If it's
a blocking function, it complicates the already complicated code.
Because there are at least two use cases like this (hugetlbfs is
another example), I believe it's just better to make cma_release()
non-blocking.

It also makes it more consistent with other memory releasing functions
in the kernel: most of them are non-blocking.


Roman Gushchin (2):
  mm: cma: make cma_release() non-blocking
  mm: hugetlb: don't drop hugetlb_lock around cma_release() call

 mm/cma.c     | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
 mm/hugetlb.c |  6 ------
 2 files changed, 49 insertions(+), 8 deletions(-)

I don't think this patch is a good idea. It moves part or even all of the cost of
cma_release() into cma_alloc(), which is the more performance-sensitive path.

I'm not quite sure: if cma_alloc() is racing with cma_release(), cma_alloc() will
wait for the cma_lock mutex anyway. So we don't really transfer anything to cma_alloc().

On Android phones, CPU resource competition is intense in many scenarios.
As a result, kernel threads and workers may be scheduled only after several ticks or more.
In this case, the performance of cma_alloc() will deteriorate significantly,
which is bad news for many services on Android.

OK, I agree: if the CPU is heavily loaded, it might affect the total execution time.

If we aren't going in the mutex->spinlock conversion direction (as Mike suggested),
we can address the performance concerns by introducing a cma_release_nowait() function,
so that the default cma_release() would work the old way.
cma_release_nowait() can set an atomic flag on a cma area, which will cause subsequent
cma_alloc() calls to flush the release queue. In this case there will be no performance
penalty unless somebody is using cma_release_nowait().
Will it work for you?
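
For illustration only, the cma_release_nowait() idea above can be modelled in
userspace roughly as follows. This is a sketch under assumptions, not the actual
patch: struct cma_model, struct release_node, the model_*() functions and the
lock-free pending list are all hypothetical names and choices. A pthread mutex
stands in for the cma mutex and a bool array stands in for the allocation bitmap.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define AREA_PAGES 64

/* One deferred release; storage is supplied by the caller (assumption). */
struct release_node {
    size_t start, count;
    struct release_node *next;
};

/* Userspace stand-in for a CMA area; not the kernel's struct cma. */
struct cma_model {
    pthread_mutex_t lock;                   /* models the sleeping cma mutex */
    bool used[AREA_PAGES];                  /* models the allocation bitmap */
    _Atomic(struct release_node *) pending; /* lock-free release queue */
    atomic_bool has_pending;                /* flag checked by the allocator */
};

static void clear_range(struct cma_model *cma, size_t start, size_t count)
{
    assert(start + count <= AREA_PAGES);
    for (size_t i = start; i < start + count; i++)
        cma->used[i] = false;
}

/* Old behaviour: may sleep on the mutex, frees the range immediately. */
void model_release(struct cma_model *cma, size_t start, size_t count)
{
    pthread_mutex_lock(&cma->lock);
    clear_range(cma, start, count);
    pthread_mutex_unlock(&cma->lock);
}

/* Never takes the mutex: push the range onto the queue, then raise the flag. */
void model_release_nowait(struct cma_model *cma, struct release_node *n,
                          size_t start, size_t count)
{
    n->start = start;
    n->count = count;
    n->next = atomic_load(&cma->pending);
    /* On failure the CAS reloads the current head into n->next; retry. */
    while (!atomic_compare_exchange_weak(&cma->pending, &n->next, n))
        ;
    atomic_store(&cma->has_pending, true);
}

/* Called with cma->lock held: drain the queue only if the flag was set. */
static void flush_pending(struct cma_model *cma)
{
    if (!atomic_exchange(&cma->has_pending, false))
        return;
    struct release_node *n = atomic_exchange(&cma->pending, NULL);
    while (n) {
        struct release_node *next = n->next;
        clear_range(cma, n->start, n->count);
        n = next;
    }
}

/* First-fit allocation; returns the start index, or -1 if nothing fits. */
long model_alloc(struct cma_model *cma, size_t count)
{
    long found = -1;

    pthread_mutex_lock(&cma->lock);
    flush_pending(cma);
    for (size_t start = 0; start + count <= AREA_PAGES && found < 0; start++) {
        size_t i = 0;
        while (i < count && !cma->used[start + i])
            i++;
        if (i == count) {
            for (i = 0; i < count; i++)
                cma->used[start + i] = true;
            found = (long)start;
        }
    }
    pthread_mutex_unlock(&cma->lock);
    return found;
}

/* Fill the area, release it without blocking, then allocate again. */
long demo(void)
{
    struct cma_model cma = { .lock = PTHREAD_MUTEX_INITIALIZER };
    struct release_node node;

    long first = model_alloc(&cma, AREA_PAGES);   /* takes the whole area */
    long full = model_alloc(&cma, 1);             /* fails: nothing free */
    model_release_nowait(&cma, &node, 0, AREA_PAGES);
    long second = model_alloc(&cma, 8);           /* flushes the queue first */
    model_release(&cma, (size_t)second, 8);       /* blocking path unchanged */
    long third = model_alloc(&cma, AREA_PAGES);
    return (first == 0 && full == -1 && second == 0 && third == 0) ? 0 : -1;
}
```

In this model, callers that only use model_release() pay one atomic flag check
per allocation and nothing else; the flush cost lands on the allocator only
when model_release_nowait() has actually been used.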

That looks good to me.

Thanks!


Thank you!