Re: [PATCH] mm: cma: support sysfs

From: John Hubbard
Date: Fri Feb 05 2021 - 23:54:00 EST


On 2/5/21 1:28 PM, Minchan Kim wrote:
On Fri, Feb 05, 2021 at 12:25:52PM -0800, John Hubbard wrote:
On 2/5/21 8:15 AM, Minchan Kim wrote:
...
OK. But...what *is* your goal, and why is this useless (that's what
orthogonal really means here) for your goal?

As I mentioned, the goal is to monitor the failures from each CMA area,
since each one serves its own purpose.

Let's look at an example.

The system has 5 CMA areas, and each one is associated with its own
user scenario. Each scenario gets an exclusive CMA area to avoid
fragmentation problems.

CMA-1 depends on Bluetooth
CMA-2 depends on Wi-Fi
CMA-3 depends on sensor-A
CMA-4 depends on sensor-B
CMA-5 depends on sensor-C


aha, finally. I had no idea that sort of use case was happening.

This would be good to put in the patch commit description.

With this, we can catch which module was affected; with only a global
failure count, I couldn't find out who was affected.


Also, would you be willing to try out something simple first,
such as providing an indication that CMA is active, plus its overall
success rate, like this:

/proc/vmstat:

cma_alloc_success 125
cma_alloc_failure 25

...or is the only way forward to provide the more detailed items,
complete with per-CMA details, in a non-debugfs location?
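
As a minimal sketch of the consumer side (assuming those two counter
names, which are only proposed here and not an existing ABI), userspace
could poll them like this:

/* Sketch: scan /proc/vmstat for the two proposed CMA counters.
 * The counter names are the proposal above, not a merged ABI. */
#include <stdio.h>
#include <string.h>

int main(void)
{
	char name[64];
	unsigned long long val;
	FILE *f = fopen("/proc/vmstat", "r");

	if (!f) {
		perror("fopen /proc/vmstat");
		return 1;
	}
	/* /proc/vmstat is one "name value" pair per line */
	while (fscanf(f, "%63s %llu", name, &val) == 2) {
		if (!strcmp(name, "cma_alloc_success") ||
		    !strcmp(name, "cma_alloc_failure"))
			printf("%s %llu\n", name, val);
	}
	fclose(f);
	return 0;
}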



...and then, to see if more is needed, some questions:

a) Do you know of an upper bound on how many cma areas there can be
(I think Matthew also asked that)?

There is no upper bound since it's configurable.


OK, thanks, so that pretty much rules out putting per-CMA details into
anything other than a directory or something like it.


b) Is tracking the cma area really as valuable as other possibilities? We can put
"a few" to "several" items here, so we really want to get your very favorite bits of
information in. If, for example, there can be *lots* of cma areas, then maybe tracking

At this moment, I want allocation/failure counts for each CMA area, since
each one has its own particular use case, which makes it easy for me to
track which module will be affected. I think per-CMA statistics are very
useful for a minimal code change, so I want to enable them by default
under CONFIG_CMA && CONFIG_SYSFS.

by a range of allocation sizes is better...

I take your suggestion to be something like this.

[alloc_range] could be an order, or a range defined by an interval

/sys/kernel/mm/cma/cma-A/[alloc_range]/success
/sys/kernel/mm/cma/cma-A/[alloc_range]/fail
..
..
/sys/kernel/mm/cma/cma-Z/[alloc_range]/success
/sys/kernel/mm/cma/cma-Z/[alloc_range]/fail

Actually, I meant "ranges instead of cma areas", like this:

/<path-to-cma-data>/[alloc_range_1]/success
/<path-to-cma-data>/[alloc_range_1]/fail
/<path-to-cma-data>/[alloc_range_2]/success
/<path-to-cma-data>/[alloc_range_2]/fail
...
/<path-to-cma-data>/[alloc_range_max]/success
/<path-to-cma-data>/[alloc_range_max]/fail

The idea is that knowing the allocation sizes that succeeded
and failed is maybe even more interesting and useful than
knowing the cma area that contains them.
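
As a minimal sketch, a userspace reader for either of the layouts
above (the bucket directory names and the success/fail leaf names are
just the proposal here, not a merged ABI; the default base path is an
assumption) could look like:

#include <dirent.h>
#include <stdio.h>

/* Read one numeric counter file, e.g. <base>/<bucket>/success. */
static unsigned long long read_counter(const char *base, const char *bucket,
				       const char *leaf)
{
	char path[512];
	unsigned long long val = 0;
	FILE *f;

	snprintf(path, sizeof(path), "%s/%s/%s", base, bucket, leaf);
	f = fopen(path, "r");
	if (!f)
		return 0;
	if (fscanf(f, "%llu", &val) != 1)
		val = 0;
	fclose(f);
	return val;
}

int main(int argc, char **argv)
{
	/* Base directory holding the per-area or per-range buckets. */
	const char *base = argc > 1 ? argv[1] : "/sys/kernel/mm/cma";
	struct dirent *de;
	DIR *d = opendir(base);

	if (!d) {
		perror("opendir");
		return 1;
	}
	while ((de = readdir(d)) != NULL) {
		if (de->d_name[0] == '.')
			continue;	/* skip "." and ".." */
		printf("%s: success=%llu fail=%llu\n", de->d_name,
		       read_counter(base, de->d_name, "success"),
		       read_counter(base, de->d_name, "fail"));
	}
	closedir(d);
	return 0;
}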

I understand your point, but it would make it hard to find out who was
affected by a failure. That's why I suggested putting your suggestion
behind an additional config option, since a per-CMA metric with a simple
success/failure count is enough.



I agree it would also be useful, but I'd like to enable it under
CONFIG_CMA_SYSFS_ALLOC_RANGE as a separate patchset.


I will stop harassing you very soon; I just want to bottom out on
understanding the real goals first. :)


I hope my example makes the goal clearer for you.


Yes it does. Based on the (rather surprising) use of cma-area-per-device,
it seems clear that you will need this, so I'll drop my objections to
putting it in sysfs.

I still think the "number of allocation failures" needs refining, probably
via a range-based thing, as we've discussed. But the number of pages
failed per cma looks OK now.
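
For illustration, the kernel side of such a per-CMA counter could be a
sketch like the following; the struct and attribute names here
(cma_stat_kobject, nr_pages_failed, alloc_pages_fail) are illustrative
assumptions, not necessarily what the patch uses:

#include <linux/atomic.h>
#include <linux/kernel.h>
#include <linux/kobject.h>
#include <linux/sysfs.h>

/* One counter pair per CMA area, hung off a dedicated kobject
 * registered under /sys/kernel/mm/cma/<area-name>/. */
struct cma_stat_kobject {
	struct kobject kobj;
	atomic64_t nr_pages_succeeded;
	atomic64_t nr_pages_failed;
};

/* cma_alloc() would atomic64_add() the page count into nr_pages_failed
 * on its failure path, and into nr_pages_succeeded on success. */

static ssize_t alloc_pages_fail_show(struct kobject *kobj,
				     struct kobj_attribute *attr, char *buf)
{
	struct cma_stat_kobject *stat =
		container_of(kobj, struct cma_stat_kobject, kobj);

	return sysfs_emit(buf, "%llu\n",
			  (unsigned long long)atomic64_read(&stat->nr_pages_failed));
}
static struct kobj_attribute alloc_pages_fail_attr =
	__ATTR_RO(alloc_pages_fail);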



thanks,
--
John Hubbard
NVIDIA