Re: [PATCH v4 2/7] Enable balloon drivers to report inflated memory

From: Nadav Amit
Date: Thu Oct 06 2022 - 17:07:11 EST


On Oct 6, 2022, at 12:34 AM, Alexander Atanasov <alexander.atanasov@xxxxxxxxxxxxx> wrote:

> Hello,
>
>
> On 5.10.22 20:25, Nadav Amit wrote:
>> On Oct 5, 2022, at 2:01 AM, Alexander Atanasov <alexander.atanasov@xxxxxxxxxxxxx> wrote:
>>> Add counters to be updated by the balloon drivers.
>>> Create balloon notifier to propagate changes.
>> I missed the other patches before (including this one). Sorry, but next
>> time, please cc me.
>
> You are CCed in the cover letter since the version. I will add CC to you
> in the individual patches if you want so.

Thanks.

Just to clarify - I am not attacking you. It’s more of me making an excuse
for not addressing some issues in earlier versions.

>> I was looking through the series and I did not see actual users of the
>> notifier. Usually, it is not great to build an API without users.
>
>
> You are right. I hope to get some feedback/interest from potential users that i mentioned in the cover letter. I will probably split the notifier
> in separate series. To make it usefull it will require more changes.
> See bellow more about them.

Fair, but this is something that is more suitable for RFC. Otherwise, more
likely than not - your patches would go in as is.

>> [snip]
>>> +
>>> +static int balloon_notify(unsigned long val)
>>> +{
>>> + return srcu_notifier_call_chain(&balloon_chain, val, NULL);
>> Since you know the inflated_kb value here, why not to use it as an argument
>> to the callback? I think casting to (void *) and back is best. But you can
>> also provide pointer to the value. Doesn’t it sound better than having
>> potentially different notifiers reading different values?
>
> My current idea is to have a struct with current and previous value,
> may be change in percents. The actual value does not matter to anyone
> but the size of change does. When a user gets notified it can act upon
> the change - if it is small it can ignore it , if it is above some threshold it can act - if it makes sense for some receiver is can accumulate changes from several notification. Other option/addition is to have si_meminfo_current(..) and totalram_pages_current(..) that return values adjusted with the balloon values.
>
> Going further - there are few places that calculate something based on available memory that do not have sysfs/proc interface for setting limits. Most of them work in percents so they can be converted to do calculations when they get notification.
>
> The one that have interface for configuration but use memory values can be handled in two ways - convert to use percents of what is available or extend the notifier to notify userspace which in turn to do calculations and update configuration.

I really need to see code to fully understand what you have in mind.
Division, as you know, is not something that we really want to do very
frequently.

>> Anyhow, without users (actual notifiers) it’s kind of hard to know how
>> reasonable it all is. For instance, is it balloon_notify() supposed to
>> prevent further balloon inflating/deflating until the notifier completes?
>
> No, we must avoid that at any cost.
>
>> Accordingly, are callers to balloon_notify() expected to relinquish locks
>> before calling balloon_notify() to prevent deadlocks and high latency?
>
> My goal is to avoid any possible impact on performance. Drivers are free to delay notifications if they get in the way. (I see that i need to move the notification after the semaphore in the vmw driver - i missed that - will fix in the next iterration.)
> Deadlocks - depends on the users but a few to none will possibly have to deal with common locks.

I will need to see the next version to give better feedback. One more thing
that comes to mind though is whether saving the balloon size in multiple
places (both mem_balloon_inflated_total_kb and each balloon’s accounting) is
the right way. It does not sounds very clean.

Two other options is to move *all* the accounting to your new
mem_balloon_inflated_total_kb-like interface or expose some per-balloon
function to get the balloon size (indirect-function-call would likely have
some overhead though).

Anyhow, I am not crazy about having the same data replicated. Even from
reading the code stand-of-view it is not intuitive.