Re: RFC [PATCH v4 2/7] Enable balloon drivers to report inflated memory

From: Alexander Atanasov
Date: Fri Oct 14 2022 - 09:33:15 EST


On 14.10.22 16:01, David Hildenbrand wrote:
On 14.10.22 14:50, Alexander Atanasov wrote:
Hello,

On 11.10.22 12:23, David Hildenbrand wrote:
Sounds to me that all you want is some notifier to be called from
adjust_managed_page_count(). What am I missing?

Notifier will act as an accumulator to report size of change and it
will make things easier for the drivers and users wrt locking.
Notifier is similar to the memory hotplug notifier.

Overall, I am not convinced that there is any value of separating the
value
and the notifier. You can batch both or not batch both. In addition,
as I
mentioned, having two values seems racy.

I have identified two users so far above - may be more to come.
One type needs the value to adjust. Also having the value is necessary
to report it to users and oom. There are options with callbacks and so
on but it will complicate things with no real gain. You are right about
the atomicity but i guess if that's a problem for some user it could
find a way to ensure it. i am yet to find such place.


I haven't followed the whole discussion, but I just wanted to raise that
having a generic mechanism to notify on such changes could be valuable.

For example, virtio-mem also uses adjust_managed_page_count() and might
sometimes not trigger memory hotplug notifiers when adding more memory
(essentially, when it fake-adds memory part of an already added Linux
memory block).

What might make sense is schedule some kind of deferred notification on
adjust_managed_page_count() changes. This way, we could notify without
caring about locking and would naturally batch notifications.

adjust_managed_page_count() users would not require changes.

Making it deferred will bring issues for both the users of the
adjust_managed_page_count and the receivers of the notification -
locking as first. And it is hard to know when the adjustment will
finish, some of the drivers wait and retry in blocks. It will bring
complexity and it will not be possible to convert users in small steps.

What exactly is the issue about handling that deferred? Who needs an immediate, 100% precise notification? >
Locking from a separate workqueue shouldn't be too hard, or what am i missing?


We do not need immediate but most of the current callers of adjust_managed_page_count work in +1/-1 updates - so we want to defer the notification until they are done with changes. Deferring to a wq is not the problem, it would need to be done most likely.



Other problem is that there are drivers that do not use
adjust_managed_page_count().

Which ones? Do we care?

VMWare and Virtio balloon drivers. I recently proposed to unify them and the objection was that it would break existing users - which is valid so we must care i guess.

--
Regards,
Alexander Atanasov