Re: [Xen-devel] [PATCH] xen/balloon: don't online new memory initially

From: Juergen Gross
Date: Tue Jul 04 2017 - 01:43:19 EST


On 03/07/17 20:44, Igor Druzhinin wrote:
> On 03/07/17 16:40, Juergen Gross wrote:
>> When setting up the Xenstore watch for the memory target size the new
>> watch will fire at once. Don't try to reach the configured target size
>> by onlining new memory in this case, as the current memory size will
>> be smaller in almost all cases due to e.g. BIOS reserved pages.
>>
>> Onlining new memory will lead to more problems e.g. undesired conflicts
>> with NVMe devices meant to be operated as block devices.
>>
>> Instead remember the difference between target size and current size
>> when the watch fires for the first time and apply it to any further
>> size changes, too.
>>
>> In order to avoid races between balloon.c and xen-balloon.c init calls
>> do the xen-balloon.c initialization from balloon.c.
>>
>> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
>> ---
>> drivers/xen/balloon.c | 3 +++
>> drivers/xen/xen-balloon.c | 20 ++++++++++++--------
>> include/xen/balloon.h | 8 ++++++++
>> 3 files changed, 23 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
>> index 50dcb68d8070..ab609255a0f3 100644
>> --- a/drivers/xen/balloon.c
>> +++ b/drivers/xen/balloon.c
>> @@ -780,6 +780,9 @@ static int __init balloon_init(void)
>> }
>> #endif
>>
>> + /* Init the xen-balloon driver. */
>> + xen_balloon_init();
>> +
>> return 0;
>> }
>> subsys_initcall(balloon_init);
>> diff --git a/drivers/xen/xen-balloon.c b/drivers/xen/xen-balloon.c
>> index e7715cb62eef..66ec519c825c 100644
>> --- a/drivers/xen/xen-balloon.c
>> +++ b/drivers/xen/xen-balloon.c
>> @@ -59,6 +59,8 @@ static void watch_target(struct xenbus_watch *watch,
>> {
>> unsigned long long new_target;
>> int err;
>> + static bool watch_fired;
>> + static unsigned long target_diff;
>>
>> err = xenbus_scanf(XBT_NIL, "memory", "target", "%llu", &new_target);
>> if (err != 1) {
>> @@ -69,7 +71,14 @@ static void watch_target(struct xenbus_watch *watch,
>> /* The given memory/target value is in KiB, so it needs converting to
>> * pages. PAGE_SHIFT converts bytes to pages, hence PAGE_SHIFT - 10.
>> */
>> - balloon_set_new_target(new_target >> (PAGE_SHIFT - 10));
>> + new_target >>= PAGE_SHIFT - 10;
>> + if (watch_fired) {
>> + balloon_set_new_target(new_target - target_diff);
>> + return;
>> + }
>> +
>> + watch_fired = true;
>> + target_diff = new_target - balloon_stats.target_pages;
>> }
>> static struct xenbus_watch target_watch = {
>> .node = "memory/target",
>> @@ -94,13 +103,8 @@ static struct notifier_block xenstore_notifier = {
>> .notifier_call = balloon_init_watcher,
>> };
>>
>> -static int __init balloon_init(void)
>> +void __init xen_balloon_init(void)
>> {
>> - if (!xen_domain())
>> - return -ENODEV;
>> -
>> - pr_info("Initialising balloon driver\n");
>> -
>> register_balloon(&balloon_dev);
>>
>> register_xen_selfballooning(&balloon_dev);
>> @@ -109,7 +113,7 @@ static int __init balloon_init(void)
>>
>> return 0;
>> }
>> -subsys_initcall(balloon_init);
>> +EXPORT_SYMBOL_GPL(xen_balloon_init);
>>
>> #define BALLOON_SHOW(name, format, args...) \
>> static ssize_t show_##name(struct device *dev, \
>> diff --git a/include/xen/balloon.h b/include/xen/balloon.h
>> index d1767dfb0d95..8906361bb50c 100644
>> --- a/include/xen/balloon.h
>> +++ b/include/xen/balloon.h
>> @@ -35,3 +35,11 @@ static inline int register_xen_selfballooning(struct device *dev)
>> return -ENOSYS;
>> }
>> #endif
>> +
>> +#ifdef CONFIG_XEN_BALLOON
>> +void xen_balloon_init(void);
>> +#else
>> +static inline void xen_balloon_init(void)
>> +{
>> +}
>> +#endif
>>
>
> We came across the same issue just recently. The problem was that for
> some kernel versions DMA buffers for emulated devices are allocated in
> this recently hotplugged area. This area is not properly described for
> QEMU so when a DMA request comes in QEMU treats it as "unassigned" and
> skips by default. This eventually leads to cryptic failures of system
> loading.
>
> Internally we developed a workaround for QEMU with which we try to
> satisfy all the "unassigned" requests. But it doesn't solves the problem
> in a proper way IMHO.
>
> I haven't not completely understood your use-case but we might try come
> up with a general solution for both of the problems because they are
> obviously related.
>
>> Onlining new memory will lead to more problems e.g. undesired conflicts
>> with NVMe devices meant to be operated as block devices.
>
> Could you explain this in more detail?

Please see

https://lists.xen.org/archives/html/xen-devel/2017-03/msg03020.html

for a more detailed discussion.


Juergen