Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree

From: Johannes Weiner
Date: Tue Apr 15 2008 - 15:55:33 EST


Hi,

"Yinghai Lu" <yhlu.kernel@xxxxxxxxx> writes:

> On Tue, Apr 15, 2008 at 5:51 AM, Johannes Weiner <hannes@xxxxxxxxxxxx> wrote:
>> Hi Ingo,
>>
>>
>>
>> Ingo Molnar <mingo@xxxxxxx> writes:
>>
>> > * akpm@xxxxxxxxxxxxxxxxxxxx <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>> >
>> >> Subject: bootmem: node-setup agnostic free_bootmem()
>> >> From: Johannes Weiner <hannes@xxxxxxxxxxxx>
>> >>
>> >> Make free_bootmem() look up the node holding the specified address
>> >> range which lets it work transparently on single-node and multi-node
>> >> configurations.
>> >
>> > this patch does not fix the bug Yinghai's (now dropped) patches solved:
>> > reserve_early() allocations. So NAK until the full problem has been
>> > sorted out ...
>>
>> Okay, NAK on -mm and -x86 for sure. The patch was meant for mainline
>> where there is no need for free_bootmem() going across nodes, right?
>>
>> But I still object to the way Yinghai implemented it.
>> free_bootmem_core() should not be twisted like this.
>>
>> How about the following (untested, even uncompiled, but you should get
>> the idea) proposal which would replace the patch discussed in this
>> thread:
>>
>> --- tree-linus.orig/mm/bootmem.c
>> +++ tree-linus/mm/bootmem.c
>> @@ -421,7 +421,25 @@ int __init reserve_bootmem(unsigned long
>>
>>  void __init free_bootmem(unsigned long addr, unsigned long size)
>>  {
>> -	free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
>> +	bootmem_data_t *bdata;
>> +
>> +	list_for_each_entry(bdata, &bdata_list, list) {
>> +		unsigned long remainder = 0;
>> +
>> +		if (addr < bdata->node_boot_start)
>> +			continue;
>> +
>> +		if (PFN_DOWN(addr + size) > bdata->node_low_pfn)
>> +			remainder = PFN_DOWN(addr + size) - bdata->node_low_pfn;
>> +
>> +		size -= PFN_PHYS(remainder);
>> +		free_bootmem_core(bdata, addr, size);
>> +
>> +		if (!remainder)
>> +			break;
>> +
>> +		addr = PFN_PHYS(bdata->node_low_pfn + 1);
>> +	}
>>  }
>>
>>  unsigned long __init free_all_bootmem(void)
>
> how about
> 1. bdata is not sorted?

They are kept in a sorted list. How could they be unsorted?
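
To illustrate what "kept sorted" means here: as far as I remember, link_bootmem() puts each new descriptor in front of the first entry with a larger start address. A minimal user-space sketch of that idea follows. It is not the kernel code; struct region, its fields and region_link() are hypothetical stand-ins for bootmem_data_t, node_boot_start and the kernel list helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for bootmem_data_t: one descriptor per node. */
struct region {
	unsigned long start;	/* plays the role of node_boot_start */
	struct region *next;
};

/* Insert 'r' so that the list stays sorted by ascending start address. */
static void region_link(struct region **head, struct region *r)
{
	struct region **pos = head;

	assert(r != NULL);

	/* Skip every entry that starts at or below the new one. */
	while (*pos && (*pos)->start <= r->start)
		pos = &(*pos)->next;

	r->next = *pos;
	*pos = r;
}
```

With insertion done like this, the order in which the nodes register does not matter; a walk over the list always sees ascending start addresses.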

> 2. Intel cross-node box: node0: 0g-2g, 4g-6g; node1: 2g-4g, 6g-8g. I
> don't think they have two bdata structs for every node.

How do the bdata structures represent this setup right now? Are you
sure that there is not a node descriptor for every contiguous region?
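
If there is one descriptor per contiguous region, walking the sorted list and freeing the range piece by piece is straightforward. Here is a user-space sketch of that walk under exactly this assumption. struct region, region_free() and range_free() are hypothetical names; region_free() stands in for free_bootmem_core() and only records how many bytes it was handed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a node descriptor covering [start, end). */
struct region {
	unsigned long start;
	unsigned long end;
	unsigned long freed;	/* bytes handed to this region so far */
};

/* Stand-in for free_bootmem_core(): the chunk must lie inside the region. */
static void region_free(struct region *r, unsigned long addr,
			unsigned long size)
{
	assert(addr >= r->start && addr + size <= r->end);
	r->freed += size;
}

/*
 * Free [addr, addr + size) across 'n' regions sorted by ascending start.
 * Returns the number of bytes that no region covered.
 */
static unsigned long range_free(struct region *regs, size_t n,
				unsigned long addr, unsigned long size)
{
	unsigned long end = addr + size;
	unsigned long missed = 0;
	size_t i;

	for (i = 0; i < n && addr < end; i++) {
		struct region *r = &regs[i];
		unsigned long chunk_end;

		if (addr >= r->end)
			continue;	/* range begins after this region */

		if (addr < r->start) {
			if (end <= r->start)
				break;	/* rest of the range lies in a hole */
			missed += r->start - addr;
			addr = r->start;
		}

		/* Clip the chunk to this region, then move on to the next. */
		chunk_end = end < r->end ? end : r->end;
		region_free(r, addr, chunk_end - addr);
		addr = chunk_end;
	}

	return missed + (end - addr);
}
```

Keeping a fixed end address and advancing addr avoids having to track size and a remainder separately. If a node can own several disjoint ranges under a single descriptor, this simple walk no longer applies as is.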

Hannes