Re: [patch 2/2] bootmem: Node-setup agnostic free_bootmem()

From: Yinghai Lu
Date: Tue Apr 15 2008 - 14:44:41 EST


On Tue, Apr 15, 2008 at 4:53 AM, Johannes Weiner <hannes@xxxxxxxxxxxx> wrote:
>
> Andi Kleen <andi@xxxxxxxxxxxxxx> writes:
>
> > Andrew Morton wrote:
> >> On Sun, 13 Apr 2008 18:56:57 +0200 Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> >>
> >>> Johannes Weiner <hannes@xxxxxxxxxxxx> writes:
> >>>
> >>>> Make free_bootmem() look up the node holding the specified address
> >>>> range which lets it work transparently on single-node and multi-node
> >>>> configurations.
> >>> Acked-by: Andi Kleen <andi@xxxxxxxxxxxxxx>
> >>>
> >>> This is far better than the original change it replaces and which
> >>> I also objected to in review.
> >>>
> >>
> >> So... do we think these two patches are sufficiently safe and important for
> >> 2.6.25?
> >
> > It's only strictly needed for .26 I think for some (also slightly
> > dubious) changes queued in git-x86.
>
> Does anything yet rely on this new free_bootmem() behaviour? If not,
> the safest thing would be to just revert the original patch in mainline
> and drop the second patch completely.

1. free_bootmem(ramdisk_image, ramdisk_size) in setup_arch() on x86_64 needs
   the new behaviour.
2. Another patch in x86.git needs it as well.
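
For readers who have not looked at the patch: a minimal user-space sketch of
what "node-agnostic" means here. The struct and function names below are
hypothetical stand-ins, not the kernel's bootmem types, and rejecting a range
that spans two nodes is an assumption of this sketch, not a statement about
the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-node descriptor; NOT the kernel's bootmem_data_t. */
struct node_range {
    unsigned long start;    /* first byte owned by the node */
    unsigned long end;      /* one past the last byte owned by the node */
    unsigned long reserved; /* bytes currently reserved on this node */
};

/* Return the index of the node that fully contains [addr, addr + size),
 * or -1 if no single node holds the whole range. */
static int find_node(const struct node_range *nodes, size_t n,
                     unsigned long addr, unsigned long size)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= nodes[i].start && addr < nodes[i].end &&
            size <= nodes[i].end - addr)
            return (int)i;
    }
    return -1;
}

/* Free a range without the caller naming a node: look the node up first,
 * then adjust that node's accounting. Returns the node index or -1. */
static int sketch_free(struct node_range *nodes, size_t n,
                       unsigned long addr, unsigned long size)
{
    int nid = find_node(nodes, n, addr, size);

    if (nid < 0)
        return -1;
    /* Sanity check for the toy model: cannot free more than is reserved. */
    assert(nodes[nid].reserved >= size);
    nodes[nid].reserved -= size;
    return nid;
}
```

The caller passes only an address and a size, which is what the
free_bootmem(ramdisk_image, ramdisk_size) call in item 1 relies on.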

YH

commit f62f1fc9ef94f74fda2b456d935ba2da69fa0a40
Author: Yinghai Lu <yhlu.kernel@xxxxxxxxx>
Date: Fri Mar 7 15:02:50 2008 -0800

x86: reserve dma32 early for gart

A system with 256 GB of RAM crashes in the following way when NUMA is
disabled:

Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Cannot allocate aperture memory hole (ffff8101c0000000,65536K)
Kernel panic - not syncing: Not enough memory for aperture
Pid: 0, comm: swapper Not tainted 2.6.25-rc4-x86-latest.git #33

Call Trace:
[<ffffffff84037c62>] panic+0xb2/0x190
[<ffffffff840381fc>] ? release_console_sem+0x7c/0x250
[<ffffffff847b1628>] ? __alloc_bootmem_nopanic+0x48/0x90
[<ffffffff847b0ac9>] ? free_bootmem+0x29/0x50
[<ffffffff847ac1f7>] gart_iommu_hole_init+0x5e7/0x680
[<ffffffff847b255b>] ? alloc_large_system_hash+0x16b/0x310
[<ffffffff84506a2f>] ? _etext+0x0/0x1
[<ffffffff847a2e8c>] pci_iommu_alloc+0x1c/0x40
[<ffffffff847ac795>] mem_init+0x45/0x1a0
[<ffffffff8479ff35>] start_kernel+0x295/0x380
[<ffffffff8479f1c2>] _sinittext+0x1c2/0x230

The root cause: the memmap is too big, for example
[ffffe200e0600000-ffffe200e07fffff] PMD ->ffff81383c000000 on node 0
It comes close to 4G, so vmemmap_alloc_block() uses up the RAM under 4G.

There are two possible solutions:
1. Make the memmap allocation take memory above 4G.
2. Reserve some DMA32 range early, before setting up the memmap for all
   memory, and release it before pci_iommu_alloc(), so that gart or swiotlb
   is guaranteed to get a range under the 4G limit.

This patch uses method 2, because method 1 may need more code to handle
SPARSEMEM and SPARSEMEM_VMEMMAP.
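
The ordering behind method 2 can be shown with a toy model. This is only a
sketch: the pool is a single counter, the sizes are in MB, and the 1024 and
128 figures are made-up illustration values, not taken from the patch. All
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the memory below the 4G limit. */
struct low_pool {
    unsigned long free_mb; /* MB still available below the limit */
    unsigned long held_mb; /* MB set aside by the early reservation */
};

static bool pool_alloc(struct low_pool *p, unsigned long mb)
{
    if (mb > p->free_mb)
        return false;
    p->free_mb -= mb;
    return true;
}

/* Set a chunk aside before the big early allocations run. */
static bool early_reserve(struct low_pool *p, unsigned long mb)
{
    if (!pool_alloc(p, mb))
        return false;
    p->held_mb += mb;
    return true;
}

/* Hand the chunk back just before the IOMMU code needs low memory. */
static void early_release(struct low_pool *p)
{
    p->free_mb += p->held_mb;
    p->held_mb = 0;
}

/* Returns true if a 64 MB aperture can still be allocated at the end. */
static bool simulate_boot(bool hold_back)
{
    struct low_pool pool = { .free_mb = 1024, .held_mb = 0 };

    if (hold_back)
        early_reserve(&pool, 128);   /* assumed size, illustration only */

    /* Worst case: memmap setup consumes everything left below the limit. */
    pool_alloc(&pool, pool.free_mb);

    if (hold_back)
        early_release(&pool);
    assert(pool.held_mb == 0);

    return pool_alloc(&pool, 64);    /* 64 MB aperture, as in the log */
}
```

In this model simulate_boot(false) fails the final allocation, which
corresponds to the panic above, while simulate_boot(true) succeeds.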

With the patch applied, the boot log shows:
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 4000000
Memory: 264245736k/268959744k available (8484k kernel code,
4187464k reserved, 4004k data, 724k init)