Re: [PATCH 2/4] x86, numa: Do not adjust start/end for early_node_mem()

From: Tejun Heo
Date: Tue Feb 22 2011 - 05:37:05 EST


On Mon, Feb 21, 2011 at 12:28:20PM -0800, Yinghai Lu wrote:
> > Hmmm... thinking more about it, there actually is a difference.
> > Depending on configuration, the new code allows node_data[] to be
> > allocated below DMA boundary. I think we need to keep the first if().
> > Areas crossing the boundaries is okay, in fact, the original code
> > already allowed that when the NUMA affine allocation failed; however,
> > node_data[] was never allowed below the DMA boundary and I think it
> > shouldn't be.
>
> No. when those code were added before. it was bottom-up allocation from e820.
> Now with new memblock allocation. it will always try to do top down.
> will have no chance to get under DMA normally.
> except your first node only has < 16M.

NODE_MIN_SIZE is 4M. Crazy SRAT isn't unheard of. Even on a sane
configuration, given how the physical memory is laid out, an emulated
NUMA node can easily end up with no more than 16M of memory.
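
To make the arithmetic concrete, here is a minimal user-space sketch,
not kernel code. top_down_alloc() and below_dma_boundary() are
hypothetical helpers invented for illustration; the constants simply
mirror the 4M NODE_MIN_SIZE and the 16M DMA boundary mentioned in this
thread. It assumes a plain "highest aligned address that fits" policy,
which is only an approximation of what a top-down allocator does.

```c
#include <assert.h>
#include <stdint.h>

#define MiB            (1024ULL * 1024ULL)
#define DMA_BOUNDARY   (16 * MiB)  /* 16M DMA boundary discussed here */
#define NODE_MIN_SIZE  (4 * MiB)   /* value quoted in this thread */

/*
 * Hypothetical stand-in for a top-down allocator: return the highest
 * align-aligned address in [start, end) that can hold size bytes.
 * Returns 0 on failure (so address 0 itself is never handed out).
 */
static uint64_t top_down_alloc(uint64_t start, uint64_t end,
                               uint64_t size, uint64_t align)
{
    uint64_t addr;

    /* align must be a nonzero power of two */
    assert(align != 0 && (align & (align - 1)) == 0);

    if (size == 0 || end <= start || end - start < size)
        return 0;

    addr = (end - size) & ~(align - 1);
    return addr >= start ? addr : 0;
}

/* Nonzero if a successful allocation landed below the DMA boundary. */
static int below_dma_boundary(uint64_t addr)
{
    return addr != 0 && addr < DMA_BOUNDARY;
}
```

Under these assumptions, a 4K request against a node spanning [0, 8M)
lands at 8M - 4K, which is below the boundary even though the
allocation is top-down. The same request against [0, 64M) lands at
64M - 4K and stays clear of it.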

Also, your patch makes the code inconsistent. The NUMA-aware
allocation places no restriction on the range, but if it fails we fall
back to generic allocation with the DMA boundary limit.

So, don't remove the DMA boundary limit. Even if the current code
couldn't result in that (and the current code CAN), it is a good
sanity check to have because you don't want the memory map under
16MiB no matter what.
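
As a rough illustration of the consistent behaviour argued for here,
the sketch below applies the same lower bound to both the node-local
attempt and the generic fallback. It is a user-space model, not the
actual early_node_mem() code: find_top_down(), node_mem_alloc() and
their parameters are hypothetical names, and clamping the start
unconditionally is a simplifying assumption.

```c
#include <stdint.h>

#define MiB           (1024ULL * 1024ULL)
#define DMA_BOUNDARY  (16 * MiB)  /* 16MiB limit discussed in this thread */

static uint64_t max_u64(uint64_t a, uint64_t b)
{
    return a > b ? a : b;
}

/*
 * Hypothetical top-down range search: highest align-aligned address in
 * [start, end) that fits size bytes, or 0 on failure. align must be a
 * power of two.
 */
static uint64_t find_top_down(uint64_t start, uint64_t end,
                              uint64_t size, uint64_t align)
{
    uint64_t addr;

    if (size == 0 || end <= start || end - start < size)
        return 0;

    addr = (end - size) & ~(align - 1);
    return addr >= start ? addr : 0;
}

/*
 * Try the node's own range first, then any memory up to mem_end.
 * Both attempts start no lower than DMA_BOUNDARY, so the result is
 * either 0 (failure) or an address at or above 16MiB.
 */
static uint64_t node_mem_alloc(uint64_t node_start, uint64_t node_end,
                               uint64_t mem_end,
                               uint64_t size, uint64_t align)
{
    uint64_t addr;

    /* NUMA-affine attempt, clamped to the DMA boundary */
    addr = find_top_down(max_u64(node_start, DMA_BOUNDARY), node_end,
                         size, align);
    if (addr)
        return addr;

    /* generic fallback with the same lower limit */
    return find_top_down(DMA_BOUNDARY, mem_end, size, align);
}
```

In this model a node spanning [0, 8M) cannot satisfy the request
locally, so a 4K allocation falls back to the top of memory instead of
landing under 16MiB. If no memory exists above the boundary at all, the
call fails rather than dipping below it.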

Thanks.

--
tejun