Re: [PATCH] Add __GFP_MOVABLE for callers to flag allocations thatmay be migrated

From: Mel Gorman
Date: Tue Dec 05 2006 - 05:03:51 EST


On Tue, 5 Dec 2006, KAMEZAWA Hiroyuki wrote:

Hi, your plan looks good to me.

Thanks.

some comments.

On Mon, 4 Dec 2006 23:45:32 +0000 (GMT)
Mel Gorman <mel@xxxxxxxxx> wrote:
1. Use lumpy-reclaim to intelligently reclaim contigous pages. The same
logic can be used to reclaim within a PFN range
2. Merge anti-frag to help high-order allocations, hugetlbpage
allocations and freeing up SPARSEMEM sections of memory

For freeing up SPARSEMEM sections of memory ?

Yes.

It looks that you assumes MAX_ORDER_NR_PAGES equals to PAGES_PER_SECTION.
plz don't assume that when you talk about generic arch code.


Yes, I was making the assumption that MAX_ORDER would be increased when memory hot-remove was possible so that MAX_ORDER_NR_PAGES == PAGES_PER_SECTION. I think it would be a reasonable restriction unless section sizes can get really large.

3. Anti-frag includes support for page flags that affected a MAX_ORDER block
of pages. These flags can be used to mark a section of memory that should
not be allocated from. This is of interest to both hugetlb page allocatoin
and memory hot-remove. Use the flags to mark a MAX_ORDER_NR_PAGES that
is currently being freed up and shouldn't be allocated.

4. Use anti-frag fallback logic to bias unmovable allocations to the lower
PFNs.

I think this can be one of the most diffcult things ;)


It depends on how aggressive the bias is. If it's just "try and keep them low but don't work too hard", then it's a simple case of searching the free list at the order we are falling back to. If the lowest MAX_ORDER_NR_PAGES must be reclaimed for use by unmovable allocations, it gets harder but it's not impossible.

5. Add arch support where possible for offlining sections of memory that
can be powered down.

I had a patch for ACPI-memory-hot-unplug, which ties memory sections to memory
chunk on ACPI.


Great, I had a strong feeling something like that existed.

6. Add arch support where possible to power down a DIMM when the memory
sections that make it up have been offlined. This is an extenstion of
step 5 only.
7. Add a zone that only allows __GFP_MOVABLE allocations so that
sections can 100% be reclaimed and powered-down
8. Allow nodes to only have a zone for __GFP_MOVABLE allocations so that
whole nodes can be offlined.

I (numa-node-hotplug) needs 7 and 8 basically. And Other people may not.

Power would be more interested in 2 for SPARSEMEM sections. Also, while getting 7 and 8 right will be important, it won't help stuff like the e1000 problem which is why I put it towards the end. I'm going to relook at the adding-zone patches today and see if they can be brought forward so we can take a proper look.

IMHO:
For DIMM unplug, I suspect that we'll finally divide memory to core-memory and
hot-pluggable by pgdat even on SMP. If we use pgdat for that purpose, almost all
necessary infrastructure(statistics ,etc..) is ready.
But if we find better way, it's good.


That is one possibility. There are people working on fake nodes for containers at the moment. If that pans out, the infrastructure would be available to create one node per DIMM.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/