Re: [BUG] set_mempolicy(MPOL_INTERLEAV) cause kernel panic

From: KAMEZAWA Hiroyuki
Date: Fri Jul 24 2009 - 21:34:02 EST


Andrew Morton wrote:
> On Fri, 24 Jul 2009 15:51:51 -0700 (PDT)
> David Rientjes <rientjes@xxxxxxxxxx> wrote:

> afaik we don't have a final patch for this. I asked Motohiro-san about
> this and he's proposing that we revert the offending change (which one
> was it?) if nothing gets fixed soon - the original author is on a
> lengthy vacation.
>
>
> If we _do_ have a patch then can we start again? Someone send out the
> patch and let's take a look at it.
Hmm, like this? (I cleaned up David's patch, because we shouldn't put an
extra nodemask_t on the stack.)

Problems are:
- rebind() may be broken, but I have no good idea for fixing it.
  (It seems to be broken in old kernels, too.)
- Only a user whose SRAT lists possible (not-online) nodes can test this.

==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>

When setting a mempolicy's nodemask (or node id), we need to guarantee
that each node id is online. But a cpuset's nodemask may contain possible
but not-online nodes, which can cause an access to NODE_DATA(nid) of a
not-online node.

This patch fixes the mempolicy's nodemask to be a subset of the valid
nodes (N_HIGH_MEMORY).
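
As a rough illustration only (a userspace toy model, not kernel code): if
each nodemask is treated as a 64-bit bitmask, the intended masking could be
sketched as below. nodemask_model_t and effective_policy_mask() are
hypothetical names invented for this sketch, and the 64-node limit is an
assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model (assumption): one bit per node id, at most 64 nodes. */
typedef uint64_t nodemask_model_t;

/*
 * Restrict a user-requested mask to nodes that are both allowed by the
 * cpuset and known to have memory. In the kernel these would be
 * nodemask_t values; here they are plain bitmasks.
 */
nodemask_model_t effective_policy_mask(nodemask_model_t requested,
                                       nodemask_model_t cpuset_allowed,
                                       nodemask_model_t has_memory)
{
    /* First narrow the cpuset's mask to nodes that have memory. */
    nodemask_model_t context = cpuset_allowed & has_memory;
    nodemask_model_t result = requested & context;

    /* The result can never name a node without memory. */
    assert((result & ~has_memory) == 0);
    return result;
}
```

With a cpuset allowing nodes 0-3 and only nodes 0-1 having memory, a request
for nodes 2-3 comes back empty in this model instead of naming a node that
has no pgdat.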

But there are 2 cases for setting a policy's mask:
- new
- rebind
The difficult case is rebind. In this patch, if the relationship between
the new cpuset's nodemask and the policy's mask is invalid, we just use
the cpuset's mask.
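
The rebind fallback described above can be sketched in userspace as follows.
This is a simplified model under stated assumptions, not the kernel
implementation: nodemask_model_t and rebind_policy_mask() are hypothetical
names, masks are 64-bit bitmasks, and "valid" is approximated as "contains
at least one node with memory".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (assumption): one bit per node id, at most 64 nodes. */
typedef uint64_t nodemask_model_t;

/*
 * Choose the mask to install after a rebind.
 *   remapped:    policy mask recomputed against the new cpuset
 *   new_cpuset:  the new cpuset's allowed nodes (assumed online)
 *   has_memory:  nodes that currently have memory
 *   strict_bind: true for an MPOL_BIND-like policy
 *   fell_back:   set to true when the cpuset's mask was used instead
 */
nodemask_model_t rebind_policy_mask(nodemask_model_t remapped,
                                    nodemask_model_t new_cpuset,
                                    nodemask_model_t has_memory,
                                    bool strict_bind,
                                    bool *fell_back)
{
    assert(fell_back != NULL);

    /* Empty mask, or a strict bind with no usable node, is "bad". */
    bool bad = (remapped == 0) ||
               (strict_bind && (remapped & has_memory) == 0);

    *fell_back = bad;
    return bad ? new_cpuset : remapped;
}
```

A caller would warn when *fell_back is set, which mirrors the printk in the
rebind hunk of the patch.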

Based on David Rientjes's patch.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
---
mm/mempolicy.c | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)

Index: mmotm-2.6.31-Jul16/mm/mempolicy.c
===================================================================
--- mmotm-2.6.31-Jul16.orig/mm/mempolicy.c
+++ mmotm-2.6.31-Jul16/mm/mempolicy.c
@@ -204,12 +204,22 @@ static int mpol_set_nodemask(struct memp
if (pol->mode == MPOL_PREFERRED && nodes_empty(*nodes))
nodes = NULL; /* explicit local allocation */
else {
+ /*
+ * Here, we mask this new nodemask with N_HIGH_MEMORY.
+ * One issue is memory hotplug. For now, we don't update this
+ * at hot-add; that should be fixed. At hot-remove, we don't
+ * remove the pgdat itself, so although we should update this,
+ * we'll never see terrible bugs. Leaving it as it is for now.
+ */
+ nodes_and(cpuset_context_nmask, cpuset_current_mems_allowed,
+ node_states[N_HIGH_MEMORY]);
+ /* should we call is_valid_nodemask() here? */
if (pol->flags & MPOL_F_RELATIVE_NODES)
mpol_relative_nodemask(&cpuset_context_nmask, nodes,
- &cpuset_current_mems_allowed);
+ &cpuset_context_nmask);
else
nodes_and(cpuset_context_nmask, *nodes,
- cpuset_current_mems_allowed);
+ cpuset_context_nmask);
if (mpol_store_user_nodemask(pol))
pol->w.user_nodemask = *nodes;
else
@@ -290,7 +300,16 @@ static void mpol_rebind_nodemask(struct
*nodes);
pol->w.cpuset_mems_allowed = *nodes;
}
-
+ /*
+ * At rebind, the passed *nodes is guaranteed to be online, but the
+ * calculated nodemask can be empty or invalid. Print a WARNING and
+ * use the cpuset's mask.
+ */
+ if (nodes_empty(tmp) ||
+ (pol->mode == MPOL_BIND && !is_valid_nodemask(&tmp))) {
+ tmp = *nodes;
+ printk(KERN_WARNING "relation among cpuset/mempolicy goes bad.\n");
+ }
pol->v.nodes = tmp;
if (!node_isset(current->il_next, tmp)) {
current->il_next = next_node(current->il_next, tmp);
