Re: [PATCH] cpuset: fix allocating page cache/slab object on the unallowed node when memory spread is set

From: Miao Xie
Date: Wed Feb 04 2009 - 04:05:51 EST


on 2009-2-4 6:16 Andrew Morton wrote:
> On Tue, 03 Feb 2009 11:25:25 +0800
> Miao Xie <miaox@xxxxxxxxxxxxxx> wrote:
>
>> on 2009-1-28 6:42 Andrew Morton wrote:
>>> On Wed, 21 Jan 2009 16:06:20 +0800
>>> Miao Xie <miaox@xxxxxxxxxxxxxx> wrote:
>>>
>>>> The task still allocated page cache on the old node after its cpuset's mems
>>>> were modified when 'memory_spread_page' was set. This is caused by the task's
>>>> stale mem_allowed_list: the current kernel doesn't update it unless some
>>>> function invokes cpuset_update_task_memory_state(), which is sometimes too late.
>>>> We must update the mem_allowed_list of the tasks in time.
>>>>
>>>> Slab has the same problem.
>>>>
>>>> We fix the bug by updating the tasks' mem_allowed_list and spread flag after
>>>> their cpuset's mems or spread flag is changed.
>>>>
>>>>
>>>> ...
>>>>
>>>> --- a/kernel/kthread.c
>>>> +++ b/kernel/kthread.c
>>>> @@ -242,6 +242,7 @@ int kthreadd(void *unused)
>>>> set_user_nice(tsk, KTHREAD_NICE_LEVEL);
>>>> set_cpus_allowed_ptr(tsk, CPU_MASK_ALL_PTR);
>>>>
>>>> + current->mems_allowed = node_possible_map;
>>>> current->flags |= PF_NOFREEZE | PF_FREEZER_NOSIG;
>>> Why this change? kthreadd() is called from rest_init(), before anyone
>>> has had a chance to alter ->mems_allowed?
>> When working on this patch, I found that the mems_allowed of kthreadd was not
>> initialized: every bit of it is 1, so...
>> Maybe it is redundant.
>
> I think it is redundant. kthreadd's mems_allowed _should_ be all-ones.
> Or at least, all-nodes-allowed.
>
> I wasn't able to find out where the setting of init's mems_allowed
> happens, after a bit of grepping and hunting. It should be done within
> INIT_TASK, but isn't.

I found it. The call trace is as follows:
    start_kernel()
      ->build_all_zonelists()
        ->cpuset_init_current_mems_allowed()
          ->nodes_setall(current->mems_allowed);
Then it is inherited by kthreadd.

>
> Still, kthreadd is reliably parented by swapper, and there should be no
> need to alter its mems_allowed.

But an all-ones mask looks strange to users, because the system does not
actually have that many nodes, so I think it is better to use
node_possible_map to set init's mems_allowed.
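
To illustrate the difference, here is a rough userspace sketch. It is not
kernel code: the toy_* names, the 64-bit mask and the two-node machine are
all assumptions made up for the example. It only models "init gets a mask,
kthreadd inherits a copy" for the all-ones case and for the possible-map case.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy userspace model only. The toy_* names are hypothetical stand-ins,
 * not the kernel's nodemask API, and the mask width is fixed at 64 here.
 */
#define TOY_MAX_NUMNODES 64

typedef struct { uint64_t bits; } toy_nodemask_t;

struct toy_task { toy_nodemask_t mems_allowed; };

/* Set every bit, like the nodes_setall() step in the call trace. */
static void toy_nodes_setall(toy_nodemask_t *m)
{
	m->bits = ~UINT64_C(0);
}

/* Mark one node as present in the mask. */
static void toy_node_set(int node, toy_nodemask_t *m)
{
	assert(node >= 0 && node < TOY_MAX_NUMNODES);
	m->bits |= UINT64_C(1) << node;
}

/* Count how many nodes are set in the mask. */
static int toy_nodes_weight(const toy_nodemask_t *m)
{
	uint64_t v = m->bits;
	int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* A child task starts with a copy of its parent's mask. */
static struct toy_task toy_fork(const struct toy_task *parent)
{
	return *parent;
}

/* All-ones case: init gets every bit set and kthreadd inherits it. */
static struct toy_task toy_kthreadd_setall(void)
{
	struct toy_task init;

	toy_nodes_setall(&init.mems_allowed);
	return toy_fork(&init);
}

/* Possible-map case: init starts from the possible-node map instead. */
static struct toy_task toy_kthreadd_possible(const toy_nodemask_t *possible)
{
	struct toy_task init;

	init.mems_allowed = *possible;
	return toy_fork(&init);
}
```

With a possible map that has only nodes 0 and 1 set, toy_nodes_weight()
gives 64 for the first variant and 2 for the second in this model, which is
the mismatch that users would see.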

> Similarly, what was the reason for setting current->mems_allowed in
> kernel_init()? That also should be unneeded.

The reason is the same as above.

>
> Finally, I've somewhat lost track of where we are with this patch.
> Paul, do you see any other remaining issues?
>
>
>
>

