RE: [PATCH 24/32] Task fork and exit for rdtgroup

From: Thomas Gleixner
Date: Wed Jul 13 2016 - 17:05:11 EST


On Wed, 13 Jul 2016, Yu, Fenghua wrote:
> On Wed, July 2016, Thomas Gleixner wrote
> > On Tue, 12 Jul 2016, Fenghua Yu wrote:
> > > +void rdtgroup_post_fork(struct task_struct *child) {
> > > + if (!use_rdtgroup_tasks)
> > > + return;
> > > +
> > > + spin_lock_irq(&rdtgroup_task_lock);
> > > + if (list_empty(&child->rg_list)) {
> >
> > Why would the list be non empty after a fork?
>
> In this situation for a pid:
> 1.rdtgroup_fork(): rg_list=null.
> 2.setup_task_rg_lists(): rg_list is setup
> 3.rdtgroup_fork(): rg_list is not empty

Why would rdtgroup_fork() be called twice for a given thread?

> This situation happens only during rscctrl mount time. Before mount, post_fork()
> returns from !use_rdtgroup_tasks and doesn't set up rg_list. After mount, rg_list()
> is always empty in post_fork(). But we need to check rg_list for above situation.
>
> Does that make sense?

No, that does not make any sense at all.

> Any suggestion for better soluation?

The problem you have is:

fork
list_init(rg_list);
write_lock(tasklist_lock);

task becomes visible

write_unlock(tasklist_lock);

rdtgroup_post_fork();
if (!use_rdtgroup_tasks)
return;

spin_lock_irq(&rdtgroup_task_lock);
list_add();
spin_unlock_irq(&rdtgroup_task_lock);

I have no idea why this lock must be taken with _irq, but thats another
story. Let's look at the mount side:

spin_lock_irq(&rdtgroup_task_lock);
read_lock(&tasklist_lock);

do_each_thread(g, p) {
WARN_ON(More magic crap happening there)

spin_lock_irq(&p->sighand->siglock);
list_add();
spin_unlock_irq(&p->sighand->siglock);
^^^^
Great: You undo the irq disable of (&rdtgroup_task_lock) above! Oh well....

read_unlock(&tasklist_lock);
spin_unlock_irq(&rdtgroup_task_lock);

So you need all this magic in rdtgroup_post_fork() and setup_task_rg_lists()
just because you blindly positioned rdtgroup_post_fork() at the point where
the cgroup_post_fork() stuff is. But you did not think a second about the
locking rules here otherwise they would be documented somewhere.

You need a read_lock(&tasklist_lock) for the mount part anyway. So why don't
you do the obvious:

fork
list_init(rg_list);
write_lock(tasklist_lock);

rdtgroup_post_fork();
if (use_rdtgroup_tasks)
spin_lock(&rdtgroup_task_lock);
list_add();
spin_unlock(&rdtgroup_task_lock);

task becomes visible

write_unlock(tasklist_lock);

And reorder the lock ordering in the mount path:

read_lock(&tasklist_lock);
spin_lock(&rdtgroup_task_lock);

Now using rdtgroup_task_lock to protect current->rdtgroup is horrible as
well. You need task->sighand->siglock in the mount path anyway to prevent exit
races. So you can simplify the whole magic to:

fork
list_init(rg_list);
write_lock(tasklist_lock);

spin_lock(&current->sighand->siglock);

rdtgroup_post_fork();
if (use_rdtgroup_tasks)
list_add();

spin_unlock(&current->sighand->siglock);
write_unlock(tasklist_lock);

That removes an extra lock/unlock operation from the fork path because
current->sighand->siglock is taken inside of the tasklist_lock write locked
section already.

So you need protection for use_rdtgroup_task, which is a complete misnomer
btw. (rdtgroup_active would be too obvious, right?). That protection is simple
because you can set that flag with tasklist_lock read locked which you hold
anyway for iterating all threads in the mount path.

Aside of that you need to take tsk->sighand->siglock when you change
tsk->rdtgroup, but that's a no-brainer and it gives you the extra benefit that
you can protect such an operation against exit of the task that way by
checking PF_EXITING under the lock. I don't see any protection against exit in
your current implementation when a task is moved to a different partition.

Please sit down and describe the complete locking and protection scheme of
this stuff. I'm not going to figure this out from the obscure code another
time.

Thanks,

tglx