Re: [PATCH 5/6] Makes procs file writable to move all threads by tgid at once

From: Benjamin Blum
Date: Fri Jul 24 2009 - 16:54:04 EST

Next message: Andreas Dilger: "Re: fanotify - overall design before I start sending patches"
Previous message: david: "Re: fanotify - overall design before I start sending patches"
In reply to: Paul Menage: "Re: [PATCH 5/6] Makes procs file writable to move all threads by tgid at once"
Next in thread: Matt Helsley: "Re: [PATCH 5/6] Makes procs file writable to move all threads by tgid at once"

On Fri, Jul 24, 2009 at 1:47 PM, Paul Menage<menage@xxxxxxxxxx> wrote:
> On Fri, Jul 24, 2009 at 10:23 AM, Matt Helsley<matthltc@xxxxxxxxxx> wrote:
>>
>> Well, I imagine holding tasklist_lock is worse than cgroup_mutex in some
>> ways since it's used even more widely. Makes sense not to use it here..
>
> Just to clarify - the new "procs" code doesn't use cgroup_mutex for
> its critical section, it uses a new cgroup_fork_mutex, which is only
> taken for write during cgroup_proc_attach() (after all setup has been
> done, to ensure that no new threads are created while we're updating
> all the existing threads). So in general there'll be zero contention
> on this lock - the cost will be the cache misses due to the rwlock
> bouncing between the different CPUs that are taking it in read mode.

Right. The options so far are:

Global rwsem: needs only one lock, but blocks all forking while a
write is in progress. The write should be quick enough if it's just
"iterate down the threadgroup list in O(n)". In the good case, fork()
slows down by a cache miss when taking the lock in read mode.

Threadgroup-local rwsem: requires adding a field to task_struct. Only
forks within the same threadgroup would block on a write to the procs
file, and the zero-contention case is the same as before.

Using tasklist_lock: currently, the call to cgroup_fork() (which
starts the race) is far above where tasklist_lock is taken in fork,
so taking tasklist_lock earlier is infeasible. Could cgroup_fork() be
moved down to inside it, and if so, how much restructuring would be
needed? Even then, this still adds work that is being done
(unnecessarily) while holding a global lock.

> What happened to the big-reader lock concept from 2.4.x? That would be
> applicable here - minimizing the overhead on the critical path when
> the write operation is expected to be very rare.

Seems like a good application, but it appears to be gone from the
current kernel. Also, as I understand it, it would have to be a
global (or at least not threadgroup-local) lock, no? If we used one
and tried to write to the procs file while a bunch of forks were in
progress, how long would the write operation have to block? (With a
rwsem, at least, the writing thread seems to get the lock rather
quickly when there's contention.) Depending on just how slow
write-locking one of these is, it might kill any hope of performing a
write while forks are in progress.
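
For reference, the write-side cost I'm worried about looks roughly like
this. This is a simplified userspace sketch of the general big-reader
idea (one lock per CPU slot; readers touch only their own slot, a
writer has to take every slot), not the 2.4 implementation. NSLOTS,
struct brlock and the br_*() helpers are hypothetical names, and plain
pthread mutexes stand in for the per-CPU state.

```c
/* Simplified big-reader lock analogy; not the 2.4 kernel code. */
#include <assert.h>
#include <pthread.h>

#define NSLOTS 8                  /* hypothetical: one slot per CPU */

struct brlock {
    pthread_mutex_t slot[NSLOTS];
};

static void br_init(struct brlock *l)
{
    for (int i = 0; i < NSLOTS; i++)
        pthread_mutex_init(&l->slot[i], NULL);
}

/* Read side: only the caller's own slot is touched, so readers on
 * different CPUs never bounce a shared cache line. */
static void br_read_lock(struct brlock *l, int cpu)
{
    pthread_mutex_lock(&l->slot[cpu]);
}

static void br_read_unlock(struct brlock *l, int cpu)
{
    pthread_mutex_unlock(&l->slot[cpu]);
}

/* Write side: every slot must be taken. Returns the number of slots
 * acquired (NSLOTS) on success, or -1 if some reader holds a slot, in
 * which case everything taken so far is released again. */
static int br_write_trylock(struct brlock *l)
{
    for (int i = 0; i < NSLOTS; i++) {
        if (pthread_mutex_trylock(&l->slot[i]) != 0) {
            while (i-- > 0)
                pthread_mutex_unlock(&l->slot[i]);
            return -1;
        }
    }
    return NSLOTS;
}

static void br_write_unlock(struct brlock *l)
{
    for (int i = 0; i < NSLOTS; i++)
        pthread_mutex_unlock(&l->slot[i]);
}
```

The point is that the writer pays for NSLOTS acquisitions even when
uncontended, and a single reader in any slot is enough to hold it off.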