Re: [PATCH v4 1/2] pid: Replace pid bitmap implementation with IDR API

From: Gargi Sharma
Date: Tue Oct 10 2017 - 12:11:52 EST


On Tue, Oct 10, 2017 at 4:46 PM, Rik van Riel <riel@xxxxxxxxxxx> wrote:
> On Tue, 2017-10-10 at 13:35 +0100, Gargi Sharma wrote:
>> On Tue, Oct 10, 2017 at 12:50 PM, Oleg Nesterov <oleg@xxxxxxxxxx>
>> wrote:
>> > On 10/09, Andrew Morton wrote:
>> > >
>> > > > @@ -240,17 +230,11 @@ void zap_pid_ns_processes(struct
>> > > > pid_namespace *pid_ns)
>> > > > *
>> > > > */
>> > > > read_lock(&tasklist_lock);
>> > > > - nr = next_pidmap(pid_ns, 1);
>> > > > - while (nr > 0) {
>> > > > - rcu_read_lock();
>> > > > -
>> > > > - task = pid_task(find_vpid(nr), PIDTYPE_PID);
>> > > > + nr = 2;
>> > > > + idr_for_each_entry_continue(&pid_ns->idr, pid, nr) {
>> > > > + task = pid_task(pid, PIDTYPE_PID);
>> > > > if (task && !__fatal_signal_pending(task))
>> > > > send_sig_info(SIGKILL, SEND_SIG_FORCED,
>> > > > task);
>> > > > -
>> > > > - rcu_read_unlock();
>> > > > -
>> > > > - nr = next_pidmap(pid_ns, nr);
>> > > > }
>> > > > read_unlock(&tasklist_lock);
>> > >
>> > > Especially here. I don't think pidmap_lock is held. Is that IDR
>> > > iteration safe?
>> >
>> > Yes, this doesn't look right, we need rcu_read_lock() or
>> > pidmap_lock.
>> >
>> > And, we also need rcu_read_lock() for another reason, to protect
>> > "struct pid".
>>
>> Ah, I missed this. From what I understood idr_for_each_entry_continue
>> should be safe because calls idr_get_next which in turn calls
>> radix_tree_iter_find to find the next populated entry in the idr. If
>> the pid that you are looking up the task for is deleted, task will
>> get
>> a NULL from pid_task and no signal to kill will be sent.
>> >
>> > Gargi, I suggested to use idr_for_each_entry_continue(), but now I
>> > am wondering
>> > if we should use idr_for_each() instead. IIUC this would be a bit
>> > faster? Not
>> > that I think this is really important...
>>
>> I can run benchmarks with idr_for_each to see how much speed up is
>> achieved and then we can go with whatever we think is better. How
>> does
>> that sounds?
>
> I suspect this code will not be a hot path in any
> conceivable "kill off hundreds of containers"
> benchmark, since the overhead of having all of the
> tasks in those containers exit will dwarf any
> changes in this code.
>
> Simply making it safe for fully preemptible
> kernels by adding rcu_read_lock() around the
> section is what matters the most.
>
> The choice between idr_for_each_entry_continue()
> and idr_for_each() is dictated more by which
> of the two results in easier to read code.

I have listed the code for both idr_for_each() and
idr_for_each_entry_continue() below. IMHO idr_for_each_entry_continue()
is easier to read, but YMMV. :)

/* idr_for_each() expects a callback that returns int */
int kill_task(int id, void *ptr, void *data)
{
	struct pid *pid = ptr;
	struct task_struct *task = pid_task(pid, PIDTYPE_PID);

	if (task && !__fatal_signal_pending(task))
		send_sig_info(SIGKILL, SEND_SIG_FORCED, task);
	return 0;	/* a non-zero return would stop the iteration */
}

rcu_read_lock();
idr_for_each(&pid_ns->idr, kill_task, NULL);
rcu_read_unlock();


VS

rcu_read_lock();
idr_for_each_entry_continue(&pid_ns->idr, pid, nr) {
	task = pid_task(pid, PIDTYPE_PID);
	if (task && !__fatal_signal_pending(task))
		send_sig_info(SIGKILL, SEND_SIG_FORCED, task);
}
rcu_read_unlock();
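
To make the structural difference between the two styles easier to see
outside the kernel, here is a minimal userspace sketch. It is not kernel
code and does not use the IDR API: struct table, table_get_next(),
table_for_each() and table_for_each_entry_continue() are hypothetical
stand-ins that I made up for illustration, and locking is left out
entirely. It only shows that the callback style walks every entry (so
any "start from id 2" rule has to live in the callback), while the
iterator style resumes from whatever id the caller passes in.

```c
#include <stddef.h>

#define TABLE_SIZE 8

/* Hypothetical sparse table: slot i holds the entry for id i, or NULL. */
struct table {
	void *slots[TABLE_SIZE];
};

/* Return the first populated entry with id >= *id and update *id, or NULL. */
static void *table_get_next(struct table *t, int *id)
{
	for (int i = *id; i < TABLE_SIZE; i++) {
		if (t->slots[i]) {
			*id = i;
			return t->slots[i];
		}
	}
	return NULL;
}

/* Iterator style: resumes from the caller-supplied id. */
#define table_for_each_entry_continue(t, entry, id) \
	for ((entry) = table_get_next((t), &(id)); (entry); \
	     ++(id), (entry) = table_get_next((t), &(id)))

/* Callback style: visits every entry; a non-zero return stops the walk. */
static int table_for_each(struct table *t,
			  int (*fn)(int id, void *p, void *data), void *data)
{
	int id = 0;
	void *entry;

	table_for_each_entry_continue(t, entry, id) {
		int ret = fn(id, entry, data);

		if (ret)
			return ret;
	}
	return 0;
}

/* Callback that counts visited entries, skipping ids below 2. */
static int count_from_two(int id, void *p, void *data)
{
	(void)p;
	if (id >= 2)
		(*(int *)data)++;
	return 0;
}

/* Same count, written with the iterator macro starting at id 2. */
static int count_with_macro(struct table *t)
{
	int id = 2, count = 0;
	void *entry;

	table_for_each_entry_continue(t, entry, id)
		count++;
	return count;
}
```

In this sketch the macro version keeps the loop body and the starting id
in one place, which is roughly why I find that style easier to read.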

Thanks!
Gargi
>
> --
> All rights reversed