Re: [PATCH v5 3/6] seccomp: introduce writer locking

From: Kees Cook
Date: Fri May 23 2014 - 17:05:37 EST


On Fri, May 23, 2014 at 1:49 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Thu, May 22, 2014 at 04:05:33PM -0700, Kees Cook wrote:
>> Normally, task_struct.seccomp.filter is only ever read or modified by
>> the task that owns it (current). This property aids in fast access
>> during system call filtering as read access is lockless.
>>
>> Updating the pointer from another task, however, opens up race
>> conditions. To allow cross-task filter pointer updates, writes to the
>> seccomp fields are now protected by a spinlock. Read access remains
>> lockless because pointer updates themselves are atomic. However, writes
>> (or cloning) often entail additional checking (like maximum instruction
>> counts) which require locking to perform safely.
>>
>> In the case of cloning threads, the child is invisible to the system
>> until it enters the task list. To make sure a child can't be cloned
>> from a thread and left in a prior state, seccomp duplication is moved
>> under the tasklist_lock. Then parent and child are certain to have the same
>> seccomp state when they exit the lock.
>>
>
> So I'm a complete noob on the whole seccomp thing, so maybe this is a
> silly question, but.. what about object lifetimes?

The get/put logic on seccomp filters eluded me when I first looked at
it too. :) Basically, each branch point (a fork, or a new filter
chained onto its parent filter) holds a reference count, which means a
given filter will only get freed once every task using it has died.
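
To make the counting concrete, here is a rough userspace model of the
scheme described above. It is only a sketch, not the kernel code: the
names (struct filter, filter_attach, task_fork, task_free,
live_filters) are made up for illustration, and C11 atomics stand in
for the kernel's atomic_t helpers.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace model of a refcounted filter chain. */
struct filter {
    atomic_int usage;     /* references held by tasks and child filters */
    struct filter *prev;  /* parent filter in the chain, or NULL */
};

struct task {
    struct filter *filter; /* newest filter attached to this task */
};

static int live_filters;   /* bookkeeping for the example only */

/* Attach a new filter on top of whatever the task already has. */
static struct filter *filter_attach(struct task *t)
{
    struct filter *f = malloc(sizeof(*f));

    if (!f)
        abort();
    atomic_init(&f->usage, 1); /* the task's reference to the new filter */
    f->prev = t->filter;       /* new filter inherits the task's ref on prev */
    t->filter = f;
    live_filters++;
    return f;
}

/* Fork is a branch point: the child takes its own reference. */
static void task_fork(struct task *child, const struct task *parent)
{
    child->filter = parent->filter;
    if (child->filter)
        atomic_fetch_add(&child->filter->usage, 1);
}

/* Drop the task's reference; walk up the chain freeing dead filters. */
static void task_free(struct task *t)
{
    struct filter *f = t->filter;

    /* atomic_fetch_sub returns the old value: 1 means we held the last ref */
    while (f && atomic_fetch_sub(&f->usage, 1) == 1) {
        struct filter *dead = f;

        f = f->prev;
        free(dead);
        live_filters--;
    }
    t->filter = NULL;
}
```

If a parent attaches a filter, forks, and then both parent and child
attach one more filter each, the shared filter in this model stays
alive until both tasks have gone through task_free(), whichever order
they go in.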

> Looking at put_seccomp_filter() it explicitly takes a tsk pointer,
> suggesting one can call it on !current. And while it does a dec_and_test
> on the object itself, run_filter() does nothing with refcounts, and
> therefore can be touching dead memory.

That's technically true, but the only caller of put_seccomp_filter()
is free_task(), for which "current" doesn't make sense. But when
called, the task is no longer part of the task_list, so there's no
dead memory touching. (Unless you see something I don't.)

-Kees

--
Kees Cook
Chrome OS Security