Re: [PATCH] files: rcu free files_struct

From: Eric W. Biederman
Date: Thu Dec 10 2020 - 14:30:55 EST


Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:

> On Wed, Dec 09, 2020 at 03:32:38PM -0600, Eric W. Biederman wrote:
>> Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:
>>
>> > On Wed, Dec 09, 2020 at 11:13:38AM -0800, Linus Torvalds wrote:
>> >> On Wed, Dec 9, 2020 at 10:05 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>> >> >
>> >> > - struct file * file = xchg(&fdt->fd[i], NULL);
>> >> > + struct file * file = fdt->fd[i];
>> >> > if (file) {
>> >> > + rcu_assign_pointer(fdt->fd[i], NULL);
>> >>
>> >> This makes me nervous. Why did we use to do that xchg() there? That
>> >> has atomicity guarantees that now are gone.
>> >>
>> >> Now, this whole thing should be called for just the last ref of the fd
>> >> table, so presumably that atomicity was never needed in the first
>> >> place. But the fact that we did that very expensive xchg() then makes
>> >> me go "there's some reason for it".
>> >>
>> >> Is this xchg() just bogus historical leftover? It kind of looks that
>> >> way. But maybe that change should be done separately?
>> >
>> > I'm still not convinced that exposing close_files() to parallel
>> > 3rd-party accesses is safe in all cases, so this patch still needs
>> > more analysis.
>>
>> That is fine. I just wanted to post the latest version so we could
>> continue the discussion. Especially with comments etc.
>
> It's probably safe. I've spent today digging through the mess in
> fs/notify and kernel/bpf, and while I'm disgusted with both, at
> that point I believe that close_files() exposure is not going to
> create problems with either. And xchg() in there _is_ useless.

Then I will work on a cleaned up version.

> Said that, BPF "file iterator" stuff is potentially very unpleasant -
> it allows to pin a struct file found in any process' descriptor table
> indefinitely long. Temporary struct file references grabbed by procfs
> code, while unfortunate, are at least short-lived; with this stuff sky's
> the limit.
>
> I'm not happy about having that available, especially if it's a user-visible
> primitive we can't withdraw at zero notice ;-/
>
> What are the users of that thing and is there any chance to replace it
> with something saner? IOW, what *is* realistically called for each
> struct file by the users of that iterator?

The bpf guys are no longer Cc'd and they can probably answer better than
I.

In a previous conversation it was mentioned that task_iter was supposed
to be a high performance interface for getting proc like data out of the
kernel using bpf.

If so I think that handles the lifetime issues as bpf programs are
supposed to be short-lived and can not pass references anywhere.

On the flip side it raises the question did the BPF guys just make the
current layout of task_struct and struct file part of the linux kernel
user space ABI?

Eric