Re: [NFS] Re: 2.3.30 linuxNFS import is broken (Screwed up NFS/RPC credentials)

Trond Myklebust (trond.myklebust@fys.uio.no)
22 Dec 1999 16:13:59 +0100


>>>>> " " == Neil Brown <neilb@cse.unsw.edu.au> writes:

> On Tuesday December 21, trond.myklebust@fys.uio.no wrote:
>>
>> The most practical way of implementing this policy is therefore
>> to hide the RPC auth in the file descriptor structure (I use
>> the private data field), and pass that info via the file
>> pointer to readpage/writepage/whatever else needs it.

> There are some really interesting issues here...

> When the agent needs to access data on the server it needs to
> have credentials to give to the server to authorise the access.
> Currently in Linux these are the credentials of the process
> doing the access, though Trond wants to change this so
> credentials get cached in the file handle at open time (which
> is a good thing). However it is worth noting that the
> credentials used to fill or flush the cache do not *need* to
> reflect the principal who is requesting the access. This is
> particularly apparent with NFSv3 and unstable writes:

> Suppose I open a file and write bytes 0-127. Then you open
> the file and write bytes 128-511. Both of these go to the
> server as UNSTABLE writes, presumably with my and your
> credentials respectively. But suppose the server restarts
> before we do a commit - the data has to be written again.
> Whose credential is used? Presumably the credential of
> whoever does the fsync/close which forced the commit. But in
> a very real sense it doesn't matter as long as it is a valid
> credential.

Currently in the NFSv3 patches, the write/commit stuff reuses the
credentials of whoever is actually doing the write. Furthermore, if 2
processes try to write to the same 'cluster' (the struct cluster just
defines an aligned region of 16 pagecache pages, and is the basic unit
used for deciding when to 'commit' data) then we require the write
from the first process to be committed to disk before the second is
allowed to write to the page cache.

This can fail in the case of shared mmaps where you cannot separate
who writes what into the page cache, but otherwise it means that the
correct user authentication is used for each write.

> So, what the Linux NFS client could do is:
> Store credentials in the file handle and open time. Copy
> them into a per-inode cache on each read/write request. Use
> any appropriate credential out of the per-inode cache when
> read/writing data to/from the server.

> Most of the time there would only be one credential (or
> credential reference) in each inode. Possibly you would only
> need to store three: The most recent credential with read
> access, the most recent with write access and the most recent
> with "ownership" access (though that last one might not be
> needed).

This seems to me to be a much uglier solution than storing the creds
in the struct file. You have the problem of which choice of
credentials to make, taking into account the fact that in the more
sophisticated schemes credentials may expire. You also want to take
into account the fact that uid/gid/groups info may not be properly
synchronized between server and client. Imagine if say the group
'mail' on the server happened to have the same gid as 'user' on the
client. This sort of thing does happen in a mixed UNIX environment.

In particular for writes, using a common credential is a bad idea that
will open up a lot of potential security holes. There is nothing in
Al's patch proposal that would make this sort of thing necessary (for
the case of writes).

> Thinking more about keeping UNSTABLE writes in the client
> cache, it occurs to me that there would need to be some way for
> memory pressure to encourage the client to commit cached writes
> so that memory can be freed up, much like Stephen Tweedie wants
> for his journalling filesystem? Is there really a similarity
> here or am I imagining it? Trond: does your NFSv3 code allow
> memory pressure to force commits of unstably written data and
> if so, what credential do you use? (another case for storing
> the credential in the inode?)

It will be easy to implement when the necessary hooks are provided by
the memory management stuff. Currently we only provide a soft and a
hard limit to the number of pending NFS writes. When we reach the soft
limit, we start the asynchronous flushing out of pending writes. When
we reach the hard limit, we force any attempts at new request
allocations to sleep until at least one existing write request has
been flushed out and freed.

For the NFSv3 patched kernel, credentials are at the moment stored in
the struct file. The struct cluster then stores the pointer to the
struct file, and when being flushed, it will pick up the stored
credential from the struct file.

Cheers,
Trond

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/