Re: fs: NULL deref in atime_needs_update

From: Al Viro
Date: Tue Feb 23 2016 - 13:18:04 EST


On Tue, Feb 23, 2016 at 04:34:59PM +0100, Dmitry Vyukov wrote:
> The crash:
>
> [ 8095.048336] ------------[ cut here ]------------
> [ 8095.048864] WARNING: CPU: 3 PID: 5532 at fs/namei.c:1672
> should_follow_link.part.25+0x55/0x21a()

NULL or ERR_PTR() passed as inode to should_follow_link().

> [ 8095.060526] BUG: unable to handle kernel NULL pointer dereference
> at 000000000000000c

OK, NULL inode it is. And that was in do_last().

> And here is my inode.o:
> https://gist.githubusercontent.com/dvyukov/27ec88c2c1a83c2e0f38/raw/2514d0ddd7720a978e6a2f67c2dcb391046ce0e7/gistfile1.txt
>
> This can be reproduced following the instructions here:
> https://github.com/google/syzkaller/wiki/How-to-execute-syzkaller-programs
> Using this command line:
> # ./syz-execprog -cover=0 -procs=60 -repeat=0 prog
> with the following program:
> https://gist.githubusercontent.com/dvyukov/fc026f36f9f76d1a440b/raw/0e133afa99eb7de45880523fbd48256cd2ae4a6c/gistfile1.txt
> (requires CONFIG_USER_NS=y). The crash triggers after hours of execution.

Joy... Another interesting question is whether we'd been in RCU mode at
the time of that should_follow_link(). The thing is, we could've come there
either from
if (!(open_flag & O_CREAT)) {
if (nd->last.name[nd->last.len])
nd->flags |= LOOKUP_FOLLOW | LOOKUP_DIRECTORY;
/* we _can_ be in RCU mode here */
error = lookup_fast(nd, &path, &inode, &seq);
if (likely(!error))
goto finish_lookup;

or from
BUG_ON(nd->flags & LOOKUP_RCU);
inode = d_backing_inode(path.dentry);
seq = 0; /* out of RCU mode, so the value doesn't matter */
if (unlikely(d_is_negative(path.dentry))) {
path_to_nameidata(&path, nd);
return -ENOENT;
}
finish_lookup:

In the latter case we are holding a reference to path.dentry, so d_is_negative
ought to be reliable and refering to the same backing inode. In the former,
if we leave still in RCU mode, we went through
*inode = d_backing_inode(dentry);
negative = d_is_negative(dentry);
[check that dentry->d_seq is still unchanged]
...
[check that negative is false]
and that guarantees that both inode and negative had been taken while dentry
remained stable, so we couldn't pass through the second check with NULL
inode. And returning 0 in non-RCU mode means that we go through
if (unlikely(d_is_negative(dentry))) {
dput(dentry);
return -ENOENT;
}
path->mnt = mnt;
path->dentry = dentry;
err = follow_managed(path, nd);
if (likely(!err))
*inode = d_backing_inode(path->dentry);
return err;
with dentry pinned, so NULL inode here is also bloody odd - we have positive
dentry that will remain positive through all that and somehow follow_managed()
(in non-RCU mode) gets us a negative one. Now, follow_managed() either
leaves path->dentry unchanged (and keeps it pinned through all of that), or
does
struct vfsmount *mounted = lookup_mnt(path);
if (mounted) {
dput(path->dentry);
if (need_mntput)
mntput(path->mnt);
path->mnt = mounted;
path->dentry = dget(mounted->mnt_root);
(and ->mnt_root should never be negative), or goes into follow_autmount(),
where we either leave the damn thing unchanged or hit
path->dentry = dget(mnt->mnt_root);
... or we have ->d_automount() instance doing something nasty to it. Damn.

OK, a look through the instances shows that only autofs4 one might modify
path->dentry:
struct dentry *new = d_lookup(parent, &dentry->d_name);
if (!new)
return NULL;
ino = autofs4_dentry_ino(new);
ino->last_used = jiffies;
dput(path->dentry);
path->dentry = new;
in autofs4_mountpoint_changed()... I doubt that this is the cause here,
but let's slap WARN_ON(d_is_negative(new)) there.

The thing is, I *do* see one bug around should_follow_link(), but it would
manifest differently. So you must be hitting something else there, to get
that NULL inode... Could you try to reproduce it with the patch below
and see what warnings trigger? I'll try to reproduce it as well, but since
you already have a working setup...

diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index c6d7d3d..86f81e3 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -323,6 +323,7 @@ static struct dentry *autofs4_mountpoint_changed(struct path *path)
struct dentry *new = d_lookup(parent, &dentry->d_name);
if (!new)
return NULL;
+ WARN_ON(d_is_negative(new));
ino = autofs4_dentry_ino(new);
ino->last_used = jiffies;
dput(path->dentry);
diff --git a/fs/namei.c b/fs/namei.c
index f624d13..ac00bcb 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1209,6 +1209,7 @@ static int follow_managed(struct path *path, struct nameidata *nd)
/* Handle an automount point */
if (managed & DCACHE_NEED_AUTOMOUNT) {
ret = follow_automount(path, nd, &need_mntput);
+ WARN_ON(d_is_negative(path->dentry));
if (ret < 0)
break;
continue;
@@ -1613,8 +1614,10 @@ unlazy:
path->mnt = mnt;
path->dentry = dentry;
err = follow_managed(path, nd);
- if (likely(!err))
+ if (likely(!err)) {
*inode = d_backing_inode(path->dentry);
+ WARN_ON(!inode);
+ }
return err;

need_lookup:
@@ -1712,6 +1715,17 @@ static inline int should_follow_link(struct nameidata *nd, struct path *link,
return 0;
if (!follow)
return 0;
+ /* make sure that d_is_symlink above matches inode */
+ if (nd->flags & LOOKUP_RCU) {
+ if (read_seqcount_retry(&link->dentry->d_seq, seq)) {
+ WARN_ON(1); // just as way to report hitting
+ // that path; it's OK to get
+ // here, in the final variant
+ // WARN_ON will disappear.
+ return -ECHILD;
+ }
+ }
+ WARN_ON(!inode); // now, _that_ should not happen.
return pick_link(nd, link, inode, seq);
}

@@ -3273,6 +3287,10 @@ opened:
goto exit_fput;
}
out:
+ if (unlikely(error > 0)) {
+ WARN_ON(1);
+ error = -EINVAL;
+ }
if (got_write)
mnt_drop_write(nd->path.mnt);
path_put(&save_parent);