Re: Null-ptr-deref due to "sanitized pathwalk machinery (v4)"

From: Qian Cai
Date: Wed Mar 25 2020 - 09:22:05 EST




> On Mar 25, 2020, at 12:03 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Mar 24, 2020 at 11:24:01PM -0400, Qian Cai wrote:
>
>>> On Mar 24, 2020, at 10:13 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> On Tue, Mar 24, 2020 at 09:49:48PM -0400, Qian Cai wrote:
>>>
>>>> It does not catch anything at all with the patch,
>>>
>>> You mean, oops happens, but neither WARN_ON() is triggered?
>>> Lovely... Just to make sure: could you slap the same couple
>>> of lines just before
>>> if (unlikely(!d_can_lookup(nd->path.dentry))) {
>>> in link_path_walk(), just to check if I have misread the trace
>>> you've got?
>>>
>>> Does that (+ other two inserts) end up with
>>> 1) some of these WARN_ON() triggered when oops happens or
>>> 2) oops is happening, but neither WARN_ON() triggers or
>>> 3) oops not happening / becoming harder to hit?
>>
>> Only the one just before
>> if (unlikely(!d_can_lookup(nd->path.dentry))) {
>> In link_path_walk() will trigger.
>
>> [ 245.767202][ T5020] pathname = /var/run/nscd/socket
>
> Lovely. So
> * we really do get NULL nd->path.dentry there; I've not misread the
> trace.
> * on the entry into link_path_walk() nd->path.dentry is non-NULL.
> * *ALL* components should've been LAST_NORM ones
> * not a single symlink in sight, unless the setup is rather unusual
> * possibly not even a single mountpoint along the way (depending
> upon the userland used)
>
> And in the same loop we have
> if (likely(type == LAST_NORM)) {
> struct dentry *parent = nd->path.dentry;
> nd->flags &= ~LOOKUP_JUMPED;
> if (unlikely(parent->d_flags & DCACHE_OP_HASH)) {
> struct qstr this = { { .hash_len = hash_len }, .name = name };
> err = parent->d_op->d_hash(parent, &this);
> if (err < 0)
> return err;
> hash_len = this.hash_len;
> name = this.name;
> }
> }
> upstream of that thing. So NULL nd->path.dentry *there* would've oopsed.
> IOW, what we are hitting is walk_component() with non-NULL nd->path.dentry
> when we enter it, NULL being returned and nd->path.dentry becoming NULL
> by the time we return from walk_component().
>
> Could you post the results of
> stat / /var /var/run /var/run/nscd /var/run/nscd/socket

The file is gone after a successful boot,

# stat / /var /var/run /var/run/nscd /var/run/nscd/socket
File: /
Size: 244 Blocks: 0 IO Block: 65536 directory
Device: fe00h/65024d Inode: 128 Links: 17
Access: (0555/dr-xr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2020-03-24 14:21:27.112559236 -0400
Modify: 2020-03-24 14:21:25.840486593 -0400
Change: 2020-03-24 14:21:25.840486593 -0400
Birth: -
File: /var
Size: 4096 Blocks: 8 IO Block: 65536 directory
Device: fe00h/65024d Inode: 133 Links: 21
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2018-08-12 05:57:57.000000000 -0400
Modify: 2020-03-23 21:29:31.087264900 -0400
Change: 2020-03-23 21:29:31.087264900 -0400
Birth: -
File: /var/run -> ../run
Size: 6 Blocks: 0 IO Block: 65536 symbolic link
Device: fe00h/65024d Inode: 143 Links: 1
Access: (0777/lrwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2020-03-24 17:34:11.865030724 -0400
Modify: 2020-03-23 17:16:40.573974805 -0400
Change: 2020-03-23 17:16:40.573974805 -0400
Birth: -
stat: cannot stat '/var/run/nscd': No such file or directory
stat: cannot stat '/var/run/nscd/socket': No such file or directory
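
In case it helps to see the type of each component in one go, here is a minimal sketch that rebuilds the path one prefix at a time and prints what each prefix is, without dereferencing the final component. It assumes GNU coreutils stat (for -c '%F'); walk_prefixes is just a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Print the file type of every prefix of an absolute path.
# stat does not follow a symlink in the final component by default,
# so a prefix that is itself a symlink is reported as "symbolic link".
# walk_prefixes is a hypothetical helper name (assumption, not a real tool).
walk_prefixes() {
    path=$1
    prefix=
    old_ifs=$IFS
    IFS=/
    set -f                      # no globbing while splitting on "/"
    for comp in $path; do
        [ -n "$comp" ] || continue
        prefix="$prefix/$comp"
        # %F is the GNU coreutils format for the file type.
        ftype=$(stat -c '%F' "$prefix" 2>/dev/null) || ftype="missing"
        printf '%s: %s\n' "$prefix" "$ftype"
    done
    set +f
    IFS=$old_ifs
}

walk_prefixes /var/run/nscd/socket
```

Any prefix that cannot be stat'ed is printed as "missing" instead of an error message.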

> after the boot with working kernel? Also, is that "hit on every boot" or
> stochastic? If it's the latter, I'd like to see the output of the same
> thing on a successful boot of the same kernel, if possible...

It does not hit every time, so I used a cron job,

@reboot sleep 180; systemctl reboot

It has always hit it within an hour so far.