Re: [PATCH] Convert filldir[64]() from __put_user() to unsafe_put_user()

From: Linus Torvalds
Date: Sun Oct 06 2019 - 21:17:30 EST


On Sun, Oct 6, 2019 at 5:04 PM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> All my alpha, sparc64, and xtensa tests pass with the attached patch
> applied on top of v5.4-rc2. I didn't test any others.

Okay... I really wish my guess had been wrong.

Because fixing filldir64 isn't the problem. I can come up with
multiple ways to avoid the unaligned issues if that was the problem.

But it does look to me like the fundamental problem is that unaligned
__put_user() calls might just be broken on alpha (and likely sparc
too). Because that looks to be the only difference between the
__copy_to_user() approach and using unsafe_put_user() in a loop.

Now, I should have handled unaligned things differently in the first
place, and in that sense I think commit 9f79b78ef744 ("Convert
filldir[64]() from __put_user() to unsafe_put_user()") really is
non-optimal on architectures with alignment issues.

And I'll fix it.

But at the same time, the fact that "non-optimal" turns into "doesn't
work" is a fairly nasty issue.

> I'll (try to) send you some disassembly next.

Thanks, verified.

The "ra is at filldir64+0x64/0x320" is indeed right at the return
point of the "jsr verify_dirent_name".

But the problem isn't there - that's just left-over state. I'm pretty
sure that function worked fine, and returned.

The problem is that "pc is at 0x4" and the page fault that then
happens at that address as a result.

And that seems to be due to this:

     8c0:   00 00 29 2c     ldq_u   t0,0(s0)
     8c4:   07 00 89 2c     ldq_u   t3,7(s0)
     8c8:   03 04 e7 47     mov     t6,t2
     8cc:   c1 06 29 48     extql   t0,s0,t0
     8d0:   44 0f 89 48     extqh   t3,s0,t3
     8d4:   01 04 24 44     or      t0,t3,t0
     8d8:   00 00 22 b4     stq     t0,0(t1)

that's the "get_unaligned((type *)src)" (the first six instructions)
followed by the "unsafe_put_user()" done with a single "stq". That's
the guts of the unsafe_copy_loop() as part of
unsafe_copy_dirent_name().
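
For anyone not fluent in alpha assembly, here is a rough userspace model
of what the load half of that sequence computes. It is only a sketch
under stated assumptions: memory is modelled as a plain byte array, the
helper names just mirror the instruction mnemonics (they are not kernel
functions), and nothing here says anything about the trap handling.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of ldq_u: load the aligned little-endian quadword that
 * contains 'addr', ignoring the low three address bits. */
static uint64_t ldq_u(const uint8_t *mem, size_t addr)
{
    size_t base = addr & ~(size_t)7;
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--)
        v = (v << 8) | mem[base + i];
    return v;
}

/* Model of extql: shift right by the byte offset of 'addr'. */
static uint64_t extql(uint64_t v, size_t addr)
{
    return v >> ((addr & 7) * 8);
}

/* Model of extqh: shift left by (64 - 8 * offset) mod 64, so an
 * aligned address yields a shift of zero. */
static uint64_t extqh(uint64_t v, size_t addr)
{
    return v << ((64 - (addr & 7) * 8) & 63);
}

/* Illustrative name: combine the two halves the way the
 * ldq_u/ldq_u/extql/extqh/or sequence does. The caller must make
 * sure the aligned quadword containing addr + 7 is readable. */
uint64_t load_unaligned_quad(const uint8_t *mem, size_t addr)
{
    uint64_t lo = ldq_u(mem, addr);
    uint64_t hi = ldq_u(mem, addr + 7);

    return extql(lo, addr) | extqh(hi, addr);
}
```

The point of the model is that the load side never issues an unaligned
access at all. Only the final "stq" to the user pointer can trap.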

And what I think happens is that it is writing to user memory that is

(a) unaligned

(b) not currently mapped in user space

so then the do_entUna() function tries to handle the unaligned trap,
but then it takes an exception while doing that (due to the unmapped
page), and then something in that nested exception mess causes it to
mess up badly and corrupt the register contents on stack, and it
returns with garbage in 'pc', and then you finally die with that

Unable to handle kernel paging request at virtual address 0000000000000004
pc is at 0x4

thing.

And yes, I'll fix that name copy loop in filldir to align the
destination first, *but* if I'm right, it means that something like
this should also likely cause issues:

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/mman.h>

int main(int argc, char **argv)
{
        void *mymap;
        uid_t *bad_ptr = (void *) 0x01;

        /* Create unpopulated memory area */
        mymap = mmap(NULL, 16384, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        /* Unaligned uid pointer in that memory area */
        bad_ptr = mymap+1;

        /* Make the kernel do put_user() on it */
        return getresuid(bad_ptr, bad_ptr+1, bad_ptr+2);
}

because that simple user mode program should cause that same "page
fault on unaligned put_user()" behavior as far as I can tell.

Mind humoring me and trying that on your alpha machine (or emulator,
or whatever)?
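
As for the "align the destination first" idea mentioned above, a minimal
userspace sketch of the shape of such a loop might look like the
following. This is not the actual filldir change: the function name is
hypothetical, and plain memcpy() stands in for both get_unaligned() and
unsafe_put_user().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for a name copy loop that aligns
 * the destination before doing any word-sized stores. */
void copy_dst_aligned(unsigned char *dst, const unsigned char *src,
                      size_t len)
{
    /* Head: byte stores until dst sits on an 8-byte boundary. */
    while (len && ((uintptr_t)dst & 7)) {
        *dst++ = *src++;
        len--;
    }

    /* Body: the source may still be unaligned, so load it with
     * memcpy() (the get_unaligned() analogue). The destination is
     * 8-byte aligned here, so every word store is an aligned one. */
    while (len >= sizeof(uint64_t)) {
        uint64_t word;

        memcpy(&word, src, sizeof(word));
        memcpy(dst, &word, sizeof(word));
        dst += sizeof(word);
        src += sizeof(word);
        len -= sizeof(word);
    }

    /* Tail: whatever is left goes out a byte at a time. */
    while (len--)
        *dst++ = *src++;
}
```

With that shape the only multi-byte stores go to aligned addresses, so
the store side should never need the unaligned trap handler.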

Linus