Re: [RFC] fixing the UML failure root cause

From: Linus Torvalds
Date: Fri Oct 14 2011 - 00:47:08 EST


On Thu, Oct 13, 2011 at 8:40 PM, Andrew Lutomirski <luto@xxxxxxx> wrote:
>
> How does that work?  The tricky case is when one of those three words
> spans a page boundary and the access to the first page is valid, but
> the access to the second page is not.  When that happens, if we report
> the fault as coming from the first page, then UML is likely to
> think the fault was spurious and enter an infinite loop.

Hmm. Gaah, I just find that memcpy loop disgusting.

We already have that ugly "uaccess_error" crap in handle_exception(),
we might as well do something like the attached and just say "hey, now
you can catch the page fault information for a get_user/put_user
fault".

Isn't that much nicer?

You don't even have to check each word, you can just take the last
exception info from the thread-info.
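
To illustrate the idea, here is a userspace sketch only: it is not kernel
code and not part of the attached patch. Everything prefixed fake_ or FAKE_
is a hypothetical stand-in: fake_record_fault() plays the role of the fixup
path in no_context(), fake_get_user() plays the role of get_user(), and the
error code value is an arbitrary placeholder. The caller reads three words,
stops at the first failure, and then takes the faulting address from the
recorded info instead of checking each word itself.

```c
#include <string.h>

#define FAKE_PAGE_SIZE  4096UL
#define FAKE_ERROR_CODE 4	/* arbitrary placeholder, not decoded here */

/* Mirrors the two fields the patch adds to struct thread_info. */
struct fake_thread_info {
	int uaccess_error_code;
	unsigned long uaccess_addr;
};

static struct fake_thread_info fake_ti;

/* Simulated user memory: only the first page is "mapped". */
static unsigned char fake_user_mem[2 * FAKE_PAGE_SIZE];

/* Stand-in for the fixup path: remember what faulted and why. */
static void fake_record_fault(unsigned long address, int error_code)
{
	fake_ti.uaccess_error_code = error_code;
	fake_ti.uaccess_addr = address;
}

/* Stand-in for get_user(): returns 0 on success, -1 on a fault. */
static int fake_get_user(unsigned long *val, unsigned long addr)
{
	if (addr + sizeof(*val) > FAKE_PAGE_SIZE) {
		/* Report the first byte that is actually unmapped. */
		unsigned long bad = addr < FAKE_PAGE_SIZE ? FAKE_PAGE_SIZE : addr;

		fake_record_fault(bad, FAKE_ERROR_CODE);
		return -1;
	}
	memcpy(val, &fake_user_mem[addr], sizeof(*val));
	return 0;
}

/* Read three consecutive words, stopping at the first fault. */
static int fake_read_three_words(unsigned long addr, unsigned long out[3])
{
	int i;

	for (i = 0; i < 3; i++)
		if (fake_get_user(&out[i], addr + i * sizeof(unsigned long)))
			return -1;
	return 0;
}
```

If the read starts at FAKE_PAGE_SIZE - 4, one of the words touches the
second page, and fake_ti.uaccess_addr ends up pointing at the start of that
second page rather than at the first page, which is the information the
caller was missing.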

Linus
arch/x86/include/asm/thread_info.h | 2 ++
arch/x86/mm/fault.c | 6 +++++-
2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index a1fe5c127b52..e8d245febfae 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -41,6 +41,8 @@ struct thread_info {
 	__u8			supervisor_stack[0];
 #endif
 	int			uaccess_err;
+	int			uaccess_error_code;
+	unsigned long		uaccess_addr;
 };

#define INIT_THREAD_INFO(tsk) \
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 0d17c8c50acd..bbbee6e6a95b 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -628,8 +628,12 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	int sig;
 
 	/* Are we prepared to handle this kernel fault? */
-	if (fixup_exception(regs))
+	if (fixup_exception(regs)) {
+		struct thread_info *ti = current_thread_info();
+		ti->uaccess_error_code = error_code;
+		ti->uaccess_addr = address;
 		return;
+	}
 
 	/*
 	 * 32-bit: