Fwd: [PATCH] powerpc/ptrace: Fix buffer overflow when handling PTRACE_PEEKUSER and PTRACE_POKEUSER

From: Ariel Miculas
Date: Wed Sep 27 2023 - 06:23:04 EST


---------- Forwarded message ---------
From: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Date: Thu, Jun 9, 2022 at 1:31 PM
Subject: Fwd: [PATCH] powerpc/ptrace: Fix buffer overflow when
handling PTRACE_PEEKUSER and PTRACE_POKEUSER
To: <christophe.leroy@xxxxxxxxxx>



Forwarded Conversation
Subject: [PATCH] powerpc/ptrace: Fix buffer overflow when handling
PTRACE_PEEKUSER and PTRACE_POKEUSER
------------------------

From: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Date: Wed, Jun 1, 2022 at 12:36 PM
To: <security@xxxxxxxxxx>
Cc: Ariel Miculas <ariel.miculas@xxxxxxxxx>


This fixes the gdbserver issue on PPC32 described here:
Link: https://linuxppc-dev.ozlabs.narkive.com/C46DRek4/debug-problems-on-ppc-83xx-target-due-to-changed-struct-task-struct

On PPC32, the user space code considers the floating point to be an
array of unsigned int (32 bits) - the index passed in is based on
this assumption.

fp_state is a matrix consisting of 32 lines
/* FP and VSX 0-31 register set /
struct thread_fp_state {
u64 fpr[32][TS_FPRWIDTH] attribute((aligned(16)));
u64 fpscr; / Floating point status */
};

On PPC32, PT_FPSCR is defined as: (PT_FPR0 + 2*32 + 1)

This means the fpr index validation allows a range from 0 to 65, leading
to out-of-bounds array access. This ends up corrupting
threads_struct->state, which holds the state of the task. Thus, threads
incorrectly transition from a running state to a traced state and get
stuck in that state.

On PPC32 it's ok to assume that TS_FPRWIDTH is 1 because CONFIG_VSX is
PPC64 specific. TS_FPROFFSET can be safely ignored, thus the assumption
that fpr is an array of 32 elements of type u64 holds true.

Solution taken from arch/powerpc/kernel/ptrace32.c
---
arch/powerpc/kernel/ptrace/ptrace-fpu.c | 31 +++++++++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-fpu.c
b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
index 5dca19361316..4351f2bcd12d 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-fpu.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
@@ -6,9 +6,16 @@

#include "ptrace-decl.h"

+#ifdef CONFIG_PPC32
+/* Macros to workout the correct index for the FPR in the thread struct */
+#define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
+#define FPRHALF(i) (((i) - PT_FPR0) & 1)
+#define FPRINDEX(i) TS_FPRWIDTH * FPRNUMBER(i) * 2 + FPRHALF(i)
+#endif
+
int ptrace_get_fpr(struct task_struct *child, int index, unsigned long *data)
{
-#ifdef CONFIG_PPC_FPU_REGS
+#if defined(CONFIG_PPC_FPU_REGS) && !defined(CONFIG_PPC32)
unsigned int fpidx = index - PT_FPR0;
#endif

@@ -16,11 +23,21 @@ int ptrace_get_fpr(struct task_struct *child, int
index, unsigned long *data)
return -EIO;

#ifdef CONFIG_PPC_FPU_REGS
+#ifdef CONFIG_PPC32
+ /*
+ * the user space code considers the floating point
+ * to be an array of unsigned int (32 bits) - the
+ * index passed in is based on this assumption.
+ */
+ *data = ((unsigned int *)child->thread.fp_state.fpr)
+ [FPRINDEX(index)];
+#else
flush_fp_to_thread(child);
if (fpidx < (PT_FPSCR - PT_FPR0))
memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
else
*data = child->thread.fp_state.fpscr;
+#endif
#else
*data = 0;
#endif
@@ -30,7 +47,7 @@ int ptrace_get_fpr(struct task_struct *child, int
index, unsigned long *data)

int ptrace_put_fpr(struct task_struct *child, int index, unsigned long data)
{
-#ifdef CONFIG_PPC_FPU_REGS
+#if defined(CONFIG_PPC_FPU_REGS) && !defined(CONFIG_PPC32)
unsigned int fpidx = index - PT_FPR0;
#endif

@@ -38,11 +55,21 @@ int ptrace_put_fpr(struct task_struct *child, int
index, unsigned long data)
return -EIO;

#ifdef CONFIG_PPC_FPU_REGS
+#ifdef CONFIG_PPC32
+ /*
+ * the user space code considers the floating point
+ * to be an array of unsigned int (32 bits) - the
+ * index passed in is based on this assumption.
+ */
+ ((unsigned int *)child->thread.fp_state.fpr)
+ [FPRINDEX(index)] = data;
+#else
flush_fp_to_thread(child);
if (fpidx < (PT_FPSCR - PT_FPR0))
memcpy(&child->thread.TS_FPR(fpidx), &data, sizeof(long));
else
child->thread.fp_state.fpscr = data;
+#endif
#endif

return 0;
--
2.36.1



----------
From: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
Date: Wed, Jun 1, 2022 at 12:44 PM
To: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Cc: <security@xxxxxxxxxx>


On Wed, Jun 01, 2022 at 12:35:09PM +0300, Ariel Miculas wrote:
> This fixes the gdbserver issue on PPC32 described here:
> Link: https://linuxppc-dev.ozlabs.narkive.com/C46DRek4/debug-problems-on-ppc-83xx-target-due-to-changed-struct-task-struct

If this is a public issue, just post to the public mailing list for the
subsystem and the developers and maintainers there can help you get this
merged properly.

>
> On PPC32, the user space code considers the floating point to be an
> array of unsigned int (32 bits) - the index passed in is based on
> this assumption.
>
> fp_state is a matrix consisting of 32 lines
> /* FP and VSX 0-31 register set /
> struct thread_fp_state {
> u64 fpr[32][TS_FPRWIDTH] attribute((aligned(16)));
> u64 fpscr; / Floating point status */
> };
>
> On PPC32, PT_FPSCR is defined as: (PT_FPR0 + 2*32 + 1)
>
> This means the fpr index validation allows a range from 0 to 65, leading
> to out-of-bounds array access. This ends up corrupting
> threads_struct->state, which holds the state of the task. Thus, threads
> incorrectly transition from a running state to a traced state and get
> stuck in that state.
>
> On PPC32 it's ok to assume that TS_FPRWIDTH is 1 because CONFIG_VSX is
> PPC64 specific. TS_FPROFFSET can be safely ignored, thus the assumption
> that fpr is an array of 32 elements of type u64 holds true.
>
> Solution taken from arch/powerpc/kernel/ptrace32.c

Note, you did not properly sign-off on this commit, so it couldn't be
applied anyway :(

thanks,

greg k-h



----------
From: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Date: Wed, Jun 1, 2022 at 12:47 PM
To: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>


I wasn't sure about the security impact of this issue, that's why I
sent it to this list.
I will go ahead and send it to public lists if it's not a security
sensitive issue.


----------
From: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
Date: Wed, Jun 1, 2022 at 1:00 PM
To: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Cc: <security@xxxxxxxxxx>


Please do not drop the list, that's a bit harsh for everyone else on
it, now added back.

On Wed, Jun 01, 2022 at 12:47:34PM +0300, Ariel Miculas wrote:
> I wasn't sure about the security impact of this issue, that's why I sent it
> to this list.
> I will go ahead and send it to public lists if it's not a security
> sensitive issue.

I do not know if this is a security issue, sorry, that wasn't very
obvious from your patch submission. And you were pointing at a very
very old email thread, so are you sure nothing has happened there since
then?

thanks,

greg k-h


----------
From: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Date: Wed, Jun 1, 2022 at 2:40 PM
To: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: <security@xxxxxxxxxx>


Sorry for dropping the list.
To give a bit of context: I was working on the gdbserver issue (the
one described in the link) on PPC32.
I was able to find the root cause: a buffer overflow in thread_struct,
more specifically in the field fpr in the structure thread_fp_state.
I've validated this fix, but for an older kernel version: 4.14. I did
not reproduce nor test the issue on the latest kernel version.
However, by looking at the code and also the git logs, I've noticed
that this is still an issue in the latest kernel version.
To be on the safe side, I've emailed this list first because this
might also be a security issue, unfortunately I do not have enough
expertise to say for sure whether or not it is a security issue.
Why I suspected it might be a security issue: this buffer overflow
overwrites fields in the task_struct of other threads/processes.
In the issue I was facing, a field that was overwritten was the state
field, leading to a neighboring thread transitioning from the 'S'
state to the 't' state, leading to that specific thread to no longer
be scheduled.
Maybe this could lead to a DOS since you could stop another process
from being scheduled.
Overwriting other fields may lead to some privilege escalation issue,
but this is just a wild guess.
So I need some help on this matter.

Thanks,
Ariel


----------
From: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
Date: Wed, Jun 1, 2022 at 6:04 PM
To: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Cc: <security@xxxxxxxxxx>


Ariel Miculas <ariel.miculas@xxxxxxxxx> writes:

> This fixes the gdbserver issue on PPC32 described here:
> Link: https://linuxppc-dev.ozlabs.narkive.com/C46DRek4/debug-problems-on-ppc-83xx-target-due-to-changed-struct-task-struct
>
> On PPC32, the user space code considers the floating point to be an
> array of unsigned int (32 bits) - the index passed in is based on
> this assumption.
>
> fp_state is a matrix consisting of 32 lines
> /* FP and VSX 0-31 register set /
> struct thread_fp_state {
> u64 fpr[32][TS_FPRWIDTH] attribute((aligned(16)));
> u64 fpscr; / Floating point status */
> };
>
> On PPC32, PT_FPSCR is defined as: (PT_FPR0 + 2*32 + 1)
>
> This means the fpr index validation allows a range from 0 to 65, leading
> to out-of-bounds array access. This ends up corrupting
> threads_struct->state, which holds the state of the task. Thus, threads
> incorrectly transition from a running state to a traced state and get
> stuck in that state.
>
> On PPC32 it's ok to assume that TS_FPRWIDTH is 1 because CONFIG_VSX is
> PPC64 specific. TS_FPROFFSET can be safely ignored, thus the assumption
> that fpr is an array of 32 elements of type u64 holds true.
>
> Solution taken from arch/powerpc/kernel/ptrace32.c

I have a little familiarity with ptrace, so I took a quick look,
and I am confused.

Doesn't sizeof(long) == sizeof(int) == 4 on 32bit power pc?
I don't understand why you need a big comment about how
the index is computed when that isn't changing.

Doesn't the 32bit code need flush_fp_to_thread(child)?

Don't the overflow checks for an index out of bounds need to be
preserved?

I am a bit lost about what makes the TS_FPR calculation wrong on 32bit?

Eric

----------
From: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Date: Wed, Jun 1, 2022 at 6:48 PM
To: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
Cc: <security@xxxxxxxxxx>


> Doesn't sizeof(long) == sizeof(int) == 4 on 32bit power pc?
> I don't understand why you need a big comment about how
> the index is computed when that isn't changing.
I copied the comment from arch/powerpc/kernel/ptrace/ptrace32.c

> Doesn't the 32bit code need flush_fp_to_thread(child)?
Yes, you are right, this is my mistake.

> Don't the overflow checks for an index out of bounds need to be
> preserved?
There is already a check at the beginning of the function:
if (index > PT_FPSCR)
return -EIO;

The other check was confusing for me also, but basically what happens is this:
for the last valid index, the field fpscr in the structure
thread_fp_state is returned/updated.

On PPC32 bit, this check is not needed because the code is
functionally equivalent:
accessing fpr[32] will lead to accessing the next field in the
structure, i.e. the field fpscr (the same thing that if/else condition
is accomplishing).
This happens because TS_FPRWIDTH is 1 on PPC32.

> I am a bit lost about what makes the TS_FPR calculation wrong on 32bit?
TS_FPR(i) uses i for indexing fp_state.fpr[i][TS_FPROFFSET]
But fpr is defined as follows:
u64 fpr[32][TS_FPRWIDTH] __attribute__((aligned(16)));

It is an array of size 32.
However, fpidx is validated like this:
if (fpidx < (PT_FPSCR - PT_FPR0))

And PT_FPSCR is defined as:
#define PT_FPSCR (PT_FPR0 + 2*32 + 1)

So PT_FPSCR - PT_FPR0 is 2*32+1 = 65
Meaning fpidx has a maximum value of 64.
fp_state.fpr[64][TS_FPROFFSET] is an out-of-range memory access.


----------
From: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Date: Wed, Jun 1, 2022 at 6:57 PM
To: <security@xxxxxxxxxx>
Cc: Ariel Miculas <ariel.miculas@xxxxxxxxx>


This fixes the gdbserver issue on PPC32 described here:
Link: https://linuxppc-dev.ozlabs.narkive.com/C46DRek4/debug-problems-on-ppc-83xx-target-due-to-changed-struct-task-struct

On PPC32, the user space code considers the floating point to be an
array of unsigned int (32 bits) - the index passed in is based on
this assumption.

fp_state is a matrix consisting of 32 lines
/* FP and VSX 0-31 register set /
struct thread_fp_state {
u64 fpr[32][TS_FPRWIDTH] attribute((aligned(16)));
u64 fpscr; / Floating point status */
};

On PPC32, PT_FPSCR is defined as: (PT_FPR0 + 2*32 + 1)

This means the fpr index validation allows a range from 0 to 65, leading
to out-of-bounds array access. This ends up corrupting
threads_struct->state, which holds the state of the task. Thus, threads
incorrectly transition from a running state to a traced state and get
stuck in that state.

On PPC32 it's ok to assume that TS_FPRWIDTH is 1 because CONFIG_VSX is
PPC64 specific. TS_FPROFFSET can be safely ignored, thus the assumption
that fpr is an array of 32 elements of type u64 holds true.

Solution taken from arch/powerpc/kernel/ptrace32.c

Signed-off-by: Ariel Miculas <ariel.miculas@xxxxxxxxx>
---
arch/powerpc/kernel/ptrace/ptrace-fpu.c | 31 +++++++++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-fpu.c
b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
index 5dca19361316..93695abbbdfb 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-fpu.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
@@ -6,9 +6,16 @@

#include "ptrace-decl.h"

+#ifdef CONFIG_PPC32
+/* Macros to workout the correct index for the FPR in the thread struct */
+#define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
+#define FPRHALF(i) (((i) - PT_FPR0) & 1)
+#define FPRINDEX(i) TS_FPRWIDTH * FPRNUMBER(i) * 2 + FPRHALF(i)
+#endif
+
int ptrace_get_fpr(struct task_struct *child, int index, unsigned long *data)
{
-#ifdef CONFIG_PPC_FPU_REGS
+#if defined(CONFIG_PPC_FPU_REGS) && !defined(CONFIG_PPC32)
unsigned int fpidx = index - PT_FPR0;
#endif

@@ -17,10 +24,20 @@ int ptrace_get_fpr(struct task_struct *child, int
index, unsigned long *data)

#ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
+#ifdef CONFIG_PPC32
+ /*
+ * the user space code considers the floating point
+ * to be an array of unsigned int (32 bits) - the
+ * index passed in is based on this assumption.
+ */
+ *data = ((unsigned int *)child->thread.fp_state.fpr)
+ [FPRINDEX(index)];
+#else
if (fpidx < (PT_FPSCR - PT_FPR0))
memcpy(data, &child->thread.TS_FPR(fpidx), sizeof(long));
else
*data = child->thread.fp_state.fpscr;
+#endif
#else
*data = 0;
#endif
@@ -30,7 +47,7 @@ int ptrace_get_fpr(struct task_struct *child, int
index, unsigned long *data)

int ptrace_put_fpr(struct task_struct *child, int index, unsigned long data)
{
-#ifdef CONFIG_PPC_FPU_REGS
+#if defined(CONFIG_PPC_FPU_REGS) && !defined(CONFIG_PPC32)
unsigned int fpidx = index - PT_FPR0;
#endif

@@ -39,10 +56,20 @@ int ptrace_put_fpr(struct task_struct *child, int
index, unsigned long data)

#ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
+#ifdef CONFIG_PPC32
+ /*
+ * the user space code considers the floating point
+ * to be an array of unsigned int (32 bits) - the
+ * index passed in is based on this assumption.
+ */
+ ((unsigned int *)child->thread.fp_state.fpr)
+ [FPRINDEX(index)] = data;
+#else
--
2.36.1



----------
From: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
Date: Thu, Jun 2, 2022 at 9:27 AM
To: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Cc: <security@xxxxxxxxxx>


THis type of #ifdef in .c code is unmaintainable over time. Can you put
it properly in a .h file somehow to make it simpler and easier to
understand and support?

thanks,

greg k-h


----------
From: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Date: Thu, Jun 2, 2022 at 12:24 PM
To: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: <security@xxxxxxxxxx>


I will send it to you shortly, but I will switch to my corporate email address.


----------
From: Ariel Miculas <ariel.miculas@xxxxxxxxx>
Date: Mon, Jun 6, 2022 at 7:45 PM
To: <ariel.miculas@xxxxxxxxxx>, <mpe@xxxxxxxxxxxxxx>