Re: [BUG] Boot crash on v6.7-rc2

From: Gustavo A. R. Silva
Date: Fri Nov 24 2023 - 10:52:56 EST




On 11/24/23 09:28, Gustavo A. R. Silva wrote:


On 11/24/23 04:24, Joey Gouly wrote:
Hi all,

I just hit a boot crash on v6.7-rc2 (arm64, FVP model):

[..]

Checking `struct neighbour`:

    struct neighbour {
        struct neighbour __rcu    *next;
        struct neigh_table    *tbl;
    .. fields ..
        u8            primary_key[0];
    } __randomize_layout;

Due to the `__randomize_layout`, `primary_key` field is being placed before `tbl` (actually it's the same address since it's a 0 length array). That means the memcpy() corrupts the tbl pointer.

I think I just got unlucky with my CONFIG_RANDSTRUCT seed (I can provide it if needed), it doesn't look as if it's a new issue.

It seems the issue is caused by this change that was recently added to -rc2:

commit 1ee60356c2dc ("gcc-plugins: randstruct: Only warn about true flexible arrays")

Previously, one-element and zero-length arrays were treated as true flexible arrays
(however, they are "fake" flex arrays), and __randomize_layout would leave them
untouched at the end of the struct; the same for proper C99 flex-array members. But
after the commit above, that's no longer the case: Only C99 flex-array members will
behave correctly (remaining untouched at end of the struct), and the other two types
of arrays will be randomized.

Kees,

I think we should complement the changes in commit 1ee60356c2dc with the following
update:

diff --git a/scripts/gcc-plugins/randomize_layout_plugin.c b/scripts/gcc-plugins/randomize_layout_plugin.c
index 910bd21d08f4..746ff2d272f2 100644
--- a/scripts/gcc-plugins/randomize_layout_plugin.c
+++ b/scripts/gcc-plugins/randomize_layout_plugin.c
@@ -339,8 +339,7 @@ static int relayout_struct(tree type)

/*
* enforce that we don't randomize the layout of the last
- * element of a struct if it's a 0 or 1-length array
- * or a proper flexible array
+ * element of a struct if it's a proper flexible array
*/
if (is_flexible_array(newtree[num_fields - 1])) {
has_flexarray = true;

--
Gustavo



I couldn't reproduce directly on v6.6 (the offsets for `tbl` and `primary_key` didn't overlap).
However I tried changing the zero-length-array to a flexible one:

    +    DECLARE_FLEX_ARRAY(u8, primary_key);
    +    u8        primary_key[0];

Then the field offsets ended up overlapping, and I also got the same crash on v6.6.

The right approach is to transform the zero-length array into a C99 flex-array member,
like this:

diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 07022bb0d44d..0d28172193fa 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -162,7 +162,7 @@ struct neighbour {
        struct rcu_head         rcu;
        struct net_device       *dev;
        netdevice_tracker       dev_tracker;
-       u8                      primary_key[0];
+       u8                      primary_key[];
 } __randomize_layout;

 struct neigh_ops {

--
Gustavo