Re: arm32 build warnings in workqueue.c

From: Linus Torvalds
Date: Fri Jun 23 2023 - 14:24:44 EST


On Thu, 22 Jun 2023 at 20:57, Dave Airlie <airlied@xxxxxxxxx> wrote:
>
> Not sure what changed (maybe I ended up with -Werror on recently), but
> my 32-bit arm build started to fail. 6.4.0-rc7.
>
> gcc version 13.1.1 20230519 (Red Hat Cross 13.1.1-2) (GCC)
>
> /home/airlied/devel/kernel/dim/drm-fixes/kernel/workqueue.c: In
> function ‘get_work_pwq’:
> /home/airlied/devel/kernel/dim/drm-fixes/kernel/workqueue.c:713:24:
> error: cast to pointer from integer of different size
> [-Werror=int-to-pointer-cast]
> 713 | return (void *)(data & WORK_STRUCT_WQ_DATA_MASK);
> | ^

Ok, that's some nasty code, but I'm not entirely sure why gcc has
started complaining about it now.

'data' is of type 'unsigned long', and we cast things to pointers all the time..

Now, WORK_STRUCT_WQ_DATA_MASK is this odd enum type, and very very
arguably that is absolutely *horrible*, but the code does

enum {
[..]
WORK_STRUCT_FLAG_BITS = WORK_STRUCT_COLOR_SHIFT +
WORK_STRUCT_COLOR_BITS,
[..]
WORK_STRUCT_FLAG_MASK = (1UL << WORK_STRUCT_FLAG_BITS) - 1,
WORK_STRUCT_WQ_DATA_MASK = ~WORK_STRUCT_FLAG_MASK,
}

and while the above is absolutely disgusting and the type is not very
well defined, an enum type should be wide enough to contain all the
values of the enum. It should all default to 'int', but gcc has some
extensions.

The 'intent' here is that WORK_STRUCT_FLAG_MASK is of type UL, but
that's not actually how enum types necessarily work. Again, the enum
should be large enough to contain the *values* of the elements, not
necessarily the *types* of the elements. The type is always going to
be that enum.

(And again, I think according to the standard, it's always of type
'int', but every compiler does some kind of type widening for larger
values as an extension, and may use a smaller type for storage).

But even if WORK_STRUCT_FLAG_MASK isn't an unsigned long, and is
something smaller, the expression '~WORK_STRUCT_FLAG_MASK' must be *at
least* an 'int', and would be a negative one if so. That's just how C
integer expressions work - the '~' operator is guaranteed to make it
at *least* an int.

Now, again, the final type of an enum is not determined by the types
of the element initializers, but by their *values*. But that means
that the final type of an enum will have two choices:

- either ~WORK_STRUCT_FLAG_MASK was extended to be 'int', and the
value is negative, and the enum type has to be a signed type that can
contain that negative value.

IOW, the enum might actually be of type 'signed char', but then
doing that 'data & enum' will sign-extend the mask and everything is
fine.

- *or* the type of WORK_STRUCT_FLAG_MASK was interpreted to be an
unsigned type >= int, and the negation made it a really big unsigned
value, and that enum must thus be a large unsigned type to fit it

IOW, the enum might be an unsigned type, but it will be at LEAST of
size 'int', and I don't see why it would be a problem on 32-bit ARM.
Making it be 'unsigned long long' sounds insane. But maybe that is
what happened.

In either case, the code above should do the right thing, at least on
32-bit, but the warning is, I feel, valid.

Because as you can tell, the 'type' of an enum really isn't very obvious.

I think the whole 'type of an enum' is not only a very very bad thing
to rely on, I think it's something that C23 ends up codifying new
rules for. It may be why gcc started complaining now.

Anyway, that code absolutely has to be fixed. Using enums for types is
wrong, wrong, wrong. You should consider an enum to be a weakly typed
integer expression with some specific advantages (the automatic
incrementing at definition time, and the compiler being able to limit
ranges in switch() statements and maybe doing limited type warnings
elsewhere)

Any time you think it has a specific type size, you're setting
yourself up for pain and suffering. Gcc is right to warn when we do
odd arithmetic with it and rely on the width of the result.

So I really think that code needs fixing, and the gcc warning was very valid.

Maybe something like the attached. Does this fix the gcc warning?
Tejun, comments?

Linus
include/linux/workqueue.h | 15 ++++++++-------
kernel/workqueue.c | 13 ++++++++-----
2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 3992c994787f..683efe29fa69 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -68,7 +68,6 @@ enum {
WORK_OFFQ_FLAG_BASE = WORK_STRUCT_COLOR_SHIFT,

__WORK_OFFQ_CANCELING = WORK_OFFQ_FLAG_BASE,
- WORK_OFFQ_CANCELING = (1 << __WORK_OFFQ_CANCELING),

/*
* When a work item is off queue, its high bits point to the last
@@ -79,12 +78,6 @@ enum {
WORK_OFFQ_POOL_SHIFT = WORK_OFFQ_FLAG_BASE + WORK_OFFQ_FLAG_BITS,
WORK_OFFQ_LEFT = BITS_PER_LONG - WORK_OFFQ_POOL_SHIFT,
WORK_OFFQ_POOL_BITS = WORK_OFFQ_LEFT <= 31 ? WORK_OFFQ_LEFT : 31,
- WORK_OFFQ_POOL_NONE = (1LU << WORK_OFFQ_POOL_BITS) - 1,
-
- /* convenience constants */
- WORK_STRUCT_FLAG_MASK = (1UL << WORK_STRUCT_FLAG_BITS) - 1,
- WORK_STRUCT_WQ_DATA_MASK = ~WORK_STRUCT_FLAG_MASK,
- WORK_STRUCT_NO_POOL = (unsigned long)WORK_OFFQ_POOL_NONE << WORK_OFFQ_POOL_SHIFT,

/* bit mask for work_busy() return values */
WORK_BUSY_PENDING = 1 << 0,
@@ -94,6 +87,14 @@ enum {
WORKER_DESC_LEN = 24,
};

+/* Convenience constants - of type 'unsigned long', not 'enum'! */
+#define WORK_OFFQ_CANCELING (1ul << __WORK_OFFQ_CANCELING)
+#define WORK_OFFQ_POOL_NONE ((1ul << WORK_OFFQ_POOL_BITS) - 1)
+#define WORK_STRUCT_NO_POOL (WORK_OFFQ_POOL_NONE << WORK_OFFQ_POOL_SHIFT)
+
+#define WORK_STRUCT_FLAG_MASK ((1ul << WORK_STRUCT_FLAG_BITS) - 1)
+#define WORK_STRUCT_WQ_DATA_MASK (~WORK_STRUCT_FLAG_MASK)
+
struct work_struct {
atomic_long_t data;
struct list_head entry;
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4666a1a92a31..c913e333cce8 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -705,12 +705,17 @@ static void clear_work_data(struct work_struct *work)
set_work_data(work, WORK_STRUCT_NO_POOL, 0);
}

+static inline struct pool_workqueue *work_struct_pwq(unsigned long data)
+{
+ return (struct pool_workqueue *)(data & WORK_STRUCT_WQ_DATA_MASK);
+}
+
static struct pool_workqueue *get_work_pwq(struct work_struct *work)
{
unsigned long data = atomic_long_read(&work->data);

if (data & WORK_STRUCT_PWQ)
- return (void *)(data & WORK_STRUCT_WQ_DATA_MASK);
+ return work_struct_pwq(data);
else
return NULL;
}
@@ -738,8 +743,7 @@ static struct worker_pool *get_work_pool(struct work_struct *work)
assert_rcu_or_pool_mutex();

if (data & WORK_STRUCT_PWQ)
- return ((struct pool_workqueue *)
- (data & WORK_STRUCT_WQ_DATA_MASK))->pool;
+ return work_struct_pwq(data)->pool;

pool_id = data >> WORK_OFFQ_POOL_SHIFT;
if (pool_id == WORK_OFFQ_POOL_NONE)
@@ -760,8 +764,7 @@ static int get_work_pool_id(struct work_struct *work)
unsigned long data = atomic_long_read(&work->data);

if (data & WORK_STRUCT_PWQ)
- return ((struct pool_workqueue *)
- (data & WORK_STRUCT_WQ_DATA_MASK))->pool->id;
+ return work_struct_pwq(data)->pool->id;

return data >> WORK_OFFQ_POOL_SHIFT;
}