Re: [PATCH] net/mlx5e: fix a double-free in arfs_create_groups

From: alexious
Date: Mon Jan 08 2024 - 04:34:54 EST




> On Sun, Dec 24, 2023 at 04:13:48PM +0800, Zhipeng Lu wrote:
> > When `in` allocated by kvzalloc fails, arfs_create_groups will free
> > ft->g and return an error. However, arfs_create_table, the only caller of
> > arfs_create_groups, will hold this error and call to
> > mlx5e_destroy_flow_table, in which the ft->g will be freed again.
> >
> > Fixes: 1cabe6b0965e ("net/mlx5e: Create aRFS flow tables")
> > Signed-off-by: Zhipeng Lu <alexious@xxxxxxxxxx>
>
> Thanks,
>
> I agree this addresses the issue that you describe.
> And as a minimal fix it looks good.
>
> Reviewed-by: Simon Horman <horms@xxxxxxxxxx>
>
> However, I would like to suggest that some clean-up work could
> take place as a follow-up.
>
> I think that the error handling in this area of the code
> is rather fragile. This is because initialisation is not necessarily
> unwound on error within the function that initialisation occurs.
>
> I think it would be better if arfs_create_groups():
>
> 1. Released allocates resources it allocates, including ft->g and
> elements of ft->g, on error.
> 2. This was achieved by using a goto unwind ladder.
> 3. The caller treated ft->g as uninitialised if
> arfs_create_groups fails.
>

Agree, I think a unwind ladder for arfs_create_groups is much better.
I'll follow this idea to send a v2 patch later.
Another comment below.

> Likewise, I think that:
>
> * arfs_create_groups, should initialise ft->num_groups
>
> And further, logic similar to the above should guide
> how arfs_create_table() initialises ft->t and cleans it
> up on error.
>

I think that ft->t you mentioned refers to mlx5_create_flow_table.
I'd like to make the life cycle of ft->t similar to ft->g in arfs_create_groups,
but it needs to add an argument for mlx5_create_flow_table to transfer ft to
it. However, mlx5_create_flow_table is called in more than 30 different places
throughout the kernel. So such modification could be another refactoring patch
but may be out of this fix patch's duty.

> I did not look at the code beyond the scope described above.
> But the above are general principles that may well apply in
> other nearby code too.
>
> ...