Re: [PATCH] net/mlx5e: fix a double-free in arfs_create_groups

From: Simon Horman
Date: Mon Jan 08 2024 - 06:05:57 EST


On Mon, Jan 08, 2024 at 05:12:06PM +0800, alexious@xxxxxxxxxx wrote:
>
>
> > On Sun, Dec 24, 2023 at 04:13:48PM +0800, Zhipeng Lu wrote:
> > > When the kvzalloc allocation of `in` fails, arfs_create_groups frees
> > > ft->g and returns an error. However, arfs_create_table, the only caller
> > > of arfs_create_groups, handles this error by calling
> > > mlx5e_destroy_flow_table, in which ft->g is freed again.
> > >
> > > Fixes: 1cabe6b0965e ("net/mlx5e: Create aRFS flow tables")
> > > Signed-off-by: Zhipeng Lu <alexious@xxxxxxxxxx>
> >
> > Thanks,
> >
> > I agree this addresses the issue that you describe.
> > And as a minimal fix it looks good.
> >
> > Reviewed-by: Simon Horman <horms@xxxxxxxxxx>
> >
> > However, I would like to suggest that some clean-up work could
> > take place as a follow-up.
> >
> > I think that the error handling in this area of the code
> > is rather fragile. This is because initialisation is not necessarily
> > unwound on error within the function where the initialisation occurs.
> >
> > I think it would be better if arfs_create_groups():
> >
> > 1. Released the resources it allocates, including ft->g and
> > elements of ft->g, on error.
> > 2. Did so by using a goto unwind ladder.
> > 3. The caller treated ft->g as uninitialised if
> > arfs_create_groups fails.
> >
>
> Agreed, I think an unwind ladder for arfs_create_groups is much better.
> I'll follow this idea and send a v2 patch later.

Thanks.
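
For illustration only, here is a userspace sketch of the kind of goto
unwind ladder discussed above. It is not the mlx5e code: struct demo_table,
demo_alloc(), demo_create_groups() and demo_destroy_groups() are all
hypothetical names, and calloc()/free() stand in for the kernel allocators.
The point is that ft->g and ft->num_groups are only written on success, so
a caller can treat them as uninitialised when the function fails.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the flow table; not the mlx5e structure. */
struct demo_table {
	int **g;		/* array of group handles */
	int num_groups;
};

#define DEMO_NUM_GROUPS 2

/*
 * Test helper (an assumption of this sketch): fail the allocation made
 * at step 'fail_at' (0 = group array, 1 = scratch buffer, 2.. = groups).
 * Pass -1 for no injected failure.
 */
static void *demo_alloc(size_t n, int step, int fail_at)
{
	return step == fail_at ? NULL : calloc(1, n);
}

static int demo_create_groups(struct demo_table *ft, int fail_at)
{
	int **g;
	void *in;
	int i = 0;
	int err = -1;	/* stands in for -ENOMEM */

	g = demo_alloc(DEMO_NUM_GROUPS * sizeof(*g), 0, fail_at);
	if (!g)
		goto err_out;

	in = demo_alloc(64, 1, fail_at);
	if (!in)
		goto err_free_g;

	for (i = 0; i < DEMO_NUM_GROUPS; i++) {
		g[i] = demo_alloc(sizeof(**g), 2 + i, fail_at);
		if (!g[i])
			goto err_free_groups;
	}

	free(in);
	/* Publish results only once everything has succeeded. */
	ft->g = g;
	ft->num_groups = DEMO_NUM_GROUPS;
	return 0;

err_free_groups:
	/* Release only the groups created before the failure. */
	while (i--)
		free(g[i]);
	free(in);
err_free_g:
	free(g);
err_out:
	return err;
}

/* Teardown for the success path only. */
static void demo_destroy_groups(struct demo_table *ft)
{
	for (int i = 0; i < ft->num_groups; i++)
		free(ft->g[i]);
	free(ft->g);
	ft->g = NULL;
	ft->num_groups = 0;
}
```

With this shape the caller does not need to free anything belonging to
demo_create_groups() on its error path, which is what removes the
opportunity for a double free.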

> Another comment below.
>
> > Likewise, I think that:
> >
> > * arfs_create_groups() should initialise ft->num_groups
> >
> > And further, logic similar to the above should guide
> > how arfs_create_table() initialises ft->t and cleans it
> > up on error.
> >
>
> I think the ft->t you mentioned refers to the table created by
> mlx5_create_flow_table. I'd like to make the life cycle of ft->t similar to
> that of ft->g in arfs_create_groups, but that would require adding an
> argument to mlx5_create_flow_table to pass ft to it. However,
> mlx5_create_flow_table is called in more than 30 different places
> throughout the kernel. So such a modification could be a separate
> refactoring patch, but is probably beyond the scope of this fix.

I agree there is no need to solve all problems in this patch :)

> > I did not look at the code beyond the scope described above.
> > But the above are general principles that may well apply in
> > other nearby code too.
> >
> > ...