Re: [PATCH] net: sched: fix memory leak in tcindex_set_parms

From: Hawkins Jiawei
Date: Thu Nov 03 2022 - 12:08:12 EST


Hi Jakub,
On Thu, 3 Nov 2022 at 11:26, Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> On Mon, 31 Oct 2022 14:08:35 +0800 Hawkins Jiawei wrote:
> > Kernel will uses tcindex_change() to change an existing
>
> s/will//
>
> > traffic-control-indices filter properties. During the
> > process of changing, kernel will clears the old
>
> s/will//
>
> > traffic-control-indices filter result, and updates it
> > by RCU assigning new traffic-control-indices data.
> >
> > Yet the problem is that, kernel will clears the old
>
> s/will//
Thanks for the suggestion. I will amend these in the v2 patch.

>
> > traffic-control-indices filter result, without destroying
> > its tcf_exts structure, which triggers the above
> > memory leak.
> >
> > This patch solves it by using tcf_exts_destroy() to
> > destroy the tcf_exts structure in old
> > traffic-control-indices filter result.
> >
>
> Please provide a Fixes tag to where the problem was introduced
> (or the initial git commit).
Thanks for reminding, it seems that the problem was
introduced by commit
b9a24bb76bf6 ("net_sched: properly handle failure case of tcf_exts_init()"),
because it was in this commit that kernel allocated the struct tcf_exts
for new traffic-control-indices filter result in tcindex_alloc_perfect_hash().

I will add the tag in the v2 patch.

>
> > Link: https://lore.kernel.org/all/0000000000001de5c505ebc9ec59@xxxxxxxxxx/
> > Reported-by: syzbot+232ebdbd36706c965ebf@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Tested-by: syzbot+232ebdbd36706c965ebf@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Signed-off-by: Hawkins Jiawei <yin31149@xxxxxxxxx>
> > ---
> > net/sched/cls_tcindex.c | 14 ++++++++++++++
> > 1 file changed, 14 insertions(+)
> >
> > diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c
> > index 1c9eeb98d826..dc872a794337 100644
> > --- a/net/sched/cls_tcindex.c
> > +++ b/net/sched/cls_tcindex.c
> > @@ -338,6 +338,9 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
> > struct tcf_result cr = {};
> > int err, balloc = 0;
> > struct tcf_exts e;
> > +#ifdef CONFIG_NET_CLS_ACT
> > + struct tcf_exts old_e = {};
> > +#endif
>
> Why all the ifdefs?
Thanks for suggestion, it seems that these ifdefs are not needed.
I will delete these in the v2 patch.

>
> > err = tcf_exts_init(&e, net, TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE);
> > if (err < 0)
> > @@ -479,6 +482,14 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
> > }
> >
> > if (old_r && old_r != r) {
> > +#ifdef CONFIG_NET_CLS_ACT
> > + /* r->exts is not copied from old_r->exts, and
> > + * the following code will clears the old_r, so
> > + * we need to destroy it after updating the tp->root,
> > + * to avoid memory leak bug.
> > + */
> > + old_e = old_r->exts;
> > +#endif
>
> Can't you localize all the changes to this if block?
>
> Maybe add a function called tcindex_filter_result_reinit()
> which will act more appropriately?
I think we shouldn't put the tcf_exts_destroy(&old_e)
into this if block, or other RCU readers may derefer the
freed memory (Please correct me If I am wrong).

So I put the tcf_exts_destroy(&old_e) near the tcindex
destroy work, after the RCU updateing.

>
> > err = tcindex_filter_result_init(old_r, cp, net);
> > if (err < 0) {
> > kfree(f);
> > @@ -510,6 +521,9 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
> > tcf_exts_destroy(&new_filter_result.exts);
> > }
> >
> > +#ifdef CONFIG_NET_CLS_ACT
> > + tcf_exts_destroy(&old_e);
> > +#endif
> > if (oldp)
> > tcf_queue_work(&oldp->rwork, tcindex_partial_destroy_work);
> > return 0;