Re: [PATCH v2] net: sched: fix memory leak in tcindex_set_parms

From: Hawkins Jiawei
Date: Wed Nov 16 2022 - 08:21:50 EST


On Wed, 16 Nov 2022 at 20:10, Hawkins Jiawei <yin31149@xxxxxxxxx> wrote:
>
> On Wed, 16 Nov 2022 at 10:44, Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
> >
> > On Tue, 15 Nov 2022 19:57:10 +0100 Paolo Abeni wrote:
> > > This code confuses me more than a bit, and I don't follow ?!?
> >
> > It's very confusing :S
> >
> > For starters I don't know when r != old_r. I mean now it triggers
> > randomly after the RCU-ification, but in the original code when
> > it was just a memset(). When would old_r ever not be null and yet
> > point to a different entry?
>
> I am also confused about the code when I tried to fix this bug.
>
> As for when `old_r != r`, according to the simplified
> code below, this should be probably true if `p->perfect` is true
> or `!p->perfect && !pc->h` is true(please correct me if I am wrong)
>
> struct tcindex_filter_result new_filter_result, *old_r = r;
> struct tcindex_data *cp = NULL, *oldp;
> struct tcf_result cr = {};
>
> /* tcindex_data attributes must look atomic to classifier/lookup so
> * allocate new tcindex data and RCU assign it onto root. Keeping
> * perfect hash and hash pointers from old data.
> */
> cp = kzalloc(sizeof(*cp), GFP_KERNEL);
>
> if (p->perfect) {
> if (tcindex_alloc_perfect_hash(net, cp) < 0)
> goto errout;
> ...
> }
> cp->h = p->h;
>
> if (!cp->perfect && !cp->h) {
> if (valid_perfect_hash(cp)) {
> if (tcindex_alloc_perfect_hash(net, cp) < 0)
> goto errout_alloc;
>
> } else {
> struct tcindex_filter __rcu **hash;
>
> hash = kcalloc(cp->hash,
> sizeof(struct tcindex_filter *),
> GFP_KERNEL);
>
> if (!hash)
> goto errout_alloc;
>
> cp->h = hash;
> }
> }
> ...
>
> if (cp->perfect)
> r = cp->perfect + handle;
> else
> r = tcindex_lookup(cp, handle) ? : &new_filter_result;
>
> if (old_r && old_r != r) {
> err = tcindex_filter_result_init(old_r, cp, net);
> if (err < 0) {
> kfree(f);
> goto errout_alloc;
> }
> }
>
> * If `p->perfect` is true, tcindex_alloc_perfect_hash() newly
> alloctes cp->perfect.
>
> * If `!p->perfect && !p->h` is true, cp->perfect or cp->h is
> newly allocated.
>
> In either case, r probably points to the newly allocated memory,
> which should not equals to the old_r.

Sorry for the error. In the second case, `r` is possibly
pointing to the `&new_filter_result`, which is a stack variable
address, and should still not equal to the `old_r`.

>
> >
> > > it looks like that at this point:
> > >
> > > * the data path could access 'old_r->exts' contents via 'p' just before
> > > the previous 'tcindex_filter_result_init(old_r, cp, net);' but still
> > > potentially within the same RCU grace period
> > >
> > > * 'tcindex_filter_result_init(old_r, cp, net);' has 'unlinked' the old
> > > exts from 'p' so that will not be freed by later
> > > tcindex_partial_destroy_work()
> > >
> > > Overall it looks to me that we need some somewhat wait for the RCU
> > > grace period,
> >
> > Isn't it better to make @cp a deeper copy of @p ?
> > I thought it already is but we don't seem to be cloning p->h.
> > Also the cloning of p->perfect looks quite lossy.
>
> Yes, I also think @cp should be a deeper copy of @p.
>
> But it seems that in tcindex_alloc_perfect_hash(),
> each @cp ->exts will be initialized by tcf_exts_init()
> as below, and tcindex_set_parms() forgets to free the
> old ->exts content, triggering this memory leak.(Please
> correct me if I am wrong)
>
> static int tcindex_alloc_perfect_hash(struct net *net,
> struct tcindex_data *cp)
> {
> int i, err = 0;
>
> cp->perfect = kcalloc(cp->hash, sizeof(struct tcindex_filter_result),
> GFP_KERNEL | __GFP_NOWARN);
>
> for (i = 0; i < cp->hash; i++) {
> err = tcf_exts_init(&cp->perfect[i].exts, net,
> TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE);
> if (err < 0)
> goto errout;
> cp->perfect[i].p = cp;
> }
> }
>
> static inline int tcf_exts_init(struct tcf_exts *exts, struct net *net,
> int action, int police)
> {
> #ifdef CONFIG_NET_CLS_ACT
> exts->type = 0;
> exts->nr_actions = 0;
> /* Note: we do not own yet a reference on net.
> * This reference might be taken later from tcf_exts_get_net().
> */
> exts->net = net;
> exts->actions = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *),
> GFP_KERNEL);
> if (!exts->actions)
> return -ENOMEM;
> #endif
> exts->action = action;
> exts->police = police;
> return 0;
> }
>
> >
> > > Somewhat side question: it looks like that the 'perfect hashing' usage
> > > is the root cause of the issue addressed here, and very likely is
> > > afflicted by other problems, e.g. the data curruption in 'err =
> > > tcindex_filter_result_init(old_r, cp, net);'.
> > >
> > > AFAICS 'perfect hashing' usage is a sort of optimization that the user-
> > > space may trigger with some combination of the tcindex arguments. I'm
> > > wondering if we could drop all perfect hashing related code?
> >
> > The thought of "how much of this can we delete" did cross my mind :)