Re: [PATCH v2] net: sched: fix memory leak in tcindex_set_parms

From: Hawkins Jiawei
Date: Wed Nov 16 2022 - 07:16:42 EST


On Wed, 16 Nov 2022 at 10:44, Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> On Tue, 15 Nov 2022 19:57:10 +0100 Paolo Abeni wrote:
> > This code confuses me more than a bit, and I don't follow ?!?
>
> It's very confusing :S
>
> For starters I don't know when r != old_r. I mean now it triggers
> randomly after the RCU-ification, but in the original code when
> it was just a memset(). When would old_r ever not be null and yet
> point to a different entry?

I am also confused about the code when I tried to fix this bug.

As for when `old_r != r`, according to the simplified
code below, this should be probably true if `p->perfect` is true
or `!p->perfect && !pc->h` is true(please correct me if I am wrong)

struct tcindex_filter_result new_filter_result, *old_r = r;
struct tcindex_data *cp = NULL, *oldp;
struct tcf_result cr = {};

/* tcindex_data attributes must look atomic to classifier/lookup so
* allocate new tcindex data and RCU assign it onto root. Keeping
* perfect hash and hash pointers from old data.
*/
cp = kzalloc(sizeof(*cp), GFP_KERNEL);

if (p->perfect) {
if (tcindex_alloc_perfect_hash(net, cp) < 0)
goto errout;
...
}
cp->h = p->h;

if (!cp->perfect && !cp->h) {
if (valid_perfect_hash(cp)) {
if (tcindex_alloc_perfect_hash(net, cp) < 0)
goto errout_alloc;

} else {
struct tcindex_filter __rcu **hash;

hash = kcalloc(cp->hash,
sizeof(struct tcindex_filter *),
GFP_KERNEL);

if (!hash)
goto errout_alloc;

cp->h = hash;
}
}
...

if (cp->perfect)
r = cp->perfect + handle;
else
r = tcindex_lookup(cp, handle) ? : &new_filter_result;

if (old_r && old_r != r) {
err = tcindex_filter_result_init(old_r, cp, net);
if (err < 0) {
kfree(f);
goto errout_alloc;
}
}

* If `p->perfect` is true, tcindex_alloc_perfect_hash() newly
alloctes cp->perfect.

* If `!p->perfect && !p->h` is true, cp->perfect or cp->h is
newly allocated.

In either case, r probably points to the newly allocated memory,
which should not equals to the old_r.

>
> > it looks like that at this point:
> >
> > * the data path could access 'old_r->exts' contents via 'p' just before
> > the previous 'tcindex_filter_result_init(old_r, cp, net);' but still
> > potentially within the same RCU grace period
> >
> > * 'tcindex_filter_result_init(old_r, cp, net);' has 'unlinked' the old
> > exts from 'p' so that will not be freed by later
> > tcindex_partial_destroy_work()
> >
> > Overall it looks to me that we need some somewhat wait for the RCU
> > grace period,
>
> Isn't it better to make @cp a deeper copy of @p ?
> I thought it already is but we don't seem to be cloning p->h.
> Also the cloning of p->perfect looks quite lossy.

Yes, I also think @cp should be a deeper copy of @p.

But it seems that in tcindex_alloc_perfect_hash(),
each @cp ->exts will be initialized by tcf_exts_init()
as below, and tcindex_set_parms() forgets to free the
old ->exts content, triggering this memory leak.(Please
correct me if I am wrong)

static int tcindex_alloc_perfect_hash(struct net *net,
struct tcindex_data *cp)
{
int i, err = 0;

cp->perfect = kcalloc(cp->hash, sizeof(struct tcindex_filter_result),
GFP_KERNEL | __GFP_NOWARN);

for (i = 0; i < cp->hash; i++) {
err = tcf_exts_init(&cp->perfect[i].exts, net,
TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE);
if (err < 0)
goto errout;
cp->perfect[i].p = cp;
}
}

static inline int tcf_exts_init(struct tcf_exts *exts, struct net *net,
int action, int police)
{
#ifdef CONFIG_NET_CLS_ACT
exts->type = 0;
exts->nr_actions = 0;
/* Note: we do not own yet a reference on net.
* This reference might be taken later from tcf_exts_get_net().
*/
exts->net = net;
exts->actions = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *),
GFP_KERNEL);
if (!exts->actions)
return -ENOMEM;
#endif
exts->action = action;
exts->police = police;
return 0;
}

>
> > Somewhat side question: it looks like that the 'perfect hashing' usage
> > is the root cause of the issue addressed here, and very likely is
> > afflicted by other problems, e.g. the data curruption in 'err =
> > tcindex_filter_result_init(old_r, cp, net);'.
> >
> > AFAICS 'perfect hashing' usage is a sort of optimization that the user-
> > space may trigger with some combination of the tcindex arguments. I'm
> > wondering if we could drop all perfect hashing related code?
>
> The thought of "how much of this can we delete" did cross my mind :)