Re: [PATCH v2] net: sched: fix memory leak in tcindex_set_parms

From: Jakub Kicinski
Date: Tue Nov 15 2022 - 21:44:50 EST


On Tue, 15 Nov 2022 19:57:10 +0100 Paolo Abeni wrote:
> This code confuses me more than a bit, and I don't follow ?!?

It's very confusing :S

For starters I don't know when r != old_r. I mean now it triggers
randomly after the RCU-ification, but in the original code when
it was just a memset(). When would old_r ever not be null and yet
point to a different entry?

> it looks like that at this point:
>
> * the data path could access 'old_r->exts' contents via 'p' just before
> the previous 'tcindex_filter_result_init(old_r, cp, net);' but still
> potentially within the same RCU grace period
>
> * 'tcindex_filter_result_init(old_r, cp, net);' has 'unlinked' the old
> exts from 'p' so that will not be freed by later
> tcindex_partial_destroy_work() 
>
> Overall it looks to me that we need some somewhat wait for the RCU
> grace period,

Isn't it better to make @cp a deeper copy of @p ?
I thought it already is but we don't seem to be cloning p->h.
Also the cloning of p->perfect looks quite lossy.

> Somewhat side question: it looks like that the 'perfect hashing' usage
> is the root cause of the issue addressed here, and very likely is
> afflicted by other problems, e.g. the data curruption in 'err =
> tcindex_filter_result_init(old_r, cp, net);'.
>
> AFAICS 'perfect hashing' usage is a sort of optimization that the user-
> space may trigger with some combination of the tcindex arguments. I'm
> wondering if we could drop all perfect hashing related code?

The thought of "how much of this can we delete" did cross my mind :)