Re: [PATCH 34/40] lib: code tagging context capture support

From: Suren Baghdasaryan
Date: Thu May 04 2023 - 10:31:24 EST


On Thu, May 4, 2023 at 1:04 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Wed 03-05-23 08:18:39, Suren Baghdasaryan wrote:
> > On Wed, May 3, 2023 at 12:36 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Mon 01-05-23 09:54:44, Suren Baghdasaryan wrote:
> > > [...]
> > > > +static inline void add_ctx(struct codetag_ctx *ctx,
> > > > + struct codetag_with_ctx *ctc)
> > > > +{
> > > > + kref_init(&ctx->refcount);
> > > > + spin_lock(&ctc->ctx_lock);
> > > > + ctx->flags = CTC_FLAG_CTX_PTR;
> > > > + ctx->ctc = ctc;
> > > > + list_add_tail(&ctx->node, &ctc->ctx_head);
> > > > + spin_unlock(&ctc->ctx_lock);
> > >
> > > AFAIU every single tracked allocation will get its own codetag_ctx.
> > > There is no aggregation per allocation site or anything else. This looks
> > > like a scalability and a memory overhead red flag to me.
> >
> > True. The allocations here would not be limited. We could introduce a
> > global limit to the amount of memory that we can use to store contexts
> > and maybe reuse the oldest entry (in LRU fashion) when we hit that
> > limit?
>
> Wouldn't it make more sense to aggregate same allocations? Sure pids
> get recycled but quite honestly I am not sure that information is all
> that interesting. Precisely because of the recycle and short lived
> processes reasons. I think there is quite a lot to think about the
> detailed context tracking.

That would be a nice optimization. I'll need to look into the
implementation details. Thanks for the idea.

>
> > >
> > > > +}
> > > > +
> > > > +static inline void rem_ctx(struct codetag_ctx *ctx,
> > > > + void (*free_ctx)(struct kref *refcount))
> > > > +{
> > > > + struct codetag_with_ctx *ctc = ctx->ctc;
> > > > +
> > > > + spin_lock(&ctc->ctx_lock);
> > >
> > > This could deadlock when allocator is called from the IRQ context.
> >
> > I see. spin_lock_irqsave() then?
>
> yes. I have checked that the lock is not held over the all list
> traversal which is good but the changelog could be more explicit about
> the iterators and lock hold times implications.

Ack. Will add more information.

>
> --
> Michal Hocko
> SUSE Labs