Re: [PATCH v8 2/5] mm,page_owner: Implement the tracking of the stacks count

From: Oscar Salvador
Date: Tue Feb 13 2024 - 04:51:39 EST


On Tue, Feb 13, 2024 at 10:16:04AM +0100, Vlastimil Babka wrote:
> On 2/12/24 23:30, Oscar Salvador wrote:
> > +static void add_stack_record_to_list(struct stack_record *stack_record)
> > +{
> > + unsigned long flags;
> > + struct stack *stack;
> > +
> > + stack = kmalloc(sizeof(*stack), GFP_KERNEL);
>
> I doubt you can use GFP_KERNEL unconditionally? Think you need to pass down
> the gfp flags from __set_page_owner() here?

Yes, we should re-use the same gfp flags.

> And what about the alloc failure case, this will just leave the stack record
> unlinked forever? Can we somehow know which ones we failed to link, and try
> next time? Probably easier by not recording the stack for the page at all in
> that case, so the next attempt with the same stack looks like the very first
> again and attemps the add to list.

Well, it is not that easy.
We could, before even calling into stackdepot to save the trace, try to
allocate a "struct stack", and skip everything if that fails, so
subsequent calls will try again as if this was the first time.

The thing I do not like about that is that it gets in the way of
traditional page_owner functionality (to put it a name), meaning that
if we cannot allocate a "struct stack", we also do not log the page
and the stack as we usually do, which means we lose that information.

One could argue that if we are so screwed that we cannot allocate that
struct, we may also fail deep in stackdepot code when allocating
the stack_record struct, but who knows.
I tried to keep both features as independent as possible.

> Still not happy that these extra tracking objects are needed, but I guess
> the per-users stack depot instances I suggested would be a major change.

Well, it is definitely something I could have a look in a short-term
future, to see if it makes sense, but for now I would go this the
current state as it has a well balanced complexity vs optimization.

> > + if (stack_record) {
> > + /*
> > + * New stack_record's that do not use STACK_DEPOT_FLAG_GET start
> > + * with REFCOUNT_SATURATED to catch spurious increments of their
> > + * refcount.
> > + * Since we do not use STACK_DEPOT_FLAG_{GET,PUT} API, let us
> > + * set a refcount of 1 ourselves.
> > + */
> > + if (refcount_read(&stack_record->count) == REFCOUNT_SATURATED) {
> > + refcount_set(&stack_record->count, 1);
>
> Isn't this racy? Shouldn't we use some atomic cmpxchg operation to change
> from REFCOUNT_SATURATED to 1?

Yeah, missed that. Probably check it under the lock as Maroc suggested.

Thanks Vlastimil for the feedback!


--
Oscar Salvador
SUSE Labs