Re: [PATCH] mm/page_owner: Record timestamp and pid

From: Vlastimil Babka
Date: Fri Nov 27 2020 - 14:10:04 EST


On 11/27/20 7:57 PM, Georgi Djakov wrote:
> Hi Vlastimil,
>
> Thanks for the comment!
>
> On 11/27/20 19:52, Vlastimil Babka wrote:
>> On 11/12/20 8:14 PM, Andrew Morton wrote:
>>> On Thu, 12 Nov 2020 20:41:06 +0200 Georgi Djakov <georgi.djakov@xxxxxxxxxx> wrote:
>>>
>>>> From: Liam Mark <lmark@xxxxxxxxxxxxxx>
>>>>
>>>> Collect the time for each allocation recorded in page owner so that
>>>> allocation "surges" can be measured.
>>>>
>>>> Record the pid for each allocation recorded in page owner so that
>>>> the source of allocation "surges" can be better identified.
>>>
>>> Please provide a description of why this is considered useful.  What
>>> has it been used for, what problems has it been used to solve?
>>
>> Worth noting that on x86_64 it doubles the size of struct page_owner
>> from 16 bytes to 32, so it better be justified:
>
> Well, that's true. But for debug options there is almost always some penalty.
> The timestamp and pid information is very useful for me (and others, I believe)
> when doing memory analysis. On a crash for example, we can get this information
> from kdump (or RAM-dump) and look into it to catch memory allocation problems
> more easily.

Right. Btw, you should add printing the info to __dump_page_owner().
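
To illustrate the kind of output meant here, a minimal userspace sketch follows. It is not the kernel implementation: struct owner_info and format_owner_info() are hypothetical names, and the kernel side would use its own printing helpers inside __dump_page_owner() rather than snprintf().

```c
#include <inttypes.h>   /* PRIu64 */
#include <stdint.h>
#include <stdio.h>      /* snprintf, size_t */

/* Hypothetical userspace stand-in for the two fields the patch adds. */
struct owner_info {
        uint64_t ts_nsec;   /* allocation timestamp in nanoseconds */
        int pid;            /* pid of the allocating task */
};

/*
 * Format the pid and timestamp into buf.
 * Returns what snprintf() returns: the number of characters that would
 * have been written, excluding the terminating NUL.
 */
static int format_owner_info(char *buf, size_t len,
                             const struct owner_info *info)
{
        return snprintf(buf, len, "pid %d, ts %" PRIu64 " ns",
                        info->pid, info->ts_nsec);
}
```

Printing both values next to the existing order and gfp_mask output would make the new fields visible in a dump without any extra tooling.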

> If you find the above argument not strong enough, how about a separate config
> option for this? Maybe something like CONFIG_PAGE_OWNER_EXTENDED, which could
> be enabled in addition to CONFIG_PAGE_OWNER?

It might be strong enough if it's mentioned in the changelog, along with
what exactly the space tradeoff is :)

You can also mention that SLUB object tracking records pid+timestamp as well.

> Thanks,
> Georgi
>
>> struct page_owner {
>>         short unsigned int         order;                /*     0     2 */
>>         short int                  last_migrate_reason;  /*     2     2 */
>>         gfp_t                      gfp_mask;             /*     4     4 */
>>         depot_stack_handle_t       handle;               /*     8     4 */
>>         depot_stack_handle_t       free_handle;          /*    12     4 */
>>         u64                        ts_nsec;              /*    16     8 */
>>         int                        pid;                  /*    24     4 */
>>
>>         /* size: 32, cachelines: 1, members: 7 */
>>         /* padding: 4 */
>>         /* last cacheline: 32 bytes */
>> };
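
The 16 to 32 byte growth can be checked outside the kernel with a small sketch like the one below. It assumes an x86_64 (LP64) ABI and uses 32-bit stand-ins for gfp_t and depot_stack_handle_t; the struct names page_owner_old and page_owner_new and the helper page_owner_growth() are made up for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>

/* Userspace stand-ins for the kernel typedefs, assumed to be 32 bits wide. */
typedef uint32_t gfp_t;
typedef uint32_t depot_stack_handle_t;

/* Layout without the two new fields (hypothetical name). */
struct page_owner_old {
        unsigned short order;
        short last_migrate_reason;
        gfp_t gfp_mask;
        depot_stack_handle_t handle;
        depot_stack_handle_t free_handle;
};

/* Layout with ts_nsec and pid added, matching the pahole output. */
struct page_owner_new {
        unsigned short order;
        short last_migrate_reason;
        gfp_t gfp_mask;
        depot_stack_handle_t handle;
        depot_stack_handle_t free_handle;
        uint64_t ts_nsec;
        int pid;
};

/* These hold on x86_64; other ABIs may align uint64_t differently. */
static_assert(sizeof(struct page_owner_old) == 16, "old size");
static_assert(offsetof(struct page_owner_new, ts_nsec) == 16, "ts_nsec offset");
static_assert(offsetof(struct page_owner_new, pid) == 24, "pid offset");
/* 28 bytes of members, rounded up to the 8-byte alignment of ts_nsec. */
static_assert(sizeof(struct page_owner_new) == 32, "new size");

/* Extra bytes per record caused by the two new fields. */
static inline size_t page_owner_growth(void)
{
        return sizeof(struct page_owner_new) - sizeof(struct page_owner_old);
}
```

Assuming one record per 4 KiB page, the extra 16 bytes amount to roughly 0.39% of tracked memory, or about 4 MiB per GiB of RAM.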