Re: [PATCH v10 3/3] mm: add anonymous vma name refcounting

From: Kees Cook
Date: Tue Oct 05 2021 - 15:21:34 EST


On Tue, Oct 05, 2021 at 12:14:59PM -0700, Suren Baghdasaryan wrote:
> On Tue, Oct 5, 2021 at 11:42 AM Pavel Machek <pavel@xxxxxx> wrote:
> >
> > On Fri 2021-10-01 13:56:57, Suren Baghdasaryan wrote:
> > > While forking a process with a high number (64K) of named anonymous vmas the
> > > overhead caused by strdup() is noticeable. Experiments with ARM64 Android
> >
> > I still believe you should simply use numbers and do the
> > numbers->strings mapping in userspace. We should not need to optimize
> > strdups in kernel...
>
> Here are the complications with mapping numbers to strings in userspace:
> Approach 1: hardcode number->string in some header file and let all
> tools use that mapping. The issue is that whenever that mapping
> changes all the tools that are using it (including 3rd party ones)
> have to be rebuilt. This is not really maintainable since we don't
> control 3rd party tools and even for the ones we control, it will be a
> maintenance issue figuring out which version of the tool used which
> header file.
> Approach 2: have a centralized facility (a process or a DB)
> maintaining number->string mapping. This would require an additional
> request to this facility whenever we want to make a number->string
> conversion. Moreover, when we want to name a VMA, we would have to
> register a new VMA name in that facility or check that one already
> exists and get its ID. So each prctl() call to name a VMA will be
> preceded by such a request (IPC call), maybe with some optimizations
> to cache already known number->string pairs. This would be quite
> expensive performance-wise. An additional issue with this approach is
> that the mapping would have to be persistent to handle the case where
> the facility crashes and has to be restored.
>
> As I said before, it complicates userspace quite a bit. Is that a good
> enough reason to store the names in the kernel and pay a little more
> memory for that? IMHO yes, but I might be wrong.

FWIW, I prefer the strings. It's more human-readable, which is important
for the kinds of cases where the maps are being used for diagnostics,
etc.
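
To make the diagnostics point concrete, here is a minimal sketch of
naming a mapping and then finding it by name in /proc/self/maps. It
assumes the prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...) interface
discussed in this series and a kernel built with anonymous VMA naming
enabled; the fallback constant values and both helper names
(name_anon_region, maps_contains) are assumptions for illustration, not
taken from the patch itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/prctl.h>

/* Assumed values, used only if the installed headers do not define them. */
#ifndef PR_SET_VMA
#define PR_SET_VMA 0x53564d41
#endif
#ifndef PR_SET_VMA_ANON_NAME
#define PR_SET_VMA_ANON_NAME 0
#endif

/*
 * Hypothetical helper: attach a string name to an anonymous mapping.
 * Returns 0 on success, or -1 with errno set (for example EINVAL on a
 * kernel that does not support naming anonymous VMAs).
 */
int name_anon_region(void *addr, size_t len, const char *name)
{
	return prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME,
		     (unsigned long)addr, (unsigned long)len,
		     (unsigned long)name);
}

/*
 * Hypothetical helper: return 1 if any line of /proc/self/maps contains
 * the given substring, 0 otherwise (including when the file cannot be
 * opened).
 */
int maps_contains(const char *needle)
{
	char line[512];
	int found = 0;
	FILE *f;

	assert(needle != NULL);
	f = fopen("/proc/self/maps", "r");
	if (!f)
		return 0;
	while (fgets(line, sizeof(line), f)) {
		if (strstr(line, needle)) {
			found = 1;
			break;
		}
	}
	fclose(f);
	return found;
}
```

A caller would mmap() a private anonymous region, pass it to
name_anon_region() with a name such as "demo_region", and, if the call
succeeds, expect to see that string in the region's maps entry. With
numeric IDs the same entry would carry only a number that has to be
translated by some other tool before a human can read it.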

--
Kees Cook