Re: [RFC 2/2] kread: avoid duplicates

From: Luis Chamberlain
Date: Sun Apr 16 2023 - 14:47:11 EST


On Sun, Apr 16, 2023 at 02:50:01PM +0200, Greg KH wrote:
> On Sat, Apr 15, 2023 at 11:41:28PM -0700, Luis Chamberlain wrote:
> > On Sat, Apr 15, 2023 at 11:04:12PM -0700, Christoph Hellwig wrote:
> > > On Thu, Apr 13, 2023 at 10:28:40PM -0700, Luis Chamberlain wrote:
> > > > With this we run into 0 wasted virtual memory bytes.
> > >
> > > Avoid what duplicates?
> >
> > David Hildenbrand had reported that with over 400 CPUs vmap space
> > runs out, and it seemed to be related to module loading. I took a
> > look and confirmed it. Module loading ends up requiring up to 3
> > vmalloc allocations in the worst case: typically at least twice
> > the module size, and in the worst case the decompressed module
> > size on top of that:
> >
> > a) initial kernel_read*() call
> > b) optional module decompression
> > c) the actual module data copy we will keep
> >
> > Duplicate module requests that come from userspace end up being thrown
> > in the trash bin, as only one module will be allocated. Although there
> > are checks for an existing module prior to requesting one, udev still
> > doesn't do the best job of avoiding that, and so we end up with tons of
> > duplicate module requests. We're talking about gigabytes of vmalloc
> > bytes just lost because of this on large systems and megabytes on
> > average systems. So for example with just 255 CPUs we can lose about
> > 13.58 GiB, and for 8 CPUs about 226.53 MiB.
>
> How does the memory get "lost"? Shouldn't it be properly freed when the
> duplicate module load fails?

Yes, the memory gets freed, but since virtual memory space can be limited,
you can eventually reach the point where -ENOMEM errors happen as the
CPU count grows: that vmalloc space is not available for other things
during kernel bootup, and bootup fails. This is apparently exacerbated
with KASAN enabled.
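
To make the arithmetic concrete, here is a rough back-of-the-envelope
model of the a) / b) / c) allocations listed above. It is only a sketch
and not kernel code: the function names and the sizes in the usage note
are hypothetical, the per-load total is cumulative (the buffers are not
necessarily all live at the same time), and the duplicate figure is an
upper bound which assumes every rejected duplicate went through every
allocation first.

```python
# Back-of-the-envelope model, not kernel code. All names and inputs here
# are hypothetical and only illustrate the a) / b) / c) accounting.

MiB = 1024 * 1024


def vmalloc_bytes_per_load(file_size, module_size, compressed):
    """Cumulative vmalloc bytes requested by one module load attempt."""
    total = file_size          # a) buffer for the initial kernel_read*() call
    if compressed:
        total += module_size   # b) optional buffer for the decompressed module
    total += module_size       # c) the module data copy that is kept
    return total


def vmalloc_bytes_for_duplicates(requests, file_size, module_size, compressed):
    """Upper bound on vmalloc bytes spent on requests that get thrown away.

    Assumes only one of the requests succeeds and every other one performed
    all of its allocations before being rejected.
    """
    duplicates = max(requests - 1, 0)
    return duplicates * vmalloc_bytes_per_load(file_size, module_size, compressed)
```

With a made-up 10 MiB uncompressed module,
vmalloc_bytes_per_load(10 * MiB, 10 * MiB, False) comes to 20 MiB, which
is the "at least twice the module size" case, and 8 overlapping requests
for it would spend up to 7 * 20 MiB = 140 MiB on loads that are thrown
away.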

Luis