Re: [RFC 00/14] Dynamic Kernel Stacks

From: Kent Overstreet
Date: Thu Mar 14 2024 - 15:06:01 EST


On Tue, Mar 12, 2024 at 02:36:27PM -0700, H. Peter Anvin wrote:
> On 3/12/24 12:45, Pasha Tatashin wrote:
> > >
> > > Ok, first of all, talking about "kernel memory" here is misleading.
> >
> > Hi Peter,
> >
> > I re-read my cover letter, and I do not see where "kernel memory" is
> > mentioned. We are talking about kernel stacks overhead that is
> > proportional to the user workload, as every active thread has an
> > associated kernel stack. The idea is to save memory by not
> > pre-allocating all pages of kernel-stacks, but instead use it as a
> > safeguard when a stack actually becomes deep. The goal is to come up
> > with a solution that handles rare deeper stacks only when needed. This
> > could be done through faulting on supported hardware (as proposed in
> > this series), or by pre-mapping on every schedule event and checking
> > the access when the thread goes off CPU (as proposed by Andy
> > Lutomirski to avoid double faults on x86).
> >
> > In other words, this feature is only about one very specific type of
> > kernel memory that is not even directly mapped (the feature requires
> > vmapped stacks).
> >
> > > Unless your threads are spending nearly all their time sleeping, the
> > > threads will occupy stack and TLS memory in user space as well.
> >
> > Can you please elaborate on what data is contained in the kernel stack
> > when a thread is in user space? My series requires thread_info not to
> > be in the stack by depending on THREAD_INFO_IN_TASK.
> >
>
> My point is that what matters is total memory use, not just memory used in
> the kernel. Amdahl's law.

If userspace is running a few processes with many threads and the
userspace stacks are small, kernel stacks could end up dominating.

I'd like to see some numbers though.
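
As a starting point, here is a minimal sketch of how such numbers could be
collected on a running system. It assumes the KernelStack field of
/proc/meminfo is an acceptable proxy for total kernel stack memory, and it
only compares that against MemTotal, not against userspace stack usage.
The helper names (parse_meminfo, kernel_stack_share, report) are made up
for illustration and are not part of the series.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their numeric value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def kernel_stack_share(fields):
    """Fraction of MemTotal currently accounted as KernelStack."""
    return fields["KernelStack"] / fields["MemTotal"]


def report(path="/proc/meminfo"):
    """Read meminfo from `path` and format the kernel stack share as text."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    share = 100 * kernel_stack_share(info)
    return f"KernelStack: {info['KernelStack']} kB ({share:.2f}% of MemTotal)"
```

Running print(report()) on the workload in question, ideally alongside a
measurement of userspace stack usage, would show whether kernel stacks
really dominate there.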