Re: [PATCH v3] mm: fix tick timer stall during deferred page init

From: Daniel Jordan
Date: Fri Mar 27 2020 - 20:17:31 EST


On Fri, Mar 27, 2020 at 12:39:18PM +0800, Shile Zhang wrote:
> On 2020/3/27 03:36, Pavel Tatashin wrote:
> > I agree with Daniel, we should look into an approach where
> > pgdat_resize_lock is taken only for the duration of updating tracking
> > values such as pgdat->first_deferred_pfn (perhaps we would need to add
> > another tracker that would show chunks that are currently being worked
> > on).
> >
> > The vast majority of the struct page initialization process should
> > happen outside of this lock, which should be taken only when we update
> > globally visible data structures: lists, tracking variables. This way
> > we can solve several problems: 1. allow interrupt threads to grow zones
> > if required. 2. keep jiffies happy. 3. allow future scaling when we
> > add inner node threads to initialize struct pages (i.e. ktasks from
> > Daniel).
>
> It makes sense; looking forward to the inner node parallel init.
>
> @Daniel
> Is there a schedule for ktasks?

Yep, and it's now padata multithreading instead of ktask since we already have
'task' in the kernel.

The current plan is to start with users in the system context, that is, those
that don't require userland resource controls such as cgroup. So I'll post a
new version of this timestamp fix pretty soon and then likely post a series
that multithreads page init.

Future work is tentatively to cover other system users, then remote charging
for the CPU controller, and then users that can be accounted with cgroup.
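
For illustration only, the locking scheme Pavel describes above (hold the lock
just long enough to update the tracker, do the struct page work outside it)
might look roughly like the user-space sketch below. It uses pthreads rather
than kernel primitives, and fake_pgdat, fake_page, claim_chunk, init_worker,
deferred_init, NR_PAGES, CHUNK_SIZE and MAX_THREADS are all made-up names;
only first_deferred_pfn and the idea of the resize lock come from this thread.
It also leaves out the extra "chunks in flight" tracker Pavel mentions. It is
not the posted patch.

```c
/* User-space sketch only; every name here is hypothetical, not a kernel API. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_PAGES    4096
#define CHUNK_SIZE  256
#define MAX_THREADS 16

struct fake_page {
	int initialized;
};

struct fake_pgdat {
	pthread_mutex_t resize_lock;	/* stands in for pgdat_resize_lock */
	size_t first_deferred_pfn;	/* next pfn that nobody has claimed */
	size_t end_pfn;
	struct fake_page pages[NR_PAGES];
};

/* Claim the next chunk. The lock covers only the tracker update. */
static int claim_chunk(struct fake_pgdat *pgdat, size_t *start, size_t *end)
{
	int claimed = 0;

	pthread_mutex_lock(&pgdat->resize_lock);
	if (pgdat->first_deferred_pfn < pgdat->end_pfn) {
		*start = pgdat->first_deferred_pfn;
		*end = *start + CHUNK_SIZE;
		if (*end > pgdat->end_pfn)
			*end = pgdat->end_pfn;
		pgdat->first_deferred_pfn = *end;
		claimed = 1;
	}
	pthread_mutex_unlock(&pgdat->resize_lock);

	return claimed;
}

/* Worker: the long-running initialization runs with the lock dropped. */
static void *init_worker(void *arg)
{
	struct fake_pgdat *pgdat = arg;
	size_t start, end, pfn;

	while (claim_chunk(pgdat, &start, &end)) {
		for (pfn = start; pfn < end; pfn++)
			pgdat->pages[pfn].initialized = 1;
	}

	return NULL;
}

/* Run nr_threads workers over the whole range; return pages initialized. */
static size_t deferred_init(struct fake_pgdat *pgdat, int nr_threads)
{
	pthread_t threads[MAX_THREADS];
	size_t pfn, count = 0;
	int i, ret;

	assert(nr_threads > 0 && nr_threads <= MAX_THREADS);

	pthread_mutex_init(&pgdat->resize_lock, NULL);
	pgdat->first_deferred_pfn = 0;
	pgdat->end_pfn = NR_PAGES;
	for (pfn = 0; pfn < NR_PAGES; pfn++)
		pgdat->pages[pfn].initialized = 0;

	for (i = 0; i < nr_threads; i++) {
		ret = pthread_create(&threads[i], NULL, init_worker, pgdat);
		assert(ret == 0);
	}
	for (i = 0; i < nr_threads; i++)
		pthread_join(threads[i], NULL);

	for (pfn = 0; pfn < NR_PAGES; pfn++)
		count += pgdat->pages[pfn].initialized;

	pthread_mutex_destroy(&pgdat->resize_lock);
	return count;
}
```

Calling deferred_init(&pgdat, 4) on a statically allocated struct fake_pgdat
should leave every page marked initialized, with each worker holding the lock
only for the few instructions that advance first_deferred_pfn.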