Re: [PATCH 14/14] fs,xfs: Allow kswapd to writeback pages

From: KAMEZAWA Hiroyuki
Date: Mon Jul 05 2010 - 20:50:04 EST


On Mon, 5 Jul 2010 15:16:40 +0100
Mel Gorman <mel@xxxxxxxxx> wrote:

> > > A slightly greater concern is that clean pages can be temporarily "lost"
> > > on the cleaning list. If a direct reclaimer moves pages to the LRU_CLEANING
> > > list, it's no longer considering those pages even if a flusher thread
> > > happened to clean those pages before kswapd had a chance. Let's say under
> > > heavy memory pressure a lot of pages are being dirtied and encountered on
> > > the LRU list. They move to LRU_CLEANING where dirty balancing starts making
> > > sure they get cleaned but are no longer being reclaimed.
> > >
> > > Of course, I might be wrong but it's not a trivial direction to take.
> > >
> >
> > I hope dirty_ratio et al. may help us. But I agree this "hiding" can cause
> > issues.
> > IIRC, someone wrote a patch to prevent too many threads from entering
> > vmscan... that kind of work may be necessary.
> >
>
> Using systemtap, I have found in global reclaim at least that the ratio of
> dirty to clean pages is not a problem. What does appear to be a problem is
> that dirty pages are getting to the end of the inactive file list while
> still dirty but I haven't formulated a theory as to why yet - maybe it's
> because the dirty balancing is cleaning new pages first? Right now, I
> believe dirty_ratio is working as expected but old dirty pages are a problem.
>

Hmm. IIUC, dirty pages put back to the tail of the LRU will be moved to the
head when writeback finishes if PG_reclaim is set. This is probably so that
the next vmscan finds clean pages.
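
To make that concrete, here is a toy userspace model in plain C. It is not
kernel code, and every name in it (toy_page, toy_lru, toy_scan_one,
toy_end_writeback) is made up for illustration. I avoid "head" and "tail" on
purpose: slot 0 is simply the end that vmscan scans next. The assumption is
that a dirty page met by the scan is queued for writeback, flagged, and put
back away from the scan end, and that completion of writeback rotates a
flagged page back to the scan end. I believe the in-kernel counterpart is
rotate_reclaimable_page(), but please treat this as a sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LRU_MAX 16

struct toy_page {
	int id;
	bool dirty;
	bool writeback;
	bool reclaim;		/* stands in for PG_reclaim */
};

/* slot[0] is what the scan looks at next; slot[nr - 1] is furthest away. */
struct toy_lru {
	struct toy_page *slot[LRU_MAX];
	size_t nr;
};

/* Add a page at the end furthest from the scan. */
static void lru_add_far(struct toy_lru *l, struct toy_page *p)
{
	assert(l->nr < LRU_MAX);
	l->slot[l->nr++] = p;
}

/* Add a page at the end the scan visits next. */
static void lru_add_scan_end(struct toy_lru *l, struct toy_page *p)
{
	assert(l->nr < LRU_MAX);
	memmove(&l->slot[1], &l->slot[0], l->nr * sizeof(l->slot[0]));
	l->slot[0] = p;
	l->nr++;
}

/* Remove a page from the list; does nothing if it is not on the list. */
static void lru_del(struct toy_lru *l, struct toy_page *p)
{
	for (size_t i = 0; i < l->nr; i++) {
		if (l->slot[i] != p)
			continue;
		memmove(&l->slot[i], &l->slot[i + 1],
			(l->nr - i - 1) * sizeof(l->slot[0]));
		l->nr--;
		return;
	}
}

/*
 * Scan one page. A clean page is reclaimed and returned. A dirty or
 * under-writeback page is flagged, put back at the far end, and NULL
 * is returned.
 */
static struct toy_page *toy_scan_one(struct toy_lru *l)
{
	struct toy_page *p;

	if (l->nr == 0)
		return NULL;
	p = l->slot[0];
	lru_del(l, p);
	if (p->dirty || p->writeback) {
		if (p->dirty) {
			p->dirty = false;	/* pretend writeback was started */
			p->writeback = true;
		}
		p->reclaim = true;
		lru_add_far(l, p);
		return NULL;
	}
	return p;
}

/* Writeback completed: a flagged page is rotated back to the scan end. */
static void toy_end_writeback(struct toy_lru *l, struct toy_page *p)
{
	p->writeback = false;
	if (p->reclaim) {
		p->reclaim = false;
		lru_del(l, p);
		lru_add_scan_end(l, p);
	}
}
```

With pages A (dirty), B and C on the list, the first scan skips A and moves it
away; after toy_end_writeback() on A, the next scan finds A clean right away
instead of walking over B and C first.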


> > > > <SNIP>
> > > > @@ -2275,7 +2422,9 @@ static int kswapd(void *p)
> > > >  		prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
> > > >  		new_order = pgdat->kswapd_max_order;
> > > >  		pgdat->kswapd_max_order = 0;
> > > > -		if (order < new_order) {
> > > > +		if (need_to_cleaning_node(pgdat)) {
> > > > +			launder_pgdat(pgdat);
> > > > +		} else if (order < new_order) {
> > > >  			/*
> > > >  			 * Don't sleep if someone wants a larger 'order'
> > > >  			 * allocation
> > >
> > > I see the direction you are thinking of but I have big concerns about clean
> > > pages getting delayed for too long on the LRU_CLEANING list before kswapd
> > > puts them back in the right place. I think a safer direction would be for
> > > memcg people to investigate Andrea's "switch stack" suggestion.
> > >
> >
> > Hmm, I may have to consider that. My concern is that the IRQ switch-stack
> > works well only because there is no task switch inside an IRQ routine.
> > (I'm sorry if I misunderstand.)
> >
> > One possibility for memcg would be to limit the number of reclaimers that
> > can use __GFP_FS and to use a shared stack per cpu per memcg.
> >
> > Hmm. Yet another per-memcg memory shrinker may sound good. 2 years ago, I
> > wrote a patch to add a high/low-watermark memory shrinker thread for memcg.
> >
> > - limit
> > - high
> > - low
> >
> > Start memory reclaim/writeback when usage exceeds "high" and stop when it
> > is below "low". Implementing this with a thread pool can be a choice.
> >
>
> Indeed, maybe something like a kswapd-memcg thread that is shared between
> a configurable number of containers?
>
Yes, I am considering that style. I would like something with automatic
configuration, but people may want knobs.
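
For reference, the high/low watermark behaviour I described can be sketched in
a few lines of userspace C. This is only an illustration of the hysteresis,
not the old patch and not kernel code: struct memcg_wmark, wmark_should_reclaim()
and wmark_shrink() are hypothetical names, usage is counted in abstract pages,
and "reclaim" is modelled as simply lowering the usage counter by a batch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-memcg watermark state; not an existing kernel struct. */
struct memcg_wmark {
	size_t limit;		/* hard limit of the group */
	size_t high;		/* background reclaim starts above this */
	size_t low;		/* background reclaim stops below this */
	size_t usage;		/* current usage */
	bool reclaiming;	/* remembered between calls (hysteresis) */
};

/*
 * Start reclaim when usage exceeds "high", stop once it is below "low".
 * Between the two marks the previous decision is kept.
 */
static bool wmark_should_reclaim(struct memcg_wmark *m)
{
	assert(m->low <= m->high && m->high <= m->limit);
	if (m->usage > m->high)
		m->reclaiming = true;
	else if (m->usage < m->low)
		m->reclaiming = false;
	return m->reclaiming;
}

/*
 * One wakeup of the hypothetical shrinker thread: free up to "batch" pages
 * per step until the low mark is crossed. Returns the number of pages freed.
 */
static size_t wmark_shrink(struct memcg_wmark *m, size_t batch)
{
	size_t reclaimed = 0;

	while (wmark_should_reclaim(m)) {
		size_t n = m->usage < batch ? m->usage : batch;

		if (n == 0)
			break;	/* nothing left to free, or batch is zero */
		m->usage -= n;
		reclaimed += n;
	}
	return reclaimed;
}
```

A shared kswapd-memcg thread (or a thread pool) would call something like
wmark_shrink() for each group it serves. Whether "high" and "low" are computed
automatically from "limit" or exposed as knobs is exactly the open question.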



> >
> > > In the meantime for my own series, memcg now treats dirty pages similar to
> > > lumpy reclaim. It asks flusher threads to clean pages but stalls waiting
> > > for those pages to be cleaned for a time. This is an untested patch on top
> > > of the current series.
> > >
> >
> > Wow... Doesn't this make memcg too slow?
>
> It depends heavily on how often dirty pages are being written back by direct
> reclaim. It's not ideal but stalling briefly is better than crashing.
> Ideally, the number of dirty pages encountered by direct reclaim would
> be so small that it wouldn't matter so I'm looking into that.
>
ok.

> > Anyway, memcg should kick flusher
> > threads... or something; it needs other work, too.
> >
>
> With this patch, the flusher threads get kicked when direct reclaim encounters
> pages it cannot clean.
>
Ah, I missed that. Thanks.

-Kame
