Re: rsync: page allocation stalls in kernel 4.9.10 to a VessRAID NAS

From: Michal Hocko
Date: Tue Feb 28 2017 - 10:15:44 EST


On Tue 28-02-17 09:59:35, Robert Kudyba wrote:
>
> > On Feb 28, 2017, at 9:40 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > On Tue 28-02-17 09:33:49, Robert Kudyba wrote:
> >>
> >>> On Feb 28, 2017, at 9:15 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >>> and this one is hitting the min watermark while there is not really
> >>> much to reclaim. Only the page cache which might be pinned and not
> >>> reclaimable from this context because this is GFP_NOFS request. It is
> >>> not all that surprising the reclaim context fights to get some memory.
> >>> There is a huge amount of reclaimable slab, which is probably just
> >>> making slow progress.
> >>>
> >>> That is not something completely surprising on a 32b system, I am afraid.
> >>>
> >>> Btw. does the stall keep repeating with increasing stall times, or does
> >>> it get resolved eventually?
> >>
> >> Yes and if you mean by repeating it's not only affecting rsync but
> >> you can see just now automount and NetworkManager get these page
> >> allocation stalls and kswapd0 is getting heavy CPU load, are there any
> >> other settings I can adjust?
> >
> > None that I am aware of. You might want to talk to FS guys, maybe they
> > can figure out who is pinning file pages so that they cannot be
> > reclaimed. They do not seem to be dirty or under writeback. It would be
> > also interesting to see whether that is a regression. The warning is
> > relatively new so you might have had this problem before and just not
> > noticed it.
>
> We have been getting out of memory errors for a while but those seem
> to have gone away.

This sounds suspicious. Are you really sure that this is a new problem?
Btw. is there any reason to use 32b kernel at all? It will always suffer
from a really small lowmem...
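
To put a number on how small lowmem is on a given machine, something like
the following sketch might help. It assumes a 32b kernel built with highmem
support, where /proc/meminfo carries LowTotal and LowFree fields; on other
kernels those fields are absent and the helper returns None. The function
names and the sample values are made up for illustration only.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name:   12345 kB" style lines.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def lowmem_summary(fields):
    """Return (LowTotal, LowFree) in kB, or None if the kernel does not report them."""
    if "LowTotal" not in fields or "LowFree" not in fields:
        return None
    return fields["LowTotal"], fields["LowFree"]


# Hypothetical sample, not taken from the reporter's machine.
SAMPLE = (
    "MemTotal:        4000000 kB\n"
    "LowTotal:         800000 kB\n"
    "LowFree:           20000 kB\n"
)
```

On a live system the text would come from reading /proc/meminfo, e.g.
lowmem_summary(parse_meminfo(open("/proc/meminfo").read())).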

> We did just replace the controller in the VessRAID
> as there were some timeouts observed and multiple login/logout
> attempts.
>
> By FS guys do you mean the linux-fsdevel or linux-fsf list?

Yeah, linux-fsdevel. No idea what linux-fsf is. It would be great if you
could collect some tracepoints before reporting the issue. At least
those in events/vmscan/*.
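
A rough sketch of how the vmscan tracepoints could be switched on and the
trace buffer saved is below. It assumes tracefs is reachable under
/sys/kernel/debug/tracing (it may be mounted at /sys/kernel/tracing instead)
and that it runs as root. The helper names and the output path are made up
for illustration; this is only one possible way to do the collection.

```python
import os

# Assumption: the usual debugfs location of the tracing directory.
TRACING_ROOT = "/sys/kernel/debug/tracing"


def set_vmscan_events(enabled, tracing_root=TRACING_ROOT):
    """Write 1 or 0 to events/vmscan/enable; return the path that was written."""
    path = os.path.join(tracing_root, "events", "vmscan", "enable")
    with open(path, "w") as f:
        f.write("1" if enabled else "0")
    return path


def save_trace(out_path, tracing_root=TRACING_ROOT):
    """Copy the current contents of the 'trace' file to out_path.

    Returns the number of characters copied.
    """
    with open(os.path.join(tracing_root, "trace")) as src:
        data = src.read()
    with open(out_path, "w") as dst:
        dst.write(data)
    return len(data)
```

The idea would be to call set_vmscan_events(True), reproduce the stall (for
instance by running the rsync), call save_trace("/tmp/vmscan-trace.txt"),
and then set_vmscan_events(False) to turn the events off again.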

--
Michal Hocko
SUSE Labs