rsync: page allocation stalls in kernel 4.9.10 to a VessRAID NAS

From: Robert Kudyba
Date: Mon Feb 27 2017 - 17:10:37 EST


This sounds very similar to David Arendtâs problem from December 2016 on 12/13 post numbered 529 in the URL with the title page allocation stall in kernel 4.9 when copying files from one btrfs hdd to another

This happens about 2 hours after our daily rsnapshot/rsync cron job starts pretty much every time. And I set these parameters in /etc/sysctl.conf on a Fedora 25 server:
net.ipv4.tcp_window_scaling=0
net.ipv4.tcp_timestamps=0
net.ipv4.tcp_tw_recycle=0
vm.zone_reclaim_mode=1
vm.min_free_kbytes=131072
# write-cache, foreground/background flushing
vm.dirty_ratio = 3
# vm.dirty_background_ratio = 5 (% of RAM)
vm.dirty_background_ratio = 1
# make it 10 sec
vm.dirty_writeback_centisecs = 1000

Feb 27 04:36:54 curie kernel: rsync: page allocation stalls for 10699ms, order:0, mode:0x2420848(GFP_NOFS|__GFP_NOFAIL|__GFP_HARDWALL|__GFP_MOVABLE)
Feb 27 04:36:54 curie kernel: CPU: 2 PID: 32649 Comm: rsync Tainted: G L 4.9.10-200.fc25.i686+PAE #1
Feb 27 04:36:54 curie kernel: Hardware name: Dell Computer Corporation PowerEdge 1850/0D8266, BIOS A07 04/25/2008
Feb 27 04:36:54 curie kernel: ec725bac cd76bd10 cdd774bc 00000000 ec725bdc cd584b96 cdd7bc69 02420848
Feb 27 04:36:54 curie kernel: ec725be4 ec725bec cdd774bc ec725bc0 a0988a17 cdef9c48 ffffd635 00000000
Feb 27 04:36:54 curie kernel: ec725ca4 cd585686 02420848 cdd774bc 000029cb 00000000 00000010 00000050
Feb 27 04:36:54 curie kernel: Call Trace:
Feb 27 04:36:54 curie kernel: [<cd76bd10>] dump_stack+0x58/0x78
Feb 27 04:36:54 curie kernel: [<cd584b96>] warn_alloc+0xf6/0x110
Feb 27 04:36:54 curie kernel: [<cd585686>] __alloc_pages_nodemask+0xa66/0xd30
Feb 27 04:36:54 curie kernel: [<cd57c212>] ? find_get_entry+0x22/0x160
Feb 27 04:36:54 curie kernel: [<cd57cb92>] pagecache_get_page+0xc2/0x2d0
Feb 27 04:36:54 curie kernel: [<cd61f3a9>] __getblk_gfp+0x109/0x330
Feb 27 04:36:54 curie kernel: [<cd673a73>] ext4_getblk+0xa3/0x1b0
Feb 27 04:36:54 curie kernel: [<cd673ba3>] ext4_bread+0x23/0xb0
Feb 27 04:36:54 curie kernel: [<cd67dc97>] __ext4_read_dirblock+0x27/0x3a0
Feb 27 04:36:54 curie kernel: [<cd5f50f6>] ? lookup_fast+0x46/0x2d0
Feb 27 04:36:54 curie kernel: [<cd67e652>] htree_dirblock_to_tree+0x52/0x260
Feb 27 04:36:54 curie kernel: [<cd60a690>] ? mntput+0x20/0x40
Feb 27 04:36:54 curie kernel: [<cd5f431f>] ? terminate_walk+0x8f/0x100
Feb 27 04:36:54 curie kernel: [<cd67f730>] ext4_htree_fill_tree+0x90/0x2e0
Feb 27 04:36:54 curie kernel: [<cd5f9aa4>] ? putname+0x54/0x60
Feb 27 04:36:54 curie kernel: [<cd66b9d1>] ? ext4_readdir+0x9d1/0xa60
Feb 27 04:36:54 curie kernel: [<cd66b9d1>] ? ext4_readdir+0x9d1/0xa60
Feb 27 04:36:54 curie kernel: [<cd66ab46>] ? free_rb_tree_fname+0x16/0x70
Feb 27 04:36:54 curie kernel: [<cd66b7a4>] ext4_readdir+0x7a4/0xa60
Feb 27 04:36:54 curie kernel: [<cd77a090>] ? _copy_to_user+0x40/0x50
Feb 27 04:36:54 curie kernel: [<cd6ee4bf>] ? security_file_permission+0x9f/0xc0
Feb 27 04:36:54 curie kernel: [<cd5fe259>] iterate_dir+0x179/0x1a0
Feb 27 04:36:54 curie kernel: [<cd5f00d0>] ? SyS_fstat64+0x30/0x50
Feb 27 04:36:54 curie kernel: [<cd5fe83b>] SyS_getdents64+0x7b/0x100
Feb 27 04:36:54 curie kernel: [<cd5fe280>] ? iterate_dir+0x1a0/0x1a0
Feb 27 04:36:54 curie kernel: [<cd40378a>] do_fast_syscall_32+0x8a/0x150
Feb 27 04:36:54 curie kernel: [<cdb723ca>] sysenter_past_esp+0x47/0x75
Feb 27 04:36:54 curie kernel: Mem-Info:
Feb 27 04:36:55 curie kernel: active_anon:2091 inactive_anon:1298 isolated_anon:0#012 active_file:268004 inactive_file:1386276 isolated_file:0#012 unevictable:1986 dirty:0 writeback:0 unstable:0#012 slab_reclaimable:144934 slab_unreclaimable:7392#012 mapped:14331 shmem:170 pagetables:460 bounce:0#012 free:252753 free_pcp:470 free_cma:0
Feb 27 04:36:55 curie kernel: Node 0 active_anon:8364kB inactive_anon:5192kB active_file:1072016kB inactive_file:5545104kB unevictable:7944kB isolated(anon):0kB isolated(file):0kB mapped:57324kB dirty:0kB writeback:0kB shmem:680kB writeback_tmp:0kB unstable:0kB pages_scanned:171672794 all_unreclaimable? yes
Feb 27 04:36:55 curie kernel: DMA free:5584kB min:2580kB low:3224kB high:3868kB active_anon:0kB inactive_anon:0kB active_file:148kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15916kB mlocked:0kB slab_reclaimable:10032kB slab_unreclaimable:120kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Feb 27 04:36:55 curie kernel: lowmem_reserve[]: 0 751 8055 8055
Feb 27 04:36:55 curie kernel: Normal free:128408kB min:128488kB low:160608kB high:192728kB active_anon:0kB inactive_anon:0kB active_file:44340kB inactive_file:4720kB unevictable:0kB writepending:0kB present:892920kB managed:791608kB mlocked:0kB slab_reclaimable:569704kB slab_unreclaimable:29448kB kernel_stack:1168kB pagetables:0kB bounce:0kB free_pcp:1568kB local_pcp:236kB free_cma:0kB
Feb 27 04:36:55 curie kernel: lowmem_reserve[]: 0 0 58430 58430
Feb 27 04:36:55 curie kernel: HighMem free:877020kB min:512kB low:303996kB high:607480kB active_anon:8364kB inactive_anon:5192kB active_file:1027528kB inactive_file:5540384kB unevictable:7944kB writepending:0kB present:7479040kB managed:7479040kB mlocked:7944kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1840kB bounce:0kB free_pcp:312kB local_pcp:196kB free_cma:0kB
Feb 27 04:36:55 curie kernel: lowmem_reserve[]: 0 0 0 0
Feb 27 04:36:55 curie kernel: DMA: 16*4kB (ME) 26*8kB (UME) 8*16kB (UE) 2*32kB (ME) 2*64kB (UM) 1*128kB (U) 3*256kB (UME) 2*512kB (UM) 3*1024kB (UME) 0*2048kB 0*4096kB = 5584kB
Feb 27 04:36:55 curie kernel: Normal: 556*4kB (UME) 1269*8kB (UEH) 568*16kB (UE) 362*32kB (UME) 116*64kB (UMEH) 23*128kB (UMEH) 66*256kB (MEH) 69*512kB (ME) 22*1024kB (ME) 5*2048kB (M) 0*4096kB = 128408kB
Feb 27 04:36:55 curie kernel: HighMem: 6301*4kB (UM) 3861*8kB (UM) 1616*16kB (UM) 1468*32kB (UM) 247*64kB (UM) 237*128kB (UM) 462*256kB (UM) 476*512kB (UM) 228*1024kB (UM) 6*2048kB (UM) 23*4096kB (UM) = 877020kB
Feb 27 04:36:55 curie kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Feb 27 04:36:55 curie kernel: 1656105 total pagecache pages
Feb 27 04:36:55 curie kernel: 81 pages in swap cache
Feb 27 04:36:55 curie kernel: Swap cache stats: add 6469, delete 6388, find 28828/29500
Feb 27 04:36:55 curie kernel: Free swap = 8352872kB
Feb 27 04:36:55 curie kernel: Total swap = 8355836kB
Feb 27 04:36:55 curie kernel: 2096989 pages RAM
Feb 27 04:36:55 curie kernel: 1869760 pages HighMem/MovableOnly
Feb 27 04:36:55 curie kernel: 25348 pages reserved
Feb 27 04:36:55 curie kernel: 0 pages hwpoisoned