Re: [PATCH v2 3/3] percpu: improve allocation success rate for non-GFP_KERNEL callers

From: Tahsin Erdogan
Date: Mon Feb 27 2017 - 15:28:10 EST


Hi Tejun,

On Mon, Feb 27, 2017 at 11:51 AM, Tejun Heo <tj@xxxxxxxxxx> wrote:
>> __vmalloc+0x45/0x50
>> pcpu_mem_zalloc+0x50/0x80
>> pcpu_populate_chunk+0x3b/0x380
>> pcpu_alloc+0x588/0x6e0
>> __alloc_percpu_gfp+0xd/0x10
>> __percpu_counter_init+0x55/0xc0
>> blkg_alloc+0x76/0x230
>> blkg_create+0x489/0x670
>> blkg_lookup_create+0x9a/0x230
>> generic_make_request_checks+0x7dd/0x890
>> generic_make_request+0x1f/0x180
>> submit_bio+0x61/0x120
>
> As indicated by GFP_NOWAIT | __GFP_NOWARN, it's okay to fail there.
> It's not okay to fail consistently for a long time but it's not a big
> issue to fail occassionally even if somewhat bunched up. The only bad
> side effect of that is temporary misaccounting of some IOs, which
> shouldn't be noticeable outside of pathological cases. If you're
> actually seeing adverse effects of this, I'd love to learn about it.

A better example is the call path below:

pcpu_alloc+0x68f/0x710
__alloc_percpu_gfp+0xd/0x10
__percpu_counter_init+0x55/0xc0
cfq_pd_alloc+0x3b2/0x4e0
blkg_alloc+0x187/0x230
blkg_create+0x489/0x670
blkg_lookup_create+0x9a/0x230
blkg_conf_prep+0x1fb/0x240
__cfqg_set_weight_device.isra.105+0x5c/0x180
cfq_set_weight_on_dfl+0x69/0xc0
cgroup_file_write+0x39/0x1c0
kernfs_fop_write+0x13f/0x1d0
__vfs_write+0x23/0x120
vfs_write+0xc2/0x1f0
SyS_write+0x44/0xb0
entry_SYSCALL_64_fastpath+0x18/0xad

A failure in this call path gives grief to tools which are trying to
configure io
weights. We see occasional failures happen here shortly after reboots even
when system is not under any memory pressure. Machines with a lot of cpus
are obviously more vulnerable.