Re: Terrible disk performance when files cached > 4GB

From: Minchan Kim
Date: Fri Apr 15 2016 - 09:56:31 EST


On Fri, Apr 15, 2016 at 10:20:33AM +0100, Colum Paget wrote:
> Hi all,
>
> I suspect that many people will have reported this, but I thought I'd drop you
> a line just in case everyone figures someone else has reported it. It's
> possible we're just doing something wrong and so encountering this problem,
> but I can't find anyone saying they've found a solution, and the problem
> doesn't seem to be present in 3.x kernels, which makes us think it could be a
> bug.
>
> We are seeing a problem in 4.4.5 and 4.4.6 32-bit 'hugemem' kernels running on
> machines with more than 4 GB of RAM. The problem results in disk performance
> dropping from 120 MB/s to 1 MB/s or even less. 3.18.x 32-bit kernels do not seem to
> exhibit this behaviour, or at least we can't make it happen reliably. We've
> tried 3.14.65 and 3.14.65 and they don't exhibit the same degree of problem.
> We've not yet been able to test 64-bit kernels; it will be a while before we
> can. We've been able to reproduce the problem on multiple machines with
> different hardware configs, and with different kernel configs as regards
> SMP, NUMA support and transparent hugepages.
>
> This problem can be reproduced thusly:
>
> Unpack/transfer a *large* number of files onto disk. As they unpack one can
> monitor the amount of memory being used for file caching with 'free'. Disk
> transfer speeds can be tested by 'dd'-ing a large file locally. Initially the
> transfer rate for this file will be over 100 MB/s. However, when the amount of
> cached memory exceeds some figure (this was 4GB on some systems, 10GB on
> others) disk performance will start to dramatically degrade. Very swiftly the
> disks become unusable.
>
> On some machines this situation can be recovered by:
>
> echo 3 > /proc/sys/vm/drop_caches
>
> However, we've seen some cases where even this doesn't seem to help, and the
> machine has to be rebooted.
>
> We believe the problem is that the memory cache gets so big that searching
> through it becomes slower than reading files directly off disk. One problem
> with this theory is that we're always copying the same file over and over in
> our tests, so the file is unlikely to be a 'cache miss'. Personally I would
> have expected performance to only be bad for cache misses, but it's bad for
> everything, so maybe our theory is wrong.
>
> For our purposes, we're fine running with 3.14.x series kernels, but I thought
> I should let you know.
>
> regards,
>
> Colum
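
To correlate the slowdown with the page cache size, the reproduction steps
above could be scripted roughly as follows. This is only a sketch: the
function name cached_mib, the RUN_PROBE and SCRATCH variables, the 256 MB
write size and the 30 second interval are all illustrative choices, and it
assumes GNU dd and a Linux /proc/meminfo.

```shell
#!/bin/sh
# Print the page cache size in MiB from a meminfo-style file.
# Defaults to /proc/meminfo; a path argument is accepted so the
# parser can be checked against a saved copy.
cached_mib() {
    awk '/^Cached:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Optional probe loop, only run when RUN_PROBE=1 is set (hypothetical knob).
# It writes a scratch file with dd and logs dd's summary line (which
# carries the throughput) next to the current cache size.
if [ "${RUN_PROBE:-0}" = 1 ]; then
    while :; do
        rate=$(dd if=/dev/zero of="${SCRATCH:-/tmp/ddprobe.bin}" \
                  bs=1M count=256 conv=fsync 2>&1 | tail -n 1)
        echo "$(date +%T) cached=$(cached_mib)MiB dd: $rate"
        sleep 30
    done
fi
```

Running it with RUN_PROBE=1 while the files are being unpacked should show
at what cache size the dd rate starts to fall off.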

Did you see this patch?

https://lkml.org/lkml/2016/4/3/237

It fixes a bug introduced by commit 6b4f7799c6a5 ("mm: vmscan: invoke slab
shrinkers from shrink_zone()"), which was merged in v3.19. In other words,
kernels up to and including 3.18 are not affected.
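
As a rough way to tell which machines fall on which side of that boundary,
something like the following could be used. It is a sketch only: the
function name has_shrinker_change is made up for illustration, and it looks
at nothing but the release string, so it cannot tell whether a given stable
or distro kernel already carries the fix.

```shell
#!/bin/sh
# Succeed if the kernel release (e.g. "4.4.6-hugemem") is 3.19 or newer,
# i.e. new enough to contain commit 6b4f7799c6a5.
# Defaults to the running kernel's release from uname -r.
has_shrinker_change() {
    rel=${1:-$(uname -r)}
    major=${rel%%.*}            # text before the first dot
    rest=${rel#*.}              # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of what follows
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "${minor:-0}" -ge 19 ]; }
}

if has_shrinker_change; then
    echo "running kernel is 3.19 or newer: contains 6b4f7799c6a5"
else
    echo "running kernel predates 3.19: does not contain 6b4f7799c6a5"
fi
```

With the versions from the report, 4.4.5 and 4.4.6 land in the first branch
and 3.14.65 and 3.18.x in the second.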

Thanks.