Re: Followup: [PATCH -mm] make swapin readahead skip over holes

From: Rik van Riel
Date: Tue Apr 17 2012 - 15:26:34 EST

On 04/17/2012 11:20 AM, Dan Magenheimer wrote:

> In other words, you are both presuming a "swap workload"
> that is more sequential than random for which this patch
> improves performance, and assuming a "swap device"
> for which the cost of a seek is high enough to overcome
> the costs of filling the swap cache with pages that won't
> be used.

Indeed, on spinning media the cost of seeking to
a cluster and reading one page is essentially the
same as the cost of seeking to a cluster and
reading the whole thing.
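
To put rough numbers on that claim, here is a back-of-envelope sketch. The seek times and bandwidths are illustrative assumptions, not measurements from this thread, and the 8-page cluster is only an example size.

```python
# Back-of-envelope cost model. All device numbers below are illustrative
# assumptions, not measurements from this thread.
PAGE_SIZE = 4096  # bytes per page


def read_cost_ms(pages, seek_ms, bandwidth_mb_s):
    """Time to seek once and then read `pages` contiguous pages."""
    transfer_ms = pages * PAGE_SIZE / (bandwidth_mb_s * 1e6) * 1e3
    return seek_ms + transfer_ms


def cluster_overhead(pages, seek_ms, bandwidth_mb_s):
    """Cost of reading a whole cluster relative to reading a single page."""
    whole = read_cost_ms(pages, seek_ms, bandwidth_mb_s)
    single = read_cost_ms(1, seek_ms, bandwidth_mb_s)
    return whole / single


# Hypothetical spinning disk: 8 ms seek, 100 MB/s sequential transfer.
hdd = cluster_overhead(8, seek_ms=8.0, bandwidth_mb_s=100.0)
# Hypothetical SSD: 0.05 ms access latency, 250 MB/s transfer.
ssd = cluster_overhead(8, seek_ms=0.05, bandwidth_mb_s=250.0)

print(f"8-page cluster vs 1 page: HDD {hdd:.2f}x, SSD {ssd:.2f}x")
```

With numbers like these the seek dominates on rotating media, so the extra pages are nearly free there. On a low-latency device the transfer time is a much larger share of the total, which is where the concern about SSDs comes from.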

> While it is easy to write a simple test/benchmark that
> swaps a lot (and we probably all have similar test code
> that writes data into a huge bigger-than-RAM array and then
> reads it back), such a test/benchmark is usually sequential,
> so one would assume most swap testing is done with a
> sequential-favoring workload.

Lots of programs allocate fairly large memory
objects, and access them again in the same
large chunks.

Take a look at a desktop application like a
web browser, for example.

> The kernbench workload
> apparently exercises swap quite a bit more randomly and
> your patch makes it run slower for low and high levels
> of swapping, while faster for moderate swapping.

The kernbench workload consists of a large number
of fairly small, short lived processes. I suspect
this is a very non-typical workload to run into
swap, on today's systems.

A more typical workload consists of multiple large
processes, with the working set moving from one
part of memory (now inactive) to somewhere else.

We need to maximize swap IO throughput in order to
allow the system to quickly move to the new working
set.

> I also suspect (without proof) that the patch will
> result in lower performance on non-rotating devices, such
> as SSDs.
>
> (Sure one can change the swap cluster size to 1, but how
> many users or even sysadmins know such a thing even
> exists... so the default is important.)

If the default should be changed for some systems,
that is worth doing.

How does your test run with smaller swap cluster
sizes?

Would a swap cluster of 4 or 5 be closer to optimal
for a 1GB system?
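
For anyone wanting to try smaller values, here is a minimal sketch, assuming the "swap cluster" discussed here is the vm.page-cluster sysctl. That knob stores the base-2 logarithm of the readahead window, so only power-of-two page counts can be expressed; a window of 4 pages would be page-cluster=2, and 5 pages has no exact setting. The helper names are hypothetical.

```python
from pathlib import Path

# Linux sysctl holding log2 of the swap readahead window (assumed to be
# the "swap cluster" knob discussed in this thread).
PAGE_CLUSTER_PATH = Path("/proc/sys/vm/page-cluster")


def readahead_pages(page_cluster):
    """Upper bound on pages read per swapin for a vm.page-cluster value."""
    return 1 << page_cluster


def current_page_cluster(path=PAGE_CLUSTER_PATH, default=3):
    """Read the live setting; fall back to the usual default if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return default


for value in range(4):
    print(f"vm.page-cluster={value} -> up to {readahead_pages(value)} page(s)")
print("current setting:", current_page_cluster())
```

Changing the value requires root, for example with `sysctl -w vm.page-cluster=2`.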

All rights reversed