Re: page swap allocation error/failure in 2.6.25

From: Alex Samad
Date: Tue Jul 29 2008 - 05:58:54 EST


On Tue, Jul 29, 2008 at 10:14:01AM +0100, Mel Gorman wrote:
> On (29/07/08 10:06), Alex Samad didst pronounce:
> > On Mon, Jul 28, 2008 at 12:04:47PM +0200, Peter Zijlstra wrote:
> > > On Sun, 2008-07-27 at 16:07 +1000, Alex Samad wrote:
> > > > On Fri, Jul 25, 2008 at 09:40:01AM +0200, Peter Zijlstra wrote:
> > > > > On Fri, 2008-07-25 at 17:20 +1000, Alex Samad wrote:
> > > > > > Hi
> > > >
> > > > [snip]
> > > >
> > > > >
> > > > >
> > > > > Its harmless if it happens sporadically.
> > > > >
> > > > > Atomic order 2 allocations are just bound to go wrong under pressure.
> > > > can you point me to any doco that explains this ?
> > >
> > > An order 2 allocation means allocating 1<<2 or 4 physically contiguous
> > > pages. Atomic allocation means not being able to sleep.
> > >
> > > Now if the free page lists don't have any order 2 pages available due to
> > > fragmentation there is currently nothing we can do about it.
> >
> > Strange cause I don't normal have a high swap usage, I have 2G ram and
> > 2G swap space. There is not that much memory being used squid, apache is
> > about it.
> >
>
> The problem is related to fragmentation. Look at /proc/buddinfo and
> you'll see how many pages are free at each order. Now, the system can
> deal with fragmentation to some extent but it requires the caller to be
> able to perform IO, enter the FS and sleep.
>
> An atomic allocation can do none of those. High-order atomic allocations
> are almost always due to a network card using a large MTU that cannot

I definitely use higher mtu on my network

> receive a packet into many page-sized buffers. Their requirement of
> high-order atomic allocations is fragile as a result.
>
> You *may* be able to "hide" this by increasing min_free_kbytes as this
> will wake kswapd earlier. If the waker of kswapd had requested a high-order
> buffer then kswapd will reclaim at that order as well. However, there are
> timing issues involved (e.g. the network receive needs to enter the path
> that wakes kswapd) and it could have been improved upon.
>
> > > I've been meaning to try and play with 'atomic' page migration to try
> > > and assemble a higher order page on demand with something like memory
> > > compaction.
> > >
> > > But its never managed to get high enough on the todo list..
> > >
>
> Same here. I prototyped memory compaction a while back and the feeling at
> the time was that it could be made atomic with a bit of work but I never got
> around to pushing it further. Part of this was my feeling that any attempt
> to make high-order atomic allocations more reliable would be frowned upon
> as encouraging bad behaviour from device driver authors.
>
> --
> Mel Gorman
> Part-time Phd Student Linux Technology Center
> University of Limerick IBM Dublin Software Lab
>

--
Disks travel in packs.

Attachment: signature.asc
Description: Digital signature