Re: Out of Memory: Killed process 16498 (java).

From: Andrew Morton
Date: Fri Jan 20 2006 - 07:16:59 EST


Jens Axboe <axboe@xxxxxxx> wrote:
>
> On Fri, Jan 20 2006, Andrew Morton wrote:
> > Jens Axboe <axboe@xxxxxxx> wrote:
> > >
> > > On Thu, Jan 19 2006, Andrew Morton wrote:
> > > > Dave Jones <davej@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Jan 19, 2006 at 03:11:45PM -0000, Andy Chittenden wrote:
> > > > > > DMA free:20kB min:24kB low:28kB high:36kB active:0kB inactive:0kB
> > > > > > present:12740kB pages_scanned:4 all_unreclaimable? yes
> > > > >
> > > > > Note we only scanned 4 pages before we gave up.
> > > > > Larry Woodman came up with this patch below that clears all_unreclaimable
> > > > > when in two places where we've made progress at freeing up some pages
> > > > > which has helped oom situations for some of our users.
> > > >
> > > > That won't help - there are exactly zero pages on ZONE_DMA's LRU.
> > > >
> > > > The problem appears to be that all of the DMA zone has been gobbled up by
> > > > the BIO layer. It seems quite inappropriate that a modern 64-bit machine
> > > > is allocating tons of disk I/O pages from the teeny ZONE_DMA. I'm
> > > > suspecting that someone has gone and set a queue's ->bounce_gfp to the wrong
> > > > thing.
> > > >
> > > > Jens, would you have time to investigate please?
> > >
> > > Certainly, I'll get this tested and fixed this afternoon.
> >
> > Wow ;)
> >
> > You may find it's an x86_64 glitch - setting max_[low_]pfn wrong down in
> > the bowels of the arch mm init code, something like that.
> >
> > I thought it might have been a regression which came in when we added
> > ZONE_DMA32 but the RH reporter is based on 2.6.14-<redhat stuff>, and he
> > didn't have ZONE_DMA32.
>
> Sorry, spoke too soon, I thought this was the 'bio/scsi leaks' which
> most likely is a scsi leak that also results in the bios not getting
> freed.
>
> This DMA32 zone shortage looks like a vm short coming, you're likely the
> better candidate to fix that :-)

It's not ZONE_DMA32. It's the 12MB ZONE_DMA which is being exhausted on
this 4GB 64-bit machine.

Andy put a dump_stack() into the oom code and it pointed at


Call Trace:<ffffffff8014d7bc>{out_of_memory+48}
<ffffffff8014f4b0>{__alloc_pages+536}
<ffffffff80169788>{bio_alloc_bioset+232}
<ffffffff80169d03>{bio_copy_user+218}
<ffffffff801bd657>{blk_rq_map_user+136}
<ffffffff801c0008>{sg_io+328}
<ffffffff801c047c>{scsi_cmd_ioctl+491}
<ffffffff88005e22>{:ide_core:generic_ide_ioctl+631}
<ffffffff88202d0c>{:sd_mod:sd_ioctl+371}
<ffffffff802a6db6>{schedule_timeout+158}
<ffffffff801bf165>{blkdev_ioctl+1365}
<ffffffff80243cb2>{sys_sendto+251}
<ffffffff801751e5>{__pollwait+0}
<ffffffff8016b16a>{block_ioctl+25}
<ffffffff801749f4>{do_ioctl+24} <ffffffff80174c46>{vfs_ioctl+541}
<ffffffff80174cb4>{sys_ioctl+89}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/