Re: VM in v2.4.0test9

From: Rik van Riel (riel@conectiva.com.br)
Date: Wed Oct 04 2000 - 15:02:31 EST


On Wed, 4 Oct 2000, Roger Larsson wrote:
> Rik van Riel wrote:
> > On Wed, 4 Oct 2000, Rik van Riel wrote:
> >
> > > > > First, you have MORE free memory than freepages.high. In this
> > > > > case I really don't see why __alloc_pages() wouldn't give the
> > > > > memory to your processes ....
> > > >
> > > > Hmm...
> > > > Can't it be a zone problem?
> > > > Free pages is the total free - all zones.
> > > > But suppose you want a page from a specific zone - DMA, the more
> > > > memory you have the less likely that you have a DMA page free...
> > > > Does all test take this into consideration?
> > >
> > > Free_shortage() /should/ take this into consideration, and unless
> > > I'm wrong, it does ;)
> >
> > Also, a zone problem CANNOT cause the problem in
> > David's 16MB test ...
> >
> > (this is getting stranger and stranger)
>
> Now I know from where the 125 pages limit comes from.
> static int zone_balance_ratio[MAX_NR_ZONES] = { 32, 128, 128, };
> 16M/4k/32 = 125
>
> Probably there is a mismatch between zone->free_pages and
> free_pages.{min,low,high}

Arghhhhhhhhhhhhhhhhhhhhhhhhh....

The potential for this bug has been around since 2.3.51, when
different balance_ratios for different zones became possible.

The bug hasn't bitten us until now because 1) the balance
ratio for ZONE_DMA wasn't changed until some time later and
2) we didn't use freepages.{min,low,high}.

Now that we /are/ using the values in the freepages array again,
we're running into the very big problem that freepages.high has
a value LOWER than zone->pages_min for the DMA zone, on 16MB
machines. This caused David's system to behave the way it did.

Does the attached patch fix the problem?

regards,

Rik

--
"What you're running that piece of shit Gnome?!?!"
       -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/ http://www.surriel.com/

--- mm/page_alloc.c.orig	Tue Oct  3 10:20:41 2000
+++ mm/page_alloc.c	Wed Oct  4 17:01:52 2000
@@ -795,21 +795,6 @@
 	printk("On node %d totalpages: %lu\n", nid, realtotalpages);
-	/*
-	 * Select nr of pages we try to keep free for important stuff
-	 * with a minimum of 10 pages and a maximum of 256 pages, so
-	 * that we don't waste too much memory on large systems.
-	 * This is fairly arbitrary, but based on some behaviour
-	 * analysis.
-	 */
-	i = realtotalpages >> 7;
-	if (i < 10)
-		i = 10;
-	if (i > 256)
-		i = 256;
-	freepages.min += i;
-	freepages.low += i * 2;
-	freepages.high += i * 3;
 	memlist_init(&active_list);
 	memlist_init(&inactive_dirty_list);
@@ -875,6 +860,20 @@
 		zone->pages_min = mask;
 		zone->pages_low = mask*2;
 		zone->pages_high = mask*3;
+		/*
+		 * Add these free targets to the global free target;
+		 * we have to be SURE that freepages.high is higher
+		 * than SUM [zone->pages_min] for all zones, otherwise
+		 * we may have bad bad problems.
+		 *
+		 * This means we cannot make the freepages array writable
+		 * in /proc, but have to add a separate extra_free_target
+		 * for people who require it to catch load spikes in eg.
+		 * gigabit ethernet routing...
+		 */
+		freepages.min += mask;
+		freepages.low += mask*2;
+		freepages.high += mask*3;
 		zone->zone_mem_map = mem_map + offset;
 		zone->zone_start_mapnr = offset;
 		zone->zone_start_paddr = zone_start_paddr;
