Re: memory handling in pre5/pre6

From: Rik van Riel (riel@conectiva.com.br)
Date: Wed Apr 19 2000 - 10:40:25 EST

Next message: Jeremy Fitzhardinge: "Re: An nfs/automount Oops"
Previous message: Linda Walsh: "Negative Reserved Return values (was Re: Proposal "LUID&Sess...)"
In reply to: Andrea Arcangeli: "Re: memory handling in pre5/pre6"
Next in thread: Andrea Arcangeli: "Re: memory handling in pre5/pre6"
Reply: Andrea Arcangeli: "Re: memory handling in pre5/pre6"

On Wed, 19 Apr 2000, Andrea Arcangeli wrote:
> On Tue, 18 Apr 2000, Rik van Riel wrote:

> >> The lru should be global. It makes no sense to keep very obsolete
> >> stuff in 16 MB of cache. The fix is to put the lru into the NUMA
> >> structure instead of in the zone. That's another thing I want to fix.
> >
> >You must be the only one who wants this. Linus has already said
> >that he wants to make memory management a per-zone thing...
>
> Ok perfect, so first free from the DMA zone and stop if you
> freed enough memory from there. Now, what we do is broken.

I've seen Stephen's mail on this and indeed, you're right. We
need to do LRU reclamation on a global basis.

The idea I have for this is that we have a *new* LRU queue,
one which holds only unmapped pages which can be freed in the
blink of an eye. When a page is referenced from that queue we
put it back in the shrink_mmap() lru queue and when the page is
needed for something else, we take it from the front of the
scavenge queue.

The size of the scavenge queue can be varied dynamically depending
on the ratio between soft-reclaims and pages which are scavenged for
another purpose. I'll get to work on this ASAP.
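
A minimal sketch of the scavenge queue idea described above, not actual
kernel code. All names here (struct vm, vm_deactivate, vm_touch,
vm_scavenge, vm_retarget) are hypothetical. Two points are assumptions of
the sketch rather than anything stated in this message: a soft-reclaimed
page goes to the back of the main lru, and the resize rule shrinks the
target when soft-reclaims outnumber scavenged pages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structures for illustration; not the kernel's own. */
struct page {
    struct page *prev, *next;
    int on_scavenge;            /* 1 while parked on the scavenge queue */
};

struct queue {
    struct page head;           /* sentinel: head.next is the front */
    size_t len;
};

static void queue_init(struct queue *q)
{
    q->head.prev = q->head.next = &q->head;
    q->len = 0;
}

static void queue_add_tail(struct queue *q, struct page *p)
{
    p->prev = q->head.prev;
    p->next = &q->head;
    q->head.prev->next = p;
    q->head.prev = p;
    q->len++;
}

static void queue_del(struct queue *q, struct page *p)
{
    p->prev->next = p->next;
    p->next->prev = p->prev;
    p->prev = p->next = NULL;
    q->len--;
}

struct vm {
    struct queue lru;           /* the queue shrink_mmap() would scan */
    struct queue scavenge;      /* unmapped pages, freeable at once */
    size_t soft_reclaims;       /* pages referenced while on scavenge */
    size_t scavenged;           /* pages taken for another purpose */
    size_t target;              /* desired scavenge queue length */
};

static void vm_init(struct vm *vm, size_t target)
{
    queue_init(&vm->lru);
    queue_init(&vm->scavenge);
    vm->soft_reclaims = vm->scavenged = 0;
    vm->target = target;
}

/* Move an unmapped page from the main lru to the back of the scavenge queue. */
static void vm_deactivate(struct vm *vm, struct page *p)
{
    assert(!p->on_scavenge);
    queue_del(&vm->lru, p);
    queue_add_tail(&vm->scavenge, p);
    p->on_scavenge = 1;
}

/* Page referenced while on the scavenge queue: put it back on the lru. */
static void vm_touch(struct vm *vm, struct page *p)
{
    if (!p->on_scavenge)
        return;
    queue_del(&vm->scavenge, p);
    queue_add_tail(&vm->lru, p);    /* placement is an assumption */
    p->on_scavenge = 0;
    vm->soft_reclaims++;
}

/* Page needed for something else: take it from the front of the queue. */
static struct page *vm_scavenge(struct vm *vm)
{
    struct page *p;

    if (vm->scavenge.len == 0)
        return NULL;
    p = vm->scavenge.head.next;
    queue_del(&vm->scavenge, p);
    p->on_scavenge = 0;
    vm->scavenged++;
    return p;
}

/* Resize rule; the direction is an assumption of this sketch. */
static void vm_retarget(struct vm *vm)
{
    if (vm->soft_reclaims > vm->scavenged) {
        if (vm->target > 1)
            vm->target--;       /* pages were parked too eagerly */
    } else {
        vm->target++;
    }
    vm->soft_reclaims = vm->scavenged = 0;  /* new sampling window */
}
```

A caller would put pages on vm.lru with queue_add_tail(), call
vm_deactivate() from the scanning path, and call vm_retarget()
periodically to adapt the queue size to the observed ratio.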

> Be quiet about that, I see well what you're doing and that's the same
> thing I'm doing in the page-was-referenced path. What you're doing is:
>
>	if ((entry = lru->prev))
>	{
>		unlink(entry);
>		page = entry_to_page(entry);
>		if (page_busy(page)) {
>			relink(entry);
>			continue;
>		}
>	}
>
> and its design is very near to deadlock. Actually you're also decreasing
> the `count` before relinking and so you won't lock up ;), but the code is
> still wrong and you are missing entirely the point of the local dispose lists.

Indeed, you're right here.
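
For reference, a rough sketch of the local dispose list approach, as
opposed to relinking a busy page straight back into the lru. This is not
the shrink_mmap() code; struct node, scan() and the list helpers are
hypothetical. Splicing the dispose list back at the head of the lru is an
assumption of the sketch and makes no claim about where the kernel should
put those pages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node standing in for a page on the lru. */
struct node {
    struct node *prev, *next;
    int busy;       /* page cannot be freed right now */
    int visits;     /* how often the scanner looked at this page */
};

static void list_init(struct node *h) { h->prev = h->next = h; }
static int list_empty(const struct node *h) { return h->next == h; }

static void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Insert n right after the list head h. */
static void list_add(struct node *n, struct node *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

/* Move every entry of "from" to the head of h and leave "from" empty. */
static void list_splice(struct node *from, struct node *h)
{
    if (list_empty(from))
        return;
    from->next->prev = h;
    from->prev->next = h->next;
    h->next->prev = from->prev;
    h->next = from->next;
    list_init(from);
}

/*
 * Scan from the tail of the lru. Busy pages are parked on a local
 * dispose list, so this scan never meets them a second time. They are
 * put back only after the loop has finished. Returns the pages freed.
 */
static int scan(struct node *lru, int count)
{
    struct node dispose;
    int freed = 0;

    assert(lru != NULL);
    list_init(&dispose);
    while (count-- > 0 && !list_empty(lru)) {
        struct node *entry = lru->prev;

        list_del(entry);
        entry->visits++;
        if (entry->busy) {
            list_add(entry, &dispose);
            continue;
        }
        freed++;    /* a real scanner would free the page here */
    }
    list_splice(&dispose, lru);     /* placement is an assumption */
    return freed;
}
```

With three busy pages and a count of 10, this loop looks at each page
once and stops when the lru is empty. It does not spin on the same
pages until count runs out.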

> The reason that we weren't using that design is that we don't
> want a parallel shrink_mmap to waste time on pages that we just
> found busy, as we don't want to waste time on the busy page for a
> second time if somebody grows the lru cache from under us.

Eurmmm, sorry, but the original code was doing something far worse
than this. It put *referenced* pages at the front of the LRU queue,
pages from the wrong zone in the old list and busy and locked pages
in the young list ... OOPS ;)

> If you look, the only place where I'm re-inserting pages into the
> lru immediately is when I roll the pages for the reference bit.
> And that's not cpu-waste prone because the second time I'll
> process the page for real and the page will be kept out of the
> lru later.

Which is wrong, because that way you'll steal the referenced bit
too fast and make it practically meaningless. The referenced bit
is there to indicate that the page was recently used and that we
should try to free other pages the next time.
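
A small sketch of the referenced-bit semantics argued for above: a scan
visits each page at most once, and a referenced page only loses its bit
in that scan, so it can be freed on a later scan at the earliest. The
names struct page and scan_pass() are hypothetical and the array stands
in for the lru; this is not the shrink_mmap() implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page descriptor for illustration only. */
struct page {
    int referenced;     /* set when the page is used */
    int freed;          /* set once the scanner has reclaimed it */
};

/*
 * One scan over the pages, visiting each page at most once and stopping
 * after "want" pages are freed. A referenced page gets its bit cleared
 * and is skipped for the rest of this scan. Returns the pages freed.
 */
static size_t scan_pass(struct page *pages, size_t n, size_t want)
{
    size_t i, freed = 0;

    assert(pages != NULL || n == 0);
    for (i = 0; i < n && freed < want; i++) {
        struct page *p = &pages[i];

        if (p->freed)
            continue;
        if (p->referenced) {
            p->referenced = 0;  /* spared now, judged again next scan */
            continue;
        }
        p->freed = 1;
        freed++;
    }
    return freed;
}
```

If the scanner came back to the same page within one scan, the bit it
had just cleared would make the page look idle and the page would be
freed no matter how recently it was used.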

> >Nope. The "old" list is inserted at the back of the LRU list,
> >so it'll be scanned *last* on the next call to shrink_mmap().
>
> With your patch they would be processed _first_ on the next call
> of shrink_mmap instead! I would have called such a dispose list
> "young" and not "old" if they would have been processed last ;). If
> you were right about that, the current page-aging handling would
> be badly broken indeed.

After this comment I looked at the code again and lo and behold,
we were *both* wrong ;) (the original code put pages in the wrong
place and my code did the same, only with different pages in a
different wrong place)

I'll be triple-checking the code before I send a new patch,
sometime later today ;)

regards,

Rik

-- The Internet is not a network of computers. It is a network of people. That is its real strength.

Wanna talk about the kernel? irc.openprojects.net / #kernelnewbies http://www.conectiva.com/ http://www.surriel.com/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/


This archive was generated by hypermail 2b29 : Sun Apr 23 2000 - 21:00:15 EST