Re: memory handling in pre5/pre6

From: Andrea Arcangeli (andrea@suse.de)
Date: Tue Apr 18 2000 - 19:32:04 EST


On Tue, 18 Apr 2000, Rik van Riel wrote:

>Please read Stephen Tweedie's mail to you. There is a clear bug
>in truncate_inode_pages() and this patch should solve it. Maybe
>this is the wrong fix, I'm open to suggestions on that, but
>simply asserting that there is no bug (when there clearly is)
>isn't going to do any good.

I'm looking into it.

>[..] the new and the old
>patch are only very subtly different. [..]

Subtle things make huge differences.

>[..] Dismissing my new patch because of
>a mistake in the old patch isn't going to get us a better 2.4
>kernel...

I wasn't dismissing your new patch because of the previous one. I
commented the new patch separately. You were the one that was mixing the
talking of the two patches. I want and wanted to consider them separately
as far I can tell.

>> The lru should be global. It make no sense to keep in 16mbyte of cache
>> very obsolete stuff. The fix is to put the lru into the NUMA structure
>> instead of in the zone. That's another thing I want to fix.
>
>You must be the only one who wants this. Linus has already said
>that he wants to make memory management a per-zone thing...

Ok perfect, so first free from the DMA zone and stop if you freed enough
memory from there. Now, what we do is broken.

However it's not obvious to me the right thing to do is to penalize the
cache in low memory because of the DMA reliability issues. I see instead
why we do _want_ to have a per-NUMA-node page LRU (I never complained
about that).

>> > dispose = &zone->lru_cache;
>> >+ /* the page is in use, we can't free it now */
>> >+ if (!page->buffers && page_count(page) > 1)
>> >+ goto dispose_continue;
>>
>> you put busy pages into the lru and you'll lockup as soon as
>> only busy pages will remains into the lru.
>
>You seem to forget that we only see stuff that's already in the
>LRU cache. We don't "put anything in", we simply put these pages
>back where they came from. This is a very surprising comment
>from you since you're the guy who wrote the original LRU code
>in the first place...

Be quite about that, I see well what you're doing and that's the same
thing I'm doing in the page-was-referenced path. What you're doing is:

        if ((entry = lru->prev))
        {
                unlink(entry);
                page = entry_to_page(entry);
                if (page_busy(page)) {
                        relink(entry);
                        continue;
                }
        }

and it's design very near to deadlock. Actually you're also decreasing the
`count` before relinking and so you won't lockup ;), but the code is still
wrong and you are missing enterely the point of the local dispose lists.

The reason that we wasn't using that design is that we don't want a
parallel shrink_mmap to waste time on pages that we just found busy as we
don't want to waste time on the busy page for a second time if somebody
grow the lru cache from under us.

If you see the only place where I'm re-inserting pages into the lru
immediatly is when I roll the pages for the reference bit. And that's not
cpu-waste prone because the second time I'll process the page for real and
the page will be kept out of the lru later.

Remeber the lru will change from under you (can grow and be shrunk in
parlallel so you can't rely on the count to avoid processing the same page
twice).

>Nope. The "old" list is inserted at the back of the LRU list,
>so it'll be scanned *last* on the next call to shrink_mmap().

With your patch they would be processed _first_ on the next call of
shrink_mmap instead! I would have called such dispose list as "young" and
not "old" if they would been processed last ;). If you were right about
that the current page-aging handling would be badly broken indeed.

>This is exactly how LRU is supposed to work and a number of
>people have already seen quite a performance gain on their
>system after having applied my new patch.

Mind to show some number?

Andrea

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Apr 23 2000 - 21:00:14 EST