Re: Race in new page migration code?

From: Christoph Lameter
Date: Sat Jan 14 2006 - 13:01:46 EST


On Sat, 14 Jan 2006, Nick Piggin wrote:

> I'm fairly sure there is a race in the page migration code
> due to your not taking a reference on the page. Taking the
> reference also can make things less convoluted.

We do take a reference on the page:

/*
 * Isolate one page from the LRU lists.
 *
 * - zone->lru_lock must be held
 */
static inline int __isolate_lru_page(struct page *page)
{
	if (unlikely(!TestClearPageLRU(page)))
		return 0;

>>>>	if (get_page_testone(page)) {
		/*
		 * It is being freed elsewhere
		 */
		__put_page(page);
		SetPageLRU(page);
		return -ENOENT;
	}

	return 1;
}


> Also, an unsuccessful isolation attempt does not mean something is
> wrong. I removed the WARN_ON, but you should probably be retrying
> on this level if you are really interested in migrating all pages.

It depends on what you mean by an unsuccessful isolation attempt. One reason
for not being successful is that the page has been freed. That's okay.

The other is that the page is not on the LRU, and is not being moved
back to the LRU by draining the lru caches. In that case we need to
have a WARN_ON at least for now. There may be other reasons that a page
is not on the LRU but I would like to be careful about that at first.

It's not an error, but it is something of concern, hence the WARN_ON.
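
To make the three outcomes concrete, here is a rough userspace model of the
isolation logic quoted above. It is only a sketch under stated assumptions:
struct model_page, model_get_page_testone(), model_isolate_lru_page() and
isolation_needs_warning() are hypothetical names invented for illustration,
the refcount is a plain integer where 0 means the page is being freed (the
kernel's internal representation may differ), and there is no locking. It
also assumes the LRU caches have already been drained before the result is
classified.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct model_page {
	bool on_lru;	/* models the PageLRU flag */
	int refcount;	/* assumption: plain count, 0 means being freed */
};

/* Models get_page_testone(): take a reference, report if the count was 0. */
static bool model_get_page_testone(struct model_page *page)
{
	return page->refcount++ == 0;
}

/* Same three-way result as the quoted __isolate_lru_page(). */
static int model_isolate_lru_page(struct model_page *page)
{
	if (!page->on_lru)
		return 0;		/* not on the LRU */
	page->on_lru = false;		/* models TestClearPageLRU() */

	if (model_get_page_testone(page)) {
		page->refcount--;	/* models __put_page() */
		page->on_lru = true;	/* models SetPageLRU() */
		return -ENOENT;		/* being freed elsewhere */
	}

	return 1;			/* isolated, reference now held */
}

/*
 * Hypothetical caller-side policy: only the "not on the LRU" result is
 * worth a warning. A freed page (-ENOENT) and success (1) are both fine.
 */
static bool isolation_needs_warning(int rc)
{
	return rc == 0;
}
```

In this model a page with an existing reference comes back with 1 and one
extra reference, a page whose count had already dropped to zero comes back
with -ENOENT and its LRU flag restored, and only the 0 case would trip the
warning.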