Re: [RFC][PATCH 1/3] radix priority search tree - objrmap complexity fix

From: Andrea Arcangeli
Date: Sat Apr 03 2004 - 12:45:16 EST


JFYI, swap suspend now works fine with rc3-aa3 that includes the below
and shutdown the harmless warnings generated by the first
gfp-no-compound patch. This also means the -mm writeback fixes doing
the additional page_cache_get to keep the remove_exclusive_swap_cache
away, worked fine too and you should merge that fix in -mm.

I believe Christoph's oops on ppc, is either the missign tlb flushing
support (fixed by Hugh in the last patch I posted to the list) or some
other ppc issue and it's the last pending known bug (xfs also has a bug
in truncate exposed by a bugcheck in objrmap, but it's not a vm issue
and its fix should be almost ready too).

I'm very convinced that the alloc_pages API should be the same for all
archs w/o or w/ MMU, and I'm fine if we want to make the non-compound
retval the default (and change __GFP_NO_COMP to __GFP_COMP) in the long
run (to optimize all callers but hugetlbfs). For the short term
__GFP_NO_COMP and compound being the default is the safest (for all
archs).

On Fri, Apr 02, 2004 at 06:46:34PM +0200, Andrea Arcangeli wrote:
> On Fri, Apr 02, 2004 at 10:43:34AM +0100, Christoph Hellwig wrote:
> > I got lots of the following OOPSEs with 2.6.5-rc3aa2 on a powerpc running
> > the xfs testsuite (with the truncate fix applied):
> >
> > Apr 2 13:27:21 bird kernel: Bad page state at destroy_compound_page (in process 'swapper', page c08d9920)
> > Apr 2 13:27:21 bird kernel: flags:0x00000008 mapping:00000000 mapped:0 count:0
> > Apr 2 13:27:21 bird kernel: Backtrace:
> > Apr 2 13:27:21 bird kernel: Call trace:
> > Apr 2 13:27:21 bird kernel: [c000b5c8] dump_stack+0x18/0x28
> > Apr 2 13:27:21 bird kernel: [c0048b60] bad_page+0x70/0xb0
> > Apr 2 13:27:21 bird kernel: [c0048c70] destroy_compound_page+0x80/0xb8
>
> it's not clear why this triggered, bad_page only shows the "master"
> compound page and not the contents of the slave page that triggered the
> bad_page. Can you try again with this incremental patch applied?
> Thanks!
>
> --- x/mm/page_alloc.c.~1~ 2004-04-02 05:24:50.000000000 +0200
> +++ x/mm/page_alloc.c 2004-04-02 18:32:53.189244408 +0200
> @@ -73,9 +73,9 @@ static void bad_page(const char *functio
> {
> printk(KERN_EMERG "Bad page state at %s (in process '%s', page %p)\n",
> function, current->comm, page);
> - printk(KERN_EMERG "flags:0x%08lx mapping:%p mapped:%d count:%d\n",
> + printk(KERN_EMERG "flags:0x%08lx mapping:%p mapped:%d count:%d private:0x%08lx\n",
> (unsigned long)page->flags, page->mapping,
> - page_mapped(page), page_count(page));
> + page_mapped(page), page_count(page), page->private);
> printk(KERN_EMERG "Backtrace:\n");
> dump_stack();
> printk(KERN_EMERG "Trying to fix it up, but a reboot is needed\n");
> @@ -137,9 +137,9 @@ static void destroy_compound_page(struct
> struct page *p = page + i;
>
> if (!PageCompound(p))
> - bad_page(__FUNCTION__, page);
> + bad_page(__FUNCTION__, p);
> if (p->private != (unsigned long)page)
> - bad_page(__FUNCTION__, page);
> + bad_page(__FUNCTION__, p);
> ClearPageCompound(p);
> }
> }
> @@ -272,8 +272,12 @@ void __free_pages_ok(struct page *page,
> int i;
>
> mod_page_state(pgfree, 1 << order);
> - for (i = 0 ; i < (1 << order) ; ++i)
> - free_pages_check(__FUNCTION__, page + i);
> + for (i = 0 ; i < (1 << order) ; ++i) {
> + struct page * _page = page + i;
> + if (unlikely(i))
> + __put_page(_page);
> + free_pages_check(__FUNCTION__, _page);
> + }
> list_add(&page->lru, &list);
> kernel_map_pages(page, 1<<order, 0);
> free_pages_bulk(page_zone(page), 1, &list, order);
> @@ -316,19 +320,21 @@ static void prep_new_page(struct page *
> (page->flags & (
> 1 << PG_private |
> 1 << PG_locked |
> - 1 << PG_lru |
> + 1 << PG_lru |
> 1 << PG_active |
> 1 << PG_dirty |
> 1 << PG_reclaim |
> 1 << PG_anon |
> 1 << PG_maplock |
> 1 << PG_swapcache |
> - 1 << PG_writeback )))
> + 1 << PG_writeback |
> + 1 << PG_compound )))
> bad_page(__FUNCTION__, page);
>
> page->flags &= ~(1 << PG_uptodate | 1 << PG_error |
> 1 << PG_referenced | 1 << PG_arch_1 |
> - 1 << PG_checked | 1 << PG_mappedtodisk);
> + 1 << PG_checked | 1 << PG_mappedtodisk |
> + 1 << PG_compound);
> page->private = 0;
> set_page_count(page, 1);
> }
>
>
> this incrmental bit made some harmless warning go away from swap resume,
> but it didn't fix swap resume completely yet OTOH I'm not sure anymore
> if there's any further VM issue or if it's a swap suspend issue. the
> PageCompound bugcheck would already trap any compound page in
> rw_swap_page_sync, so I'm sure nobody tried to swap compound pages in
> swap resume, and I'm also sure that the page->count is now correct, or
> free_pages_check would trigger. I cannot trigger any further bugcheck
> here (and the above patch only shutdown some false positive that
> couldn't hurt functionality, plus it adds further bugchecks).
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/