PageSwapCache problem in try_to_swap_out()

From: Rajagopal Ananthanarayanan (ananth@sgi.com)
Date: Wed Apr 26 2000 - 20:01:06 EST


I ran into a BUG in __free_pages_ok which checks:

----------
        if (PageSwapCache(page))
                BUG();
----------

The call to free the page was from try_to_swap_out():

----------
        /*
         * Is the page already in the swap cache? If so, then
         * we can just drop our reference to it without doing
         * any IO - it's already up-to-date on disk.
         *
         * Return 0, as we didn't actually free any real
         * memory, and we should just continue our scan.
         */
        if (PageSwapCache(page)) {
                entry.val = page->index;
                swap_duplicate(entry);
                set_pte(page_table, swp_entry_to_pte(entry));
drop_pte:
                vma->vm_mm->rss--;
                flush_tlb_page(vma, address);
                __free_page(page);
                goto out_failed;
        }
-----------

The entire trace from kdb is as follows (XXX = unknown):

----------
XXXXXXXXXX XXXXXXXXXX __free_pages_ok(XXXX)
0xc35d5d9c 0xc013543c try_to_swap_out+0xc8( 0xc179dc20, 0x43589000, 0xc2963624, 0x5, 0xc179dc20 )
0xc35d5dd8 0xc0135710 swap_out_vma+0x11c( 0xc179dc20, 0x432d4000, 0x5, 0xc02c7c68, 0xc3256520 )
0xc35d5df8 0xc01357ee swap_out_mm+0x7e( 0xc3256520, 0x5, 0x4, 0x6, 0x5 )
0xc35d5e24 0xc01359de swap_out+0x176( 0x6, 0x5, 0xc35d4000, 0xc02d0890, 0x5 )
0xc35d5e40 0xc0135b31 do_try_to_free_pages+0x89( 0x5, 0xc02d06b8, 0xc02d06b8, 0xc35d5e78, 0xc013666b )
0xc35d5e54 0xc0135d17 try_to_free_pages+0x2b( 0x5, 0xc02d06b8, 0xc02d0898, 0x0, 0xc02d088c )
0xc35d5e78 0xc013666b zone_balance_memory+0x63( 0xc02d088c, 0xc35d4000, 0x0, 0x5b5f00 )
0xc35d5e98 0xc0136724 __alloc_pages+0x80( 0xc35d4000, 0x0, 0xc35d4000 )
0xc35d5eac 0xc0136dbe read_swap_cache_async+0x62( 0x5b5f00, 0x1, 0x5b5f00, 0x5b5f00, 0xc13919c4 )
0xc35d5ecc 0xc012889b do_swap_page+0x97( 0xc35d4000, 0xc179dc20, 0x45671fff, 0xc13919c4, 0x5b5f00 )
0xc35d5efc 0xc0128ce7 handle_mm_fault+0x13b( 0xc35d4000, 0xc179dc20, 0x45671fff, 0x0, 0xc35d4000 )
0xc35d5fb4 0xc011513e do_page_fault+0x18e
0xbffff540 0xc010a681 error_code+0x2d
-------------

I have a kdb helper function that prints some of the fields in the page.
Its output is below, followed by a hex dump of the "struct page":

----------
  [1]kdb> page 0xc1002de0
struct page at 0xc1002de0
  next 0xc1042ee0 prev 0xc101e900 addr space 0xc02d0520 index 557568
  count 1 flags PG_uptodate PG_swap_cache PG_swap_entry virtual 0xc0092000
  buffers 0x00000000 block_map 00000000000000000000000000000000
    [ ... ]
                         
c1002de0: c1042ee0 c101e900 c02d0520 00088200 .... .-....
c1002df0: 00000000 00000001 00000a08 c1042efc ..............
c1002e00: c101e91c 00000000 dead4ead c1002e0c ......N...
c1002e10: c1002e0c c1002e14 c02f3b3f c1150e5c ......?;/\..
c1002e20: 00000000 c0092000 c02d0600 00000000 ..... ...-....
---------------------
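
As a cross-check of the dump against the helper's output, here is a minimal
userspace sketch (not kernel code) that lists the bits set in a flags word.
The function name set_bits is made up for illustration, and it is only an
assumption that 0x00000a08 in the second row of the dump is the flags field.

```c
#include <stddef.h>

/* Collect the positions of the set bits in a flags word, lowest first.
 * Returns how many positions were written to out (at most max).
 * Hypothetical helper for reading the dump; it is not part of kdb. */
static size_t set_bits(unsigned long flags, int *out, size_t max)
{
    size_t n = 0;

    for (int bit = 0; flags != 0 && n < max; bit++, flags >>= 1) {
        if (flags & 1UL)
            out[n++] = bit;
    }
    return n;
}
```

Calling set_bits(0x00000a08UL, bits, 8) fills in 3, 9 and 11: three set bits,
which agrees with the three flag names the helper printed. Which bit
corresponds to which PG_* name depends on the kernel's own headers and is not
asserted here. Similarly, the fourth word of the first row, 0x00088200, is
557568 in decimal, matching the index the helper reports.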

Question: is this a problem with the reference count on the page?
If the page really can be freed by the __free_page() call in
try_to_swap_out(), then the test in __free_pages_ok() will trigger
every time this path is taken. Does anyone have ideas as to what's wrong?
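
To make the question concrete, here is a toy userspace model of the
situation, not the kernel code itself. The names toy_page and toy_free_page
are invented for illustration, and the model assumes that __free_page() only
reaches __free_pages_ok() when it drops the last reference.

```c
#include <stdbool.h>

/* Toy stand-in for the two pieces of page state that matter here.
 * Field names are illustrative, not the kernel's. */
struct toy_page {
    int  count;          /* reference count, as in "count 1" in the dump */
    bool in_swap_cache;  /* stands in for PageSwapCache(page) */
};

/* Drop one reference.
 * Returns 1 if this was the last reference and the page still has the
 * swap-cache bit set (the condition the BUG() check guards against),
 * 0 otherwise. */
static int toy_free_page(struct toy_page *p)
{
    if (--p->count > 0)
        return 0;   /* other holders remain, so nothing is freed */
    return p->in_swap_cache ? 1 : 0;
}
```

With count 1 and the swap-cache bit set, as in the dump, toy_free_page()
returns 1. With count 2 it returns 0 and leaves one reference behind, which
is what the code in try_to_swap_out() seems to expect if the swap cache
holds a reference of its own.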

BTW, the above happened during a relatively normal operation:
running 'diff'. I don't know if it is reproducible.

thanks,

ananth.

-- 
--------------------------------------------------------------------------
Rajagopal Ananthanarayanan ("ananth")
Member Technical Staff, SGI.
--------------------------------------------------------------------------
