Re: [xfs-masters] Re: 2.6.22-rc1-mm1

From: Christoph Hellwig
Date: Thu May 17 2007 - 04:42:59 EST


On Thu, May 17, 2007 at 12:06:00PM +1000, David Chinner wrote:
> > static inline int put_page_testzero(struct page *page)
> > {
> > VM_BUG_ON(atomic_read(&page->_count) == 0);
> > return atomic_dec_and_test(&page->_count);
> > }
>
> I haven't seen that one. I expect that it will be the noaddr buffer allocation
> changes that have triggered this...

Yes. xfs_buf_get_noaddr calls xfs_buf_free to free a buffer when
something fails. But this is wrong - we want to call xfs_buf_deallocate
before we setup the page list, and if a page allocation fails we want to
do out own freeing of just the pages we allocated and call
_xfs_buf_free_pages. Currently we do our own freeing _and_ call
xfs_buf_free which leads to this double free.


Signed-off-by: Christoph Hellwig <hch@xxxxxx>


Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-05-17 09:34:44.000000000 +0200
+++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c 2007-05-17 09:36:53.000000000 +0200
@@ -792,8 +792,9 @@ xfs_buf_get_noaddr(
fail_free_mem:
while (--i >= 0)
__free_page(bp->b_pages[i]);
+ _xfs_buf_free_pages(bp);
fail_free_buf:
- xfs_buf_free(bp);
+ xfs_buf_deallocate(bp);
fail:
return NULL;
}

>
> > > [ 6666.719690] invalid opcode: 0000 [#1]
> > > [ 6666.723397] PREEMPT SMP
> > > [ 6666.725999] Modules linked in: xfs loop pktgen ipt_MASQUERADE iptable_nat nf_nat autofs4 af_packet nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 binfmt_misc thermal processor fan container nvram snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm evdev snd_timer snd soundcore intel_agp agpgart snd_page_alloc i2c_i801 ide_cd cdrom rtc unix
> > > [ 6666.776026] CPU: 0
> > > [ 6666.776027] EIP: 0060:[<c01693ec>] Not tainted VLI
> > > [ 6666.776028] EFLAGS: 00010202 (2.6.22-rc1-mm1 #3)
> > > [ 6666.788519] EIP is at put_page+0x44/0xee
> > > [ 6666.792491] eax: 00000001 ebx: c549f728 ecx: c04b27e0 edx: 00000001
> > > [ 6666.799345] esi: 00000000 edi: 00000080 ebp: d067e9e0 esp: d067e9c8
> > > [ 6666.806208] ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
> > > [ 6666.812104] Process mount (pid: 9419, ti=d067e000 task=d00a4070 task.ti=d067e000)
> > > [ 6666.819486] Stack: d8980180 00000080 d067e9f0 d8980180 00000000 00000080 d067e9f0 fdc8eda3
> > > [ 6666.828103] fffffffc d8980180 d067ea20 fdc8f7ff fdc9b425 fdc96e5c 00080000 00000000
> > > [ 6666.836635] c549dfd0 00000200 ffffffff cd44b8e0 00002160 cd44b8e0 d067ea30 fdc78937
> > > [ 6666.845253] Call Trace:
> > > [ 6666.847939] [<fdc8eda3>] xfs_buf_free+0x41/0x61 [xfs]
> > > [ 6666.853247] [<fdc8f7ff>] xfs_buf_get_noaddr+0x10c/0x118 [xfs]
> > > [ 6666.859231] [<fdc78937>] xlog_get_bp+0x65/0x69 [xfs]
>
> Yeah - that trace implies a memory allocation failure when allocating
> log buffer pages and the cleanup looks like it does a double free
> of the pages that got allocated. Patch attached below that should fix
> this problem.
>
> > > [ 6667.271984] XFS: Filesystem loop1 has duplicate UUID - can't mount
> > >
> > > ...
> > >
> > > [ 6670.074487] XFS: Filesystem loop1 has duplicate UUID - can't mount
> > > [ 6670.240395] XFS: Filesystem loop1 has duplicate UUID - can't mount
> > > [ 6670.350305] XFS: Filesystem loop1 has duplicate UUID - can't mount
> > > [ 6670.458773] XFS: Filesystem loop1 has duplicate UUID - can't mount
>
> I assume that the thread doing the mount got killed by the BUG and so the
> normal error handling path on log mount failure was not executed and hence the
> uuid for the filesystem never got removed from the table used to detect
> multiple mounts of the same filesystem....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group
>
> ---
> fs/xfs/linux-2.6/xfs_buf.c | 21 +++++++++++++--------
> 1 file changed, 13 insertions(+), 8 deletions(-)
>
> Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c
> ===================================================================
> --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-05-11 16:03:26.000000000 +1000
> +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c 2007-05-17 11:53:40.293585132 +1000
> @@ -323,9 +323,16 @@ xfs_buf_free(
> for (i = 0; i < bp->b_page_count; i++) {
> struct page *page = bp->b_pages[i];
>
> - if (bp->b_flags & _XBF_PAGE_CACHE)
> + /* handle noaddr allocation failure case */
> + if (!page)
> + break;
> +
> + if (bp->b_flags & _XBF_PAGE_CACHE) {
> ASSERT(!PagePrivate(page));
> - page_cache_release(page);
> + page_cache_release(page);
> + } else {
> + __free_page(page);
> + }
> }
> _xfs_buf_free_pages(bp);
> }
> @@ -766,6 +773,8 @@ xfs_buf_get_noaddr(
> goto fail;
> _xfs_buf_initialize(bp, target, 0, len, 0);
>
> + bp->b_flags |= _XBF_PAGES;
> +
> error = _xfs_buf_get_pages(bp, page_count, 0);
> if (error)
> goto fail_free_buf;
> @@ -773,15 +782,14 @@ xfs_buf_get_noaddr(
> for (i = 0; i < page_count; i++) {
> bp->b_pages[i] = alloc_page(GFP_KERNEL);
> if (!bp->b_pages[i])
> - goto fail_free_mem;
> + goto fail_free_buf;
> }
> - bp->b_flags |= _XBF_PAGES;
>
> error = _xfs_buf_map_pages(bp, XBF_MAPPED);
> if (unlikely(error)) {
> printk(KERN_WARNING "%s: failed to map pages\n",
> __FUNCTION__);
> - goto fail_free_mem;
> + goto fail_free_buf;
> }
>
> xfs_buf_unlock(bp);
> @@ -789,9 +797,6 @@ xfs_buf_get_noaddr(
> XB_TRACE(bp, "no_daddr", len);
> return bp;
>
> - fail_free_mem:
> - while (--i >= 0)
> - __free_page(bp->b_pages[i]);
> fail_free_buf:
> xfs_buf_free(bp);
> fail:
>
---end quoted text---
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/