Re: [syzbot] general protection fault in skb_dequeue (3)

From: David Howells
Date: Fri Feb 03 2023 - 09:37:33 EST


I think I have managed to isolate the bug to the read side of sendfile() or
the pipe in the middle by the following:

In iter_file_splice_write(), I allocate a permanent page:

+ mutex_lock(&splice_tmp_lock);
+ if (!splice_tmp) {
+ pr_notice("alloc splice_tmp\n");
+ splice_tmp = alloc_page(GFP_USER);
+ if (splice_tmp) {
+ SetPageDebugMark(splice_tmp);
+ page_ref_add(splice_tmp, 100);
+ }
+ }
+ mutex_unlock(&splice_tmp_lock);
+ if (!splice_tmp)
+ return -ENOMEM;
+

and then stick it into the BVEC iter to be handed over to vfs_iter_write()
instead of buf->page:

- array[n].bv_page = buf->page;
+ array[n].bv_page = splice_tmp;
+ trace_page_ref_set(splice_tmp, 887);
array[n].bv_len = this_len;
array[n].bv_offset = buf->offset;

that prevents vfs_iter_write() from ever seeing the pages from the pipe - but
the crash still happens even with this change.

One thing that does make me wonder is that I've made bio_release_pages()
always call __bio_release_pages() and made the latter dump all the pages in
the bio to the trace with val=999 - but some of the time, I don't see that
happening, so I'm wondering if there are some bio structs with pointers to
released pages floating around and not getting cleaned up.

I've added some event-specific tracepoints using the page_ref_set tracepoint,
using val=N as a key indicate the event:

val=888 -> Page seen in iter_file_splice_write() with PG_debug_mark set.
val=887 -> splice_tmp subbed for page
val=777 -> Page obtained from append_pipe() in iov_iter_extract_pipe_pages()
val=623 -> Page listed in bio_endio()
val=666 -> Page added by __bio_add_page()
val=98x -> xth page listed by __bio_release_pages()
val=-1 -> put_page()

Here's an excerpt from the trace when it looks right (note I removed
mapcount=0 from all lines):

page_ref_set: pfn=0x10b70b flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=777
page_ref_set: pfn=0x10b70b flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=666
page_ref_set: pfn=0x10b70b flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=623
page_ref_set: pfn=0x10b70b flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=980
page_ref_set: pfn=0x10b70b flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=888
page_ref_set: pfn=0x10642f flags=debug_mark count=101 mapping=0000000000000000 mt=0 val=887
page_ref_mod_and_test: pfn=0x116209 flags=uptodate|dirty|debug_mark count=2 mapping=00000000d367da24 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x10b70b flags=debug_mark count=0 mapping=0000000000000000 mt=0 val=-1 ret=1
page_ref_set: pfn=0x116209 flags=locked|uptodate|debug_mark count=3 mapping=00000000d367da24 mt=1 val=666
page_ref_mod_and_test: pfn=0x116209 flags=uptodate|lru|writeback|debug_mark count=2 mapping=00000000d367da24 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x116209 flags=uptodate|lru|writeback|debug_mark count=1 mapping=00000000d367da24 mt=1 val=-1 ret=0
page_ref_set: pfn=0x116209 flags=waiters|uptodate|lru|writeback|debug_mark count=2 mapping=00000000d367da24 mt=1 val=623
page_ref_mod_and_test: pfn=0x116209 flags=uptodate|lru|debug_mark count=2 mapping=00000000d367da24 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x116209 flags=uptodate|lru|debug_mark count=1 mapping=00000000d367da24 mt=1 val=-1 ret=0

pfn=0x10b70b is a page pulled out of the pipe, pfn=0x10642f is splice_tmp (has
count=101) and pfn=0x116209 is presumably the page the data is to be written
to.

And when it looks off:

page_ref_set: pfn=0x106fca flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=777
page_ref_set: pfn=0x106fca flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=666
page_ref_set: pfn=0x106fca flags=debug_mark count=1 mapping=0000000000000000 mt=0 val=888
page_ref_set: pfn=0x11e6bc flags=debug_mark count=101 mapping=0000000000000000 mt=0 val=887
page_ref_mod_and_test: pfn=0x11399a flags=uptodate|dirty|debug_mark count=2 mapping=000000003bd7df47 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x106fca flags=debug_mark count=0 mapping=0000000000000000 mt=0 val=-1 ret=1
page_ref_set: pfn=0x11399a flags=locked|uptodate|debug_mark count=3 mapping=000000003bd7df47 mt=1 val=666
page_ref_mod_and_test: pfn=0x11399a flags=uptodate|lru|writeback|debug_mark count=2 mapping=000000003bd7df47 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x11399a flags=uptodate|lru|writeback|debug_mark count=1 mapping=000000003bd7df47 mt=1 val=-1 ret=0
page_ref_mod_and_test: pfn=0x11399a flags=uptodate|lru|debug_mark count=1 mapping=000000003bd7df47 mt=1 val=-1 ret=0

pfn=0x106fca is the page from the pipe, pfn=0x11e6bc is splice_tmp and
pfn=0x11399a is the destination.

In the first trace, there's a val=980 line (page released from bio), but not
in the second trace.

David