Re: [PATCH 2/2] btrfs: prevent copying too big compressed lzo segment

From: Dāvis Mosāns
Date: Thu Feb 03 2022 - 11:04:15 EST


On Thu, 3 Feb 2022 at 15:33, Su Yue (<l@xxxxxxxxxx>) wrote:
>
>
> On Wed 02 Feb 2022 at 23:44, Dāvis Mosāns <davispuh@xxxxxxxxx>
> wrote:
>
> > Compressed length can be corrupted to be a lot larger than the memory
> > we have allocated for the buffer.
> > This will cause memcpy in copy_compressed_segment to write outside
> > of allocated memory.
> >
> > This mostly results in a stuck read syscall, but sometimes when using
> > btrfs send it can cause a #GP
> >
> > kernel: general protection fault, probably for non-canonical
> > address 0x841551d5c1000: 0000 [#1] PREEMPT SMP NOPTI
> > kernel: CPU: 17 PID: 264 Comm: kworker/u256:7 Tainted: P
> > OE 5.17.0-rc2-1 #12
> > kernel: Workqueue: btrfs-endio btrfs_work_helper [btrfs]
> > kernel: RIP: 0010:lzo_decompress_bio
> > (./include/linux/fortify-string.h:225 fs/btrfs/lzo.c:322
> > fs/btrfs/lzo.c:394) btrfs
> > Code starting with the faulting instruction
> > ===========================================
> > 0:* 48 8b 06 mov (%rsi),%rax
> > <-- trapping instruction
> > 3: 48 8d 79 08 lea 0x8(%rcx),%rdi
> > 7: 48 83 e7 f8 and $0xfffffffffffffff8,%rdi
> > b: 48 89 01 mov %rax,(%rcx)
> > e: 44 89 f0 mov %r14d,%eax
> > 11: 48 8b 54 06 f8 mov -0x8(%rsi,%rax,1),%rdx
> > kernel: RSP: 0018:ffffb110812efd50 EFLAGS: 00010212
> > kernel: RAX: 0000000000001000 RBX: 000000009ca264c8 RCX:
> > ffff98996e6d8ff8
> > kernel: RDX: 0000000000000064 RSI: 000841551d5c1000 RDI:
> > ffffffff9500435d
> > kernel: RBP: ffff989a3be856c0 R08: 0000000000000000 R09:
> > 0000000000000000
> > kernel: R10: 0000000000000000 R11: 0000000000001000 R12:
> > ffff98996e6d8000
> > kernel: R13: 0000000000000008 R14: 0000000000001000 R15:
> > 000841551d5c1000
> > kernel: FS: 0000000000000000(0000) GS:ffff98a09d640000(0000)
> > knlGS:0000000000000000
> > kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > kernel: CR2: 00001e9f984d9ea8 CR3: 000000014971a000 CR4:
> > 00000000003506e0
> > kernel: Call Trace:
> > kernel: <TASK>
> > kernel: end_compressed_bio_read (fs/btrfs/compression.c:104
> > fs/btrfs/compression.c:1363 fs/btrfs/compression.c:323) btrfs
> > kernel: end_workqueue_fn (fs/btrfs/disk-io.c:1923) btrfs
> > kernel: btrfs_work_helper (fs/btrfs/async-thread.c:326) btrfs
> > kernel: process_one_work (./arch/x86/include/asm/jump_label.h:27
> > ./include/linux/jump_label.h:212
> > ./include/trace/events/workqueue.h:108 kernel/workqueue.c:2312)
> > kernel: worker_thread (./include/linux/list.h:292
> > kernel/workqueue.c:2455)
> > kernel: ? process_one_work (kernel/workqueue.c:2397)
> > kernel: kthread (kernel/kthread.c:377)
> > kernel: ? kthread_complete_and_exit (kernel/kthread.c:332)
> > kernel: ret_from_fork (arch/x86/entry/entry_64.S:301)
> > kernel: </TASK>
> >
> > Signed-off-by: Dāvis Mosāns <davispuh@xxxxxxxxx>
> > ---
> > fs/btrfs/lzo.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/fs/btrfs/lzo.c b/fs/btrfs/lzo.c
> > index 31319dfcc9fb..ebaa5083f2ae 100644
> > --- a/fs/btrfs/lzo.c
> > +++ b/fs/btrfs/lzo.c
> > @@ -383,6 +383,13 @@ int lzo_decompress_bio(struct list_head
> > *ws, struct compressed_bio *cb)
> > kunmap(cur_page);
> > cur_in += LZO_LEN;
> >
> > + if (seg_len > WORKSPACE_CBUF_LENGTH) {
> > + // seg_len shouldn't be larger than we
> > have allocated for workspace->cbuf
> >
> Makes sense.
> Is the corrupted lzo compressed extent produced by a normal fs or
> crafted manually? If it is from a normal fs, something insane happened
> in the extent compression path.
>

It happened on a normal fs, but back in 2016. It's a RAID1 where the HBA
dropped out some disks and some sectors didn't get written, so that
section most likely contains previous, unrelated data.
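
To illustrate the idea behind the check discussed above, here is a minimal
userspace sketch, not the kernel code. The function name
copy_segment_checked, its parameters and the -EIO return value are
assumptions made for this example; the only thing taken from the patch is
the principle that a length read from disk must be compared against the
destination buffer size before memcpy is called.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy one compressed segment into cbuf.
 *
 * seg_len comes from on-disk data and is therefore untrusted. It is
 * rejected when it exceeds either the destination buffer (cbuf_len) or
 * the bytes actually available in the source (src_len), so memcpy can
 * never read or write out of bounds.
 *
 * Returns 0 on success, -EIO (an assumed error code) on a bad length.
 */
static int copy_segment_checked(uint8_t *cbuf, size_t cbuf_len,
                                const uint8_t *src, size_t src_len,
                                uint32_t seg_len)
{
        assert(cbuf != NULL && src != NULL);

        if (seg_len > cbuf_len || seg_len > src_len)
                return -EIO;

        memcpy(cbuf, src, seg_len);
        return 0;
}
```

With an 8-byte destination, a seg_len of 4 is copied, while a corrupted
value such as 0xffffffff is refused before any memory is touched.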