Re: [RFC] dm-bow working prototype

From: MegaBrutal
Date: Thu Oct 25 2018 - 12:30:19 EST


Paul Lawrence <paullawrence@xxxxxxxxxx> wrote (on Tue, Oct 23, 2018,
23:27):
>
> bow == backup on write
>
> Similar to dm-snap, add the ability to take a snapshot of a device.
> Unlike dm-snap, a separate volume is not required.
>

The concept intrigued me, so I went ahead and tried your prototype.
I could apply it to v4.12 mainline. Newer kernel versions change
"struct bio" in "include/linux/blk_types.h", which prevents the module
from compiling. I think only minor changes would be needed to adapt to
the new struct, though I didn't look into that.

My test scenario:
In a KVM guest, I created a 64M partition, formatted it as ext4, put
some random files on it and unmounted the FS. I then called "dmsetup
create bowdev --table "0 131072 bow /dev/vdb1"". The
"/dev/mapper/bowdev" file appeared as expected. I mounted it in
read-only mode ("mount -vo ro /dev/mapper/bowdev /mnt") and ran
"fstrim -v /mnt". At this point, I tried to advance to STATE 1 ("echo
1 > /sys/block/dm-2/bow/state"), but I got a kernel BUG alert and the
STATE did not change. I unmounted bowdev and removed the device
("dmsetup remove bowdev"), which triggered 2 more kernel alerts. The
device disappeared, but the kernel was left in an unstable state
(various actions, like sync or trying to recreate the bow device,
resulted in a hang). I could not get any further than this. I attached
all 3 kernel alerts in "dm-bow.dmesg.log".
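
For anyone who wants to retry the scenario, the steps above can be
sketched as a small script. This is only a sketch under assumptions:
the helper names "step" and "repro" are made up for illustration, the
device names (/dev/vdb1, dm-2) are the ones from this particular VM and
will differ elsewhere, and by default the script only prints the
commands instead of running them.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps from the report.
# Nothing is executed unless repro is called as "repro run"
# (as root, on a disposable VM: the sequence is reported to crash the kernel).

DEV=/dev/vdb1                          # 64 MiB ext4 partition used in the report
NAME=bowdev                            # device-mapper name used in the report
DM=dm-2                                # dm node seen in the report; system-specific
SECTORS=$((64 * 1024 * 1024 / 512))    # dm tables count 512-byte sectors

# Print the command, or execute it when mode is "run".
step() {
    if [ "$mode" = run ]; then
        sh -c "$1"
    else
        printf '%s\n' "$1"
    fi
}

repro() {
    mode=${1:-print}
    step "dmsetup create $NAME --table \"0 $SECTORS bow $DEV\""
    step "mount -vo ro /dev/mapper/$NAME /mnt"
    step "fstrim -v /mnt"
    step "echo 1 > /sys/block/$DM/bow/state"   # the first BUG was reported here
    step "umount /mnt"
    step "dmsetup remove $NAME"                # the other two alerts were reported here
}

repro
```

The sector count is just 64 MiB divided by 512 bytes, which gives the
131072 used in the table line. On a real device, "blockdev --getsz"
should report the same figure directly.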

I have some questions about dm-bow:
- How file-system-agnostic is this feature planned to be? While it is
designed with ext4 in mind, is it going to work when used over other
file systems, like FAT or BTRFS for example?
- This matters especially for BTRFS, which uses a CoW mechanism even
for overwriting files (overwritten segments are written to a free area
and only then is the old data freed, except under some specific
conditions when NO_COW/nodatacow is involved). Won't the BTRFS CoW
mechanism confuse BoW, e.g. won't BTRFS try to use space that BoW
wants to use for backups? Note, however, that using BoW on BTRFS
wouldn't have much point, since BTRFS has built-in features for
snapshots. This leads me to my next question.
- Why don't you just use BTRFS on Android? It basically provides a
feature similar to BoW, it is mature enough, switching snapshots is
easy, etc. However, I can see why it might not be feasible for you,
e.g. it is slower than ext4, which would matter on an Android device.
- What if you run out of free disk space while updating? I guess you
can just revert to the original state with BoW, but an update might
require more disk space with BoW (and this is a real concern: my
Android always complains about not having enough space).
- Can I really expect dm-bow to work on non-Android systems (like the
Ubuntu KVM guest I tried it on)?
- Do you have any prototype of the command line utility to be used
for recovery?


MegaBrutal

Attachment: dm-bow.dmesg.log
[ 174.206735] atkbd serio0: Unknown key pressed (translated set 2, code 0x63 on isa0060/serio0).
[ 174.206745] atkbd serio0: Use 'setkeycodes 63 <keycode>' to make it known.
[ 174.215981] atkbd serio0: Unknown key released (translated set 2, code 0x63 on isa0060/serio0).
[ 174.215983] atkbd serio0: Use 'setkeycodes 63 <keycode>' to make it known.
[ 187.976875] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)
[ 231.633634] device-mapper: bow: Switching to state Checkpoint
[ 231.639541] ------------[ cut here ]------------
[ 231.639543] kernel BUG at drivers/md/dm-bow.c:224!
[ 231.639990] invalid opcode: 0000 [#1] SMP
[ 231.640320] Modules linked in: dm_bow dm_bufio nls_iso8859_1 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm joydev snd_timer snd ppdev input_leds soundcore serio_raw parport_pc parport qemu_fw_cfg mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops psmouse virtio_blk ahci libahci floppy i2c_piix4 pata_acpi drm virtio_net
[ 231.642197] CPU: 1 PID: 1182 Comm: bash Not tainted 4.12.0-dmbow #109
[ 231.642496] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 231.642860] task: ffff9f37568ad500 task.stack: ffffb925405c4000
[ 231.645551] RIP: 0010:sector_to_page.part.5+0x9/0x10 [dm_bow]
[ 231.645938] RSP: 0018:ffffb925405c7d20 EFLAGS: 00010202
[ 231.646229] RAX: ffff9f3756abaf00 RBX: 0000000000000000 RCX: 0000000000000000
[ 231.646489] RDX: 0000000000000000 RSI: 0000000000001242 RDI: ffff9f375e71bf00
[ 231.646750] RBP: ffffb925405c7d20 R08: 00000035eeca0000 R09: 0000000000000000
[ 231.647077] R10: 0000000000000000 R11: 0000000000000278 R12: ffff9f3756aba9c0
[ 231.647345] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9f37568c5000
[ 231.647628] FS: 00007f9f1b8a0740(0000) GS:ffff9f375e700000(0000) knlGS:0000000000000000
[ 231.648077] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 231.648501] CR2: 00007f08c431af78 CR3: 00000000168ff000 CR4: 00000000000006e0
[ 231.648929] Call Trace:
[ 231.649417] copy_data+0x16f/0x1e0 [dm_bow]
[ 231.649794] state_store+0x216/0x320 [dm_bow]
[ 231.650127] ? do_wp_page+0x135/0x4e0
[ 231.650427] ? __handle_mm_fault+0x889/0xf70
[ 231.650704] kobj_attr_store+0xf/0x20
[ 231.651012] sysfs_kf_write+0x37/0x40
[ 231.651268] kernfs_fop_write+0x11c/0x1a0
[ 231.651527] __vfs_write+0x28/0x130
[ 231.651767] ? handle_mm_fault+0xd8/0x230
[ 231.652127] ? rw_verify_area+0x4e/0xb0
[ 231.652392] vfs_write+0xb1/0x1a0
[ 231.652648] SyS_write+0x46/0xa0
[ 231.652893] entry_SYSCALL_64_fastpath+0x1e/0xa9
[ 231.653111] RIP: 0033:0x7f9f1af75154
[ 231.653394] RSP: 002b:00007ffeccba8db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 231.653692] RAX: ffffffffffffffda RBX: 00007f9f1b251760 RCX: 00007f9f1af75154
[ 231.654009] RDX: 0000000000000002 RSI: 000055e168555390 RDI: 0000000000000001
[ 231.654324] RBP: 0000000000000001 R08: 000000000000000a R09: 0000000000000001
[ 231.654618] R10: 000000000000000a R11: 0000000000000246 R12: 0000000000000001
[ 231.654929] R13: 000055e168448781 R14: 0000000000000000 R15: 0000000000000000
[ 231.655163] Code: 7e 67 ad f3 48 85 c0 75 0f eb 15 48 89 c7 e8 ff 68 ad f3 48 85 c0 74 08 83 78 20 01 75 ed 5d c3 0f 0b 66 66 66 66 90 55 48 89 e5 <0f> 0b 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 41 57 41 56 41
[ 231.655619] RIP: sector_to_page.part.5+0x9/0x10 [dm_bow] RSP: ffffb925405c7d20
[ 231.655896] ---[ end trace 4a06f43bc2732b69 ]---
[ 352.245478] ------------[ cut here ]------------
[ 352.245914] WARNING: CPU: 0 PID: 1233 at drivers/md/dm-bufio.c:1513 dm_bufio_client_destroy+0x23d/0x250 [dm_bufio]
[ 352.246244] Modules linked in: dm_bow dm_bufio nls_iso8859_1 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm joydev snd_timer snd ppdev input_leds soundcore serio_raw parport_pc parport qemu_fw_cfg mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops psmouse virtio_blk ahci libahci floppy i2c_piix4 pata_acpi drm virtio_net
[ 352.247976] CPU: 0 PID: 1233 Comm: dmsetup Tainted: G D 4.12.0-dmbow #109
[ 352.248213] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 352.248456] task: ffff9f3756af1540 task.stack: ffffb92540884000
[ 352.248713] RIP: 0010:dm_bufio_client_destroy+0x23d/0x250 [dm_bufio]
[ 352.248953] RSP: 0018:ffffb92540887be0 EFLAGS: 00010246
[ 352.249255] RAX: ffff9f375b2be618 RBX: ffff9f375673dc00 RCX: 0000000000000001
[ 352.249569] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9f375673dc00
[ 352.249878] RBP: ffffb92540887c10 R08: ffffffffc04ff430 R09: 0000000000000001
[ 352.250195] R10: ffffb92540887bb0 R11: 00000000000001be R12: ffff9f375673dc20
[ 352.250481] R13: ffff9f375692e000 R14: 0000000000000000 R15: ffff9f375673dc20
[ 352.250799] FS: 00007fda1d2d7040(0000) GS:ffff9f375e600000(0000) knlGS:0000000000000000
[ 352.251117] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 352.251408] CR2: 00007ffdc1142ff8 CR3: 0000000016fb3000 CR4: 00000000000006f0
[ 352.251699] Call Trace:
[ 352.252013] dm_bow_dtr+0x5f/0x90 [dm_bow]
[ 352.252320] dm_table_destroy+0x63/0x120
[ 352.252580] __dm_destroy+0x141/0x230
[ 352.252828] dm_destroy+0x13/0x20
[ 352.253092] dev_remove+0xde/0x120
[ 352.253425] ? remove_all+0x30/0x30
[ 352.253693] ctl_ioctl+0x1c2/0x4a0
[ 352.253919] dm_ctl_ioctl+0x13/0x20
[ 352.254153] do_vfs_ioctl+0x92/0x5d0
[ 352.254383] ? ____fput+0xe/0x10
[ 352.254630] ? task_work_run+0x7b/0x90
[ 352.254837] SyS_ioctl+0x79/0x90
[ 352.255034] entry_SYSCALL_64_fastpath+0x1e/0xa9
[ 352.255238] RIP: 0033:0x7fda1cb755d7
[ 352.255430] RSP: 002b:00007ffe974ee9b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[ 352.255630] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fda1cb755d7
[ 352.255828] RDX: 0000561e91675c90 RSI: 00000000c138fd04 RDI: 0000000000000003
[ 352.256026] RBP: 00007ffe974eea50 R08: 00007fda1ceac120 R09: 00007ffe974ee820
[ 352.256246] R10: 0000561e91675d40 R11: 0000000000000206 R12: 0000561e90d6b8e0
[ 352.256504] R13: 00007ffe974eed20 R14: 0000000000000000 R15: 0000000000000000
[ 352.256840] Code: 0f 84 70 ff ff ff 0f 0b 31 f6 48 c7 c7 78 f4 4f c0 e8 fa 11 8a f3 48 8b 53 48 48 85 d2 75 c4 48 83 7b 40 00 75 e0 e9 4b ff ff ff <0f> ff e9 7b ff ff ff 66 90 66 2e 0f 1f 84 00 00 00 00 00 66 66
[ 352.257464] ---[ end trace 4a06f43bc2732b6a ]---
[ 352.257929] device-mapper: bufio: leaked buffer 0, hold count 1, list 0
[ 352.258711] ------------[ cut here ]------------
[ 352.259094] kernel BUG at drivers/md/dm-bufio.c:1529!
[ 352.259384] invalid opcode: 0000 [#2] SMP
[ 352.259708] Modules linked in: dm_bow dm_bufio nls_iso8859_1 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm joydev snd_timer snd ppdev input_leds soundcore serio_raw parport_pc parport qemu_fw_cfg mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid qxl ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops psmouse virtio_blk ahci libahci floppy i2c_piix4 pata_acpi drm virtio_net
[ 352.261620] CPU: 0 PID: 1233 Comm: dmsetup Tainted: G D W 4.12.0-dmbow #109
[ 352.261964] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[ 352.262285] task: ffff9f3756af1540 task.stack: ffffb92540884000
[ 352.262552] RIP: 0010:dm_bufio_client_destroy+0x1b3/0x250 [dm_bufio]
[ 352.262808] RSP: 0018:ffffb92540887be0 EFLAGS: 00010293
[ 352.263085] RAX: ffff9f375b2be618 RBX: ffff9f375673dc00 RCX: 0000000000000006
[ 352.263333] RDX: 0000000000000001 RSI: 0000000000000096 RDI: ffff9f375e60dcc0
[ 352.263554] RBP: ffffb92540887c10 R08: ffffffffc04ff430 R09: 00000000000002b3
[ 352.263773] R10: ffffb92540887bb0 R11: 00000000ffffffff R12: ffff9f375673dc40
[ 352.264018] R13: ffff9f375673dc08 R14: 0000000000000001 R15: ffff9f375673dc20
[ 352.264273] FS: 00007fda1d2d7040(0000) GS:ffff9f375e600000(0000) knlGS:0000000000000000
[ 352.264497] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 352.264734] CR2: 00007ffdc1142ff8 CR3: 0000000016fb3000 CR4: 00000000000006f0
[ 352.264998] Call Trace:
[ 352.265248] dm_bow_dtr+0x5f/0x90 [dm_bow]
[ 352.265520] dm_table_destroy+0x63/0x120
[ 352.265771] __dm_destroy+0x141/0x230
[ 352.266008] dm_destroy+0x13/0x20
[ 352.266321] dev_remove+0xde/0x120
[ 352.266657] ? remove_all+0x30/0x30
[ 352.267000] ctl_ioctl+0x1c2/0x4a0
[ 352.267344] dm_ctl_ioctl+0x13/0x20
[ 352.267700] do_vfs_ioctl+0x92/0x5d0
[ 352.268071] ? ____fput+0xe/0x10
[ 352.268375] ? task_work_run+0x7b/0x90
[ 352.268658] SyS_ioctl+0x79/0x90
[ 352.268901] entry_SYSCALL_64_fastpath+0x1e/0xa9
[ 352.269121] RIP: 0033:0x7fda1cb755d7
[ 352.269361] RSP: 002b:00007ffe974ee9b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[ 352.269599] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fda1cb755d7
[ 352.269820] RDX: 0000561e91675c90 RSI: 00000000c138fd04 RDI: 0000000000000003
[ 352.270038] RBP: 00007ffe974eea50 R08: 00007fda1ceac120 R09: 00007ffe974ee820
[ 352.270253] R10: 0000561e91675d40 R11: 0000000000000206 R12: 0000561e90d6b8e0
[ 352.270488] R13: 00007ffe974eed20 R14: 0000000000000000 R15: 0000000000000000
[ 352.270739] Code: 48 8b 7b 78 e8 cf b2 e0 f3 48 89 df e8 17 d5 90 f3 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 41 be 01 00 00 00 e9 b4 fe ff ff <0f> 0b 0f 0b 0f 0b 0f 0b 84 d2 74 7e 4c 8d 68 e8 41 8b 55 40 49
[ 352.271217] RIP: dm_bufio_client_destroy+0x1b3/0x250 [dm_bufio] RSP: ffffb92540887be0
[ 352.271489] ---[ end trace 4a06f43bc2732b6b ]---