PROBLEM: BLKPG_DEL_PARTITION with GENHD_FL_NO_PART used to return ENXIO, now returns EINVAL

From: Allison Karlitskaya
Date: Mon Jan 15 2024 - 07:14:15 EST


Hi,

[1.] One line summary of the problem:
BLKPG_DEL_PARTITION on an empty loopback device used to return ENXIO
but now returns EINVAL, breaking partprobe

[2.] Full description of the problem/report:
We recently caught this problem in our CI for Cockpit:
https://github.com/cockpit-project/bots/pull/5793

The summary is that if you do something like this:

$ dd if=/dev/zero of=/tmp/foo bs=1M count=50
$ partprobe $(losetup --find --show /tmp/foo)

then partprobe fails with the following error message:

Error: Partition(s) 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64 on
/dev/loop2 have been written, but we have been unable to inform the
kernel of the change, probably because it/they are in use. As a
result, the old partition(s) will remain in use. You should reboot
now before making further changes.

...when it used to succeed. The cause is this ioctl (issued by
partprobe), whose return value changed between kernel versions
("-" is the old kernel, "+" is the new one):

-ioctl(3, BLKPG, {op=BLKPG_DEL_PARTITION, flags=0, datalen=152,
data={start=0, length=0, pno=1, devname="", volname=""}}) = -1 ENXIO
(No such device or address)
+ioctl(3, BLKPG, {op=BLKPG_DEL_PARTITION, flags=0, datalen=152,
data={start=0, length=0, pno=1, devname="", volname=""}}) = -1 EINVAL
(Invalid argument)
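
For anyone who wants to see the errno directly, without going through
partprobe, here is a minimal sketch in Python that issues the same
ioctl. The constants and struct layouts are my reading of
<linux/fs.h> and <linux/blkpg.h> on x86_64 and should be treated as
assumptions; the function names (delete_partition, describe, probe)
are made up for this example. It needs enough privilege to open the
device and issue BLKPG, so in practice it runs as root.

```python
import ctypes
import errno
import fcntl
import os
import sys

# Assumed values, taken from <linux/fs.h> and <linux/blkpg.h>.
BLKPG = 0x1269             # _IO(0x12, 105)
BLKPG_DEL_PARTITION = 2


class BlkpgPartition(ctypes.Structure):
    # Mirrors struct blkpg_partition (152 bytes on x86_64, matching
    # datalen=152 in the strace output above).
    _fields_ = [
        ("start", ctypes.c_longlong),
        ("length", ctypes.c_longlong),
        ("pno", ctypes.c_int),
        ("devname", ctypes.c_char * 64),
        ("volname", ctypes.c_char * 64),
    ]


class BlkpgIoctlArg(ctypes.Structure):
    # Mirrors struct blkpg_ioctl_arg.
    _fields_ = [
        ("op", ctypes.c_int),
        ("flags", ctypes.c_int),
        ("datalen", ctypes.c_int),
        ("data", ctypes.c_void_p),
    ]


def delete_partition(fd, pno):
    """Issue BLKPG_DEL_PARTITION for pno; return 0 or the errno value."""
    part = BlkpgPartition(start=0, length=0, pno=pno)
    arg = BlkpgIoctlArg(
        op=BLKPG_DEL_PARTITION,
        flags=0,
        datalen=ctypes.sizeof(part),
        data=ctypes.addressof(part),
    )
    try:
        fcntl.ioctl(fd, BLKPG, arg)
    except OSError as exc:
        return exc.errno
    return 0


def describe(err):
    """Format an errno value the way strace does, e.g. 'ENXIO (...)'."""
    if err == 0:
        return "success"
    return f"{errno.errorcode.get(err, str(err))} ({os.strerror(err)})"


def probe(device, pno=1):
    """Try to delete partition pno on device and return a one-line result.

    Caution: if the partition exists, the kernel really does drop it
    from its view of the device, so use a scratch loop device only.
    """
    try:
        fd = os.open(device, os.O_RDONLY)
    except OSError as exc:
        return f"cannot open {device}: {exc}"
    try:
        return f"{device}: pno={pno} -> {describe(delete_partition(fd, pno))}"
    finally:
        os.close(fd)


if __name__ == "__main__" and len(sys.argv) == 2:
    print(probe(sys.argv[1]))
```

Run it with the loop device as the only argument (for example
/dev/loop2 from the error message above). Going by the strace output,
I would expect it to report ENXIO on the older kernel and EINVAL on
the newer one.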

This is observed on Ubuntu jammy with partprobe from parted
3.4-2build1. I've confirmed that the original parted-3.4 download
from https://ftp.gnu.org/gnu/parted/ is also impacted in the same way.

[3.] Keywords:
block, partition, BLKPG_DEL_PARTITION, loop device, EINVAL, ENXIO

[4.] Kernel information:
Linux ubuntu 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC
2024 x86_64 x86_64 x86_64 GNU/Linux

This is the version currently in jammy-proposed. The likely culprit
is this commit:

https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy/commit/?id=49a502554e8aa853a0357f287121d4cdf4442a58

which is also upstream as 1a721de8489fa559ff4471f73c58bb74ac5580d3.

There has been discussion on linux-kernel before about this:
https://marc.info/?l=linux-kernel&m=169753467305218&w=2

but now we have a pretty clear case of "breaks userspace in the wild".

[4.1.] Kernel version (from /proc/version):

Linux version 5.15.0-94-generic (buildd@lcy02-amd64-096) (gcc (Ubuntu
11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38)
#104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024

[4.2.] Kernel .config file:

I pasted a copy here:

https://paste.centos.org/view/8d6506bc

but it won't be around for more than 24 hours. It's just the config
file present in /boot on the affected install.

[5.] Most recent kernel version which did not have the bug:

We last tested 5.15.0-91-generic and found that it still had the
previous behaviour (i.e. returning ENXIO).

[7.] A small shell script or example program which triggers the
problem (if possible)

As above:

$ dd if=/dev/zero of=/tmp/foo bs=1M count=50
$ partprobe $(losetup --find --show /tmp/foo)
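
For use in a CI job, the two commands can be wrapped so that
partprobe's exit status is captured and the loop device is detached
afterwards. The following is only a sketch: the function names
(setup_commands, reproduce) are made up for this example, it has to
run as root, and it assumes dd, losetup and partprobe are on PATH.

```python
import subprocess


def setup_commands(image_path, size_mib=50):
    """Return the dd and losetup command lines used in this report."""
    dd = ["dd", "if=/dev/zero", f"of={image_path}", "bs=1M", f"count={size_mib}"]
    losetup = ["losetup", "--find", "--show", image_path]
    return dd, losetup


def reproduce(image_path="/tmp/foo"):
    """Run the reproducer; return (partprobe exit status, combined output)."""
    dd, losetup = setup_commands(image_path)
    subprocess.run(dd, check=True, capture_output=True)
    # losetup --show prints the name of the loop device it allocated.
    loopdev = subprocess.run(
        losetup, check=True, capture_output=True, text=True
    ).stdout.strip()
    try:
        result = subprocess.run(
            ["partprobe", loopdev], capture_output=True, text=True
        )
        return result.returncode, result.stdout + result.stderr
    finally:
        # Detach the loop device whether or not partprobe succeeded.
        subprocess.run(["losetup", "--detach", loopdev], check=False)
```

Calling reproduce() returns the exit status together with whatever
partprobe printed, so the result can be compared between the two
kernel versions.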

[8.] Environment
[8.1.] Software (add the output of the ver_linux script here)
[8.2.] Processor information (from /proc/cpuinfo):
[8.3.] Module information (from /proc/modules):
[8.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
[8.5.] PCI information ('lspci -vvv' as root)
[8.6.] SCSI information (from /proc/scsi/scsi)
[8.7.] Other information that might be relevant to the problem
(please look in /proc and include all information that you
think to be relevant):
[X.] Other notes, patches, fixes, workarounds:


I don't expect anything here to be relevant, but feel free to ask.
The system is a QEMU x86_64 VM image running on my Intel laptop. If
you want to test this, check out

https://github.com/cockpit-project/bots/tree/image-refresh-ubuntu-2204-20240114-225118

and run

./vm-run -q ubuntu-2204

at which point you should be presented with instructions about how to
ssh to the machine.

Thanks for the attention!

Allison Karlitskaya