Re: Race condition in Kernel

From: Ming Lei
Date: Thu Apr 01 2021 - 22:39:47 EST


On Thu, Apr 01, 2021 at 04:27:37PM +0000, Gulam Mohamed wrote:
> Hi Ming,
>
> Thanks for taking a look at this. Can you please see my inline comments in the mail below?
>
> Regards,
> Gulam Mohamed.
>
> -----Original Message-----
> From: Ming Lei <ming.lei@xxxxxxxxxx>
> Sent: Thursday, March 25, 2021 7:16 AM
> To: Gulam Mohamed <gulam.mohamed@xxxxxxxxxx>
> Cc: hch@xxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-block@xxxxxxxxxxxxxxx; Junxiao Bi <junxiao.bi@xxxxxxxxxx>; Martin Petersen <martin.petersen@xxxxxxxxxx>; axboe@xxxxxxxxx
> Subject: Re: Race condition in Kernel
>
> On Wed, Mar 24, 2021 at 12:37:03PM +0000, Gulam Mohamed wrote:
> > Hi All,
> >
> > We are facing a stale link (of the device) issue during the iscsi-logout process if we use the parted command just before the iscsi logout. Here are the details:
> >
> > As part of iscsi logout, the partitions and the disk will be removed. The parted command, used to list the partitions, opens the disk in RW mode, which results in systemd-udevd re-reading the partitions. This triggers a partition rescan, which also deletes and re-adds the partitions. So both the iscsi logout processing and parted (through systemd-udevd) are involved in adding and deleting partitions. In our case, the following sequence of operations happened (the iscsi device is /dev/sdb with partition sdb1):
> >
> > 1. sdb1 was removed by parted
> > 2. kworker, as part of iscsi logout, couldn't remove sdb1 as it was already removed by parted
> > 3. sdb1 was added by parted
>
> After the kworker is started for logout, I guess all IOs are supposed to fail from that point on, so I am just wondering why 'sdb1' is still added by parted (systemd-udevd)?
> ioctl(BLKRRPART) needs to read the partition table to add the partitions back; if IOs are failed by the iscsi logout, I guess the issue can be avoided too?
>
> [GULAM]: Yes, the ioctl(BLKRRPART) reads the partition table for adding back the partitions. I put a printk in the code just after the partition table is read, and noticed that the partition table was read before the iscsi-logout kworker started the logout processing.

OK, I think I understand your issue now: what you want is to not allow
partitions to be added from step 1 onward. Can you remove the disk right
at the beginning of step 2), if that is possible? Then step 1) isn't
needed any more.

As for my patch 'not drop partitions if partition table isn't changed',
it can't fix your issue completely, since a new real partition may still
come from parted during the series.
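
For reference, the rescan discussed above is what userspace requests
with ioctl(BLKRRPART). Below is a minimal sketch of such a request. The
helper name reread_partitions() and the choice of returning -errno are
only illustrative assumptions, not code from parted, systemd-udevd or
the kernel tree:

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* BLKRRPART */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to re-read the partition table
 * of the whole-disk node at 'path' (e.g. /dev/sdb).
 * Returns 0 on success or a negative errno value on failure.
 */
int reread_partitions(const char *path)
{
	int ret = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -errno;

	/* The kernel reads the partition table here and re-adds partitions. */
	if (ioctl(fd, BLKRRPART) < 0)
		ret = -errno;	/* save errno before close() can change it */

	close(fd);
	return ret;
}
```

Calling reread_partitions("/dev/sdb") normally needs root. The point
for this thread is that nothing in such a request knows whether a
logout is in progress, so it can run concurrently with the removal
done by the kworker.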


Thanks,
Ming