Re: [PATCH v2 1/2] md: Fix overflow in is_mddev_idle

From: Song Liu
Date: Fri Dec 15 2023 - 18:13:45 EST


On Thu, Dec 14, 2023 at 5:41 PM <linan666@xxxxxxxxxxxxxxx> wrote:
>
> From: Li Nan <linan122@xxxxxxxxxx>
>
> UBSAN reports this problem:
>
> UBSAN: Undefined behaviour in drivers/md/md.c:8175:15
> signed integer overflow:
> -2147483291 - 2072033152 cannot be represented in type 'int'
> Call trace:
> dump_backtrace+0x0/0x310
> show_stack+0x28/0x38
> dump_stack+0xec/0x15c
> ubsan_epilogue+0x18/0x84
> handle_overflow+0x14c/0x19c
> __ubsan_handle_sub_overflow+0x34/0x44
> is_mddev_idle+0x338/0x3d8
> md_do_sync+0x1bb8/0x1cf8
> md_thread+0x220/0x288
> kthread+0x1d8/0x1e0
> ret_from_fork+0x10/0x18
>
> 'curr_events' will overflow when stat accum or 'sync_io' is greater than
> INT_MAX.
>
> Fix it by changing sync_io, last_events and curr_events to 64bit.
>
> Signed-off-by: Li Nan <linan122@xxxxxxxxxx>
> ---
> drivers/md/md.h | 4 ++--
> include/linux/blkdev.h | 2 +-
> drivers/md/md.c | 7 ++++---
> 3 files changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index ade83af123a2..1a4f976951c1 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -50,7 +50,7 @@ struct md_rdev {
>
> sector_t sectors; /* Device size (in 512bytes sectors) */
> struct mddev *mddev; /* RAID array if running */
> - int last_events; /* IO event timestamp */
> + long long last_events; /* IO event timestamp */
>
> /*
> * If meta_bdev is non-NULL, it means that a separate device is
> @@ -584,7 +584,7 @@ extern void mddev_unlock(struct mddev *mddev);
>
> static inline void md_sync_acct(struct block_device *bdev, unsigned long nr_sectors)
> {
> - atomic_add(nr_sectors, &bdev->bd_disk->sync_io);
> + atomic64_add(nr_sectors, &bdev->bd_disk->sync_io);
> }
>
> static inline void md_sync_acct_bio(struct bio *bio, unsigned long nr_sectors)
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 3f8a21cd9233..d28b98adf457 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -170,7 +170,7 @@ struct gendisk {
> struct list_head slave_bdevs;
> #endif
> struct timer_rand_state *random;
> - atomic_t sync_io; /* RAID */
> + atomic64_t sync_io; /* RAID */
> struct disk_events *ev;

As we are on this, I wonder whether we really need this.
AFAICT, is_mddev_idle() is the only consumer of sync_io.
We can probably do the same check in is_mddev_idle()
without sync_io.

Thanks,
Song


>
> #ifdef CONFIG_BLK_DEV_ZONED
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index c94373d64f2c..1d71b2a9af03 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -8496,14 +8496,15 @@ static int is_mddev_idle(struct mddev *mddev, int init)
> {
> struct md_rdev *rdev;
> int idle;
> - int curr_events;
> + long long curr_events;
>
> idle = 1;
> rcu_read_lock();
> rdev_for_each_rcu(rdev, mddev) {
> struct gendisk *disk = rdev->bdev->bd_disk;
> - curr_events = (int)part_stat_read_accum(disk->part0, sectors) -
> - atomic_read(&disk->sync_io);
> + curr_events =
> + (long long)part_stat_read_accum(disk->part0, sectors) -
> + atomic64_read(&disk->sync_io);
> /* sync IO will cause sync_io to increase before the disk_stats
> * as sync_io is counted when a request starts, and
> * disk_stats is counted when it completes.
> --
> 2.39.2
>
>