Re: [patch] cciss: Fix race between disk-adding code and interrupt handler

From: Jens Axboe
Date: Mon Apr 14 2008 - 13:37:36 EST


On Mon, Apr 14 2008, scameron@xxxxxxxxxxxxxxxxxxxxxxx wrote:
>
>
> > On Mon, Apr 14 2008, scameron@xxxxxxxxxxxxxxxxxxxxxxx wrote:
> > >
> > >
> > > Fix race condition between cciss_init_one(), cciss_update_drive_info(),
> > > and cciss_check_queues(). cciss_softirq_done() would try to start
> > > queues which were not quite ready to be started, as its checks for
> > > readiness were not sufficiently synchronized with the queue initializing
> > > code in cciss_init_one() and cciss_update_drive_info(). A slow CPU and
> > > a large number of logical drives seem to make the race more likely
> > > to cause a problem.
> >
> > Hmm, this seems backwards to me. cciss_softirq_done() isn't going to
> > start the queues, until an irq has triggered for instance. Why isn't the
> > init properly ordered instead of band-aiding around this with a
> > 'queue_ready' variable?
> >
>
> Each call to add_disk() will trigger some interrupts,
> and earlier added disks may cause the queues of later,
> not-yet-completely added disks to be started.
>
> I suppose the init routine might be reorganized to initialize all
> the queues, then have a second loop call add_disk() for all
> of them. Is that what you had in mind by "properly ordered?"

Yep, precisely: don't call add_disk() until everything is set up.
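
To illustrate the ordering, here is a toy user-space sketch, not the
cciss code. Every name in it (sim_ctlr, sim_drive, sim_add_disk and so
on) is made up for illustration, and it assumes a simplified model in
which each add_disk()-like call triggers an interrupt whose handler
touches every queue on the controller. It only counts how often a
not-yet-initialized queue would be started under each ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_DRIVES 4

/* Hypothetical per-drive state; not the real cciss structures. */
struct sim_drive {
    bool queue_ready;   /* queue has been fully initialized */
    bool added;         /* the add_disk() stand-in has run */
};

struct sim_ctlr {
    struct sim_drive drv[NUM_DRIVES];
    int unsafe_starts;  /* times a not-ready queue was touched */
};

/* Stand-in for the interrupt completion path: it walks every queue on
 * the controller and counts the ones that are not ready yet. */
static void sim_irq_start_queues(struct sim_ctlr *h)
{
    for (size_t i = 0; i < NUM_DRIVES; i++)
        if (!h->drv[i].queue_ready)
            h->unsafe_starts++;
}

static void sim_init_queue(struct sim_ctlr *h, size_t i)
{
    assert(i < NUM_DRIVES);
    h->drv[i].queue_ready = true;
}

/* Stand-in for add_disk(): assumed to cause I/O, hence an interrupt. */
static void sim_add_disk(struct sim_ctlr *h, size_t i)
{
    assert(i < NUM_DRIVES);
    h->drv[i].added = true;
    sim_irq_start_queues(h);
}

/* Interleaved ordering: set up queue i, then immediately add disk i. */
static int sim_init_interleaved(struct sim_ctlr *h)
{
    for (size_t i = 0; i < NUM_DRIVES; i++) {
        sim_init_queue(h, i);
        sim_add_disk(h, i);
    }
    return h->unsafe_starts;
}

/* Two-pass ordering: set up every queue first, then add the disks. */
static int sim_init_two_pass(struct sim_ctlr *h)
{
    for (size_t i = 0; i < NUM_DRIVES; i++)
        sim_init_queue(h, i);
    for (size_t i = 0; i < NUM_DRIVES; i++)
        sim_add_disk(h, i);
    return h->unsafe_starts;
}
```

With a zero-initialized sim_ctlr, the interleaved version touches
not-ready queues while the two-pass version never does, and no
'queue_ready' check is needed in the interrupt path of this model.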

> Disks may be added at run time though, and I think this tears
> down all but the first disk, and re-adds them all, if I remember
> right, so there is some complication there to think about.

Well, other drivers manage quite fine without resorting to work-arounds
:-)

--
Jens Axboe
