Re: [git patches] libata fixes

From: David Daney
Date: Fri Jan 16 2009 - 18:15:33 EST


David Daney wrote:
Andrew Morton wrote:
On Fri, 16 Jan 2009 10:27:21 -0500 Jeff Garzik <jeff@xxxxxxxxxx> wrote:

[...]

+static irqreturn_t octeon_cf_interrupt(int irq, void *dev_instance)
+{
+ struct ata_host *host = dev_instance;
+ struct octeon_cf_port *cf_port;
+ int i;
+ unsigned int handled = 0;
+ unsigned long flags;
+
+ spin_lock_irqsave(&host->lock, flags);

Would spin_lock() suffice here?

I have to think about that one.


The answer is an empirically determined No.

After switching to a spin_lock() as you suggested, I get:

BUG: spinlock recursion on CPU#0, pata_octeon_cf/700
lock: a80000041e8bd218, .magic: dead4ead, .owner: pata_octeon_cf/700, .owner_cpu: 0
Call Trace:
[<ffffffffc000c3ec>] dump_stack+0x8/0x34
[<ffffffffc01b9a14>] _raw_spin_lock+0xdc/0x1b0
[<ffffffffc0211680>] ata_scsi_queuecmd+0x40/0x2d8
[<ffffffffc01f4358>] scsi_dispatch_cmd+0x108/0x280
[<ffffffffc01fa778>] scsi_request_fn+0x3a0/0x4a0
[<ffffffffc019a2ec>] blk_invoke_request_fn+0xd4/0x1c0
[<ffffffffc019a9f8>] blk_run_queue+0x28/0x48
[<ffffffffc01f9b74>] scsi_run_queue+0xf4/0x398
[<ffffffffc01faafc>] scsi_next_command+0x3c/0x58
[<ffffffffc01fb7e4>] scsi_io_completion+0x344/0x520
[<ffffffffc019f8a8>] blk_done_softirq+0x98/0xb8
[<ffffffffc004dc18>] __do_softirq+0xd8/0x1f8
[<ffffffffc004ddc0>] do_softirq+0x88/0xa0
[<ffffffffc004e084>] irq_exit+0xac/0xd0
[<ffffffffc0011090>] plat_irq_dispatch+0x100/0x200
[<ffffffffc0000980>] ret_from_irq+0x0/0x4
[<ffffffffc019fa84>] __blk_complete_request+0x114/0x140
[<ffffffffc020f1a8>] ata_scsi_qc_complete+0x1b8/0x400
[<ffffffffc0219dec>] ata_sff_hsm_move+0x10c/0x860
[<ffffffffc021cb20>] octeon_cf_dma_finished+0x188/0x228
[<ffffffffc021ccd8>] octeon_cf_delayed_finish+0x118/0x138
[<ffffffffc005cf94>] run_workqueue+0xcc/0x1a8
[<ffffffffc005d448>] worker_thread+0x60/0xd0
[<ffffffffc0062058>] kthread+0x58/0xa8
[<ffffffffc001a020>] kernel_thread_helper+0x10/0x18

Apparently scsi_io_completion() is called from a softirq, so calls to ata_sff_hsm_move() must be done with interrupts disabled so that a softirq on the same CPU doesn't deadlock with itself.

I am sure you all will correct me if my interpretation of the trace is incorrect.

David Daney
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/