Re: in 2.6.27-rc3-git7 in do_cciss_intr

From: Randy Dunlap
Date: Thu Sep 25 2008 - 16:40:31 EST


On Thu, 25 Sep 2008 13:33:07 -0700 Randy Dunlap wrote:

> Jens Axboe wrote:
> > On Thu, Sep 04 2008, Miller, Mike (OS Dev) wrote:
> >>>>>> 0x3bb2 <do_cciss_intr+1649>: mov 0x2(%r8),%dx
> >>>>>> 0x3bb7 <do_cciss_intr+1654>: test %dx,%dx
> >>>>>> 0x3bba <do_cciss_intr+1657>: je 0x3f0e <do_cciss_intr+2509>
> >>>>>>
> >>>>>>
> >>>>>> $ addr2line -e cciss.o -f do_cciss_intr+0x627 SA5_fifo_full
> >>>>>>
> >>>>>> /home/rdunlap/linsrc/linux-2.6.27-rc3-git7/drivers/block/cciss.h:206
> >>>>> OK ...that's confusing. It seems to be saying that ctrlr_info_t *
> >>>>> was NULL. However, I can't see a way of getting into the fifo_full
> >>>>> callback from do_cciss_intr, especially not with a NULL host.
> >>>>>
> >>>>> James
> >>>> That is weird. Even if we could get there fifo_full doesn't
> >>>> do anything but wait for a bit.
> >>>
> >>> Hi,
> >>>
> >>> This just happened again. This time it's on 2.6.27-rc5-git3.
> >>>
> >>> ~Randy
> >> Thanks Randy. I think. :)
> >>
> >> I'll try to recreate in my lab.
> >
> > This looks somewhat strange, mostly as if 'c' were NULL and it's oopsing
> > in removeQ (I don't think Randy's analysis is correct in assuming it's
> > 'h' and it's in fifo_full). Given that 'c' cannot be NULL, it's c->prev
> > or c->next that are NULL.
>
> Yes, correct IMO. I checked my daily test logs and I have had this problem
> in do_cciss_intr() 3 times, all at the same location, which appears to be
> in removeQ(), as Jens says.
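
To make the c->prev / c->next point concrete, here is a rough user-space
sketch of a circular doubly linked unlink. It is not copied from cciss.c:
the names remove_q and struct cmd and the field layout are made up for
illustration, and the real CommandList_struct is laid out differently.
The point is only that a store through a NULL neighbour pointer faults at
a small address equal to the offset of the field being written.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's command structure. */
struct cmd {
	char payload[64];	/* placeholder for the other fields */
	struct cmd *prev;
	struct cmd *next;
};

/* Unlink c from the circular queue headed by *q and return it. */
static struct cmd *remove_q(struct cmd **q, struct cmd *c)
{
	if (c && c->next != c) {
		if (*q == c)
			*q = c->next;
		/* If c->prev is NULL, this store lands at offsetof(next). */
		c->prev->next = c->next;
		/* If c->next is NULL, this store lands at offsetof(prev). */
		c->next->prev = c->prev;
	} else {
		/* c was the only element (or NULL): queue is now empty. */
		*q = NULL;
	}
	return c;
}

/* Address a store would hit if the given neighbour pointer were NULL. */
static size_t fault_addr_if_prev_null(void)
{
	return offsetof(struct cmd, next);
}

static size_t fault_addr_if_next_null(void)
{
	return offsetof(struct cmd, prev);
}
```

With a command that was never fully linked into the queue, either store
above would be a write to a low, unmapped address rather than to address 0
itself.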

Mike, also notice this: it's always during driver init, as indicated by
the (+) in the dump ('+' means that the module is in the process of being
loaded, but module load has not completed):

calling cciss_init+0x0/0x2e [cciss]
HP CISS Driver (v 3.6.20)
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 54
cciss 0000:42:08.0: PCI INT A -> Link[LNKA] -> GSI 54 (level, high) -> IRQ 54
cciss0: <0x3238> at PCI 0000:42:08.0 IRQ 503 using DAC
BUG: unable to handle kernel NULL pointer dereference at 0000000000000248
IP: [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss]
PGD 17e422067 PUD 17e423067 PMD 0
Oops: 0002 [1] SMP
CPU 2
Modules linked in: cciss(+) ehci_hcd ohci_hcd uhci_hcd
Pid: 0, comm: swapper Not tainted 2.6.27-rc3-git7 #1
RIP: 0010:[<ffffffffa001bb68>] [<ffffffffa001bb68>] do_cciss_intr+0x627/0xa6c [cciss]
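
For anyone reading the dump, the "Oops: 0002" value is the x86 page-fault
error code, and its low bits can be decoded as in the sketch below. The
struct and function names are made up for illustration and are not kernel
interfaces; only the bit meanings come from the architecture.

```c
#include <stdbool.h>

/* Hypothetical helper type, not a kernel structure. */
struct pf_error {
	bool protection;	/* bit 0: 1 = protection violation, 0 = page not present */
	bool write;		/* bit 1: 1 = write access, 0 = read access */
	bool user;		/* bit 2: 1 = fault in user mode, 0 = kernel mode */
	bool reserved_bit;	/* bit 3: reserved bit set in a paging entry */
	bool insn_fetch;	/* bit 4: fault was an instruction fetch */
};

/* Split a page-fault error code into its individual flag bits. */
static struct pf_error decode_pf_error(unsigned long code)
{
	struct pf_error e;

	e.protection   = (code & 0x01) != 0;
	e.write        = (code & 0x02) != 0;
	e.user         = (code & 0x04) != 0;
	e.reserved_bit = (code & 0x08) != 0;
	e.insn_fetch   = (code & 0x10) != 0;
	return e;
}
```

For this dump, decode_pf_error(0x0002) sets only the write flag, i.e. a
kernel-mode write to a not-present page, which fits a store through a
NULL-based pointer at offset 0x248.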


---
~Randy