Re: hpsa driver bug crack kernel down!

From: James Bottomley
Date: Wed Apr 09 2014 - 19:08:38 EST


[+linux-scsi]
On Wed, 2014-04-09 at 15:49 -0700, Davidlohr Bueso wrote:
> On Wed, 2014-04-09 at 10:39 +0800, Baoquan He wrote:
> > Hi,
> >
> > The kernel is 3.14.0+ which is pulled just now.
>
> Cc'ing more people.
>
> While the hpsa driver appears to be involved in some way, I'm not sure if
> this is a related issue, but as of today's pull I'm getting another
> problem that causes my DL980 not to come up.
>
> *Massive* amounts of:
>
> DMAR:[fault reason 02] Present bit in context entry is clear
> dmar: DRHD: handling fault status reg 602
> dmar: DMAR:[DMA Read] Request device [02:00.0] fault addr 7f61e000
>
> Then:
>
> hpsa 0000:03:00.0: Controller lockup detected: 0xffff0000
> ...
> Workqueue: events hpsa_monitor_ctlr_worker [hpsa]
> ...
>
> Screenshot of the actual LOCKUP:
> http://stgolabs.net/hpsa-hard-lockup-3.14+.png
>
> While I haven't bisected, things worked fine until at least commit
> 39de65aa2c3e (April 2nd).
>
> Any ideas?

Well, it's either a DMA remapping issue or an hpsa one. Your assertion
that everything worked fine until 39de65aa2c3e would tend to vindicate
hpsa, because all the hpsa changes went in before that under

Merge: 3e75c6d b2bff6c
Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Tue Apr 1 18:49:04 2014 -0700

Merge tag 'scsi-misc' of
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Can you revalidate that this commit works OK, just to make sure?

James

