RE: megaraid_sas waiting for command and then offline

From: Patro, Sumant
Date: Wed Sep 06 2006 - 13:11:25 EST


Hello Brett,

A DMA related bug was fixed in FW ver *.0095 that was causing
the FW to stop responding.

Please upgrade the FW version to >= *.0095 and let me know if
you still see the issue.

Regards,

Sumant


-----Original Message-----
From: Brett G. Durrett [mailto:brett@xxxxxxxx]
Sent: Wednesday, September 06, 2006 9:04 AM
To: Dave Lloyd
Cc: Patro, Sumant; Bagalkote, Sreenivas; lkml; Berkley Shands
Subject: Re: megaraid_sas waiting for command and then offline


The machines are Dell 2900s, so the mobo is custom. From a Dell SE,
"Dell uses a custom mobo that is Dell branded with the Intel chipset
Greencreek.".

B-




Dave Lloyd wrote:

> Brett G. Durrett wrote:
> >
> > I have the same or a similar issue running 2.6.17 SMP x86_64 - the
> > megaraid_sas driver hangs waiting for commands and then the
filesystem
> > unmounts, leaving the machine in an unusable state until there is a
> hard
> > reboot (the machine is responsive but any access, shell or
> otherwise, is
> > impossible without the filesystem). While I do not have much
debugging
> > information available, this happens to me about once every 6-7 days
in
> > my pool of seven machines, so I can probably get debugging info.
Since
> > the disk is offline and I can't get remote console, I don't have any
> > details except something similar to Dave Lloyd's post, below.
> >
> > The only thing that the machines with these failures seem to have in
> > common is the fact that they are almost exclusively writes - they
are
> > slave database machines with large memory and pretty much just
> > replicate. The read/write machines seem to have less failures.
> >
> > I am happy to help provide debugging information in any reasonable
way.
> > In the mean time, if there are any known suggestions or workarounds
for
> > the problem, I would be grateful for the guidance.
> >
> > Here are what details on the controller. If you want additional
info,
> > let me know exactly what you need and I will do what I can to get it
to
> > you.:
> >
> > Product Name : PERC 5/i Integrated
> > Serial No : 12345
> > FW Package Build: 5.0.1-0030
> > FW Version : 1.00.01-0088
> > BIOS Version : MT23
> > Ctrl-R Version :1.02-007
> >
> > B-
>
> Which motherboard are you using? We believe that this may be a
> motherboard specific issue. It appears to happen on a SuperMicro
> motherboard but not a Tyan motherboard.
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/