Re: Undo aic7xxx changes (now rc7+aic20030603)

From: Willy Tarreau (willy@w.ods.org)
Date: Tue Jun 24 2003 - 17:03:31 EST


On Tue, Jun 24, 2003 at 11:26:09PM +0200, Stephan von Krawczynski wrote:
 
> sorry, you probably misunderstood my flaky explanation. What I meant was not a
> cached block from the _tape_ (obviously indeed a char-type device) but from the
> 3ware disk (i.e. the other side of the verification). Consider the tape
> completely working, but the disk data corrupt (possibly not from real reading
> but from corrupted cache).

Ah, OK ! I didn't understand this. You're right, this is also a possibility.
Perhaps a tar cf - /mnt/3ware | chkblk would get evidence of somme corruption ?

<...snip... OK for these points ...>
 
> Hm, interestingly the former freeze bug (solved by marcelo through backout of
> some patch in rc8) did not show up in UP. Since then I did not test UP any
> more. The problem itself does not necessarily point to flaky hardware, as I
> would have no idea how bad cache can only show up during a tape verification,
> that does not sound all that reasonable.

OK, I agree. And right after posting, I remembered that if this was the case,
you should also see some MCEs which doesn't seem to be your case.

> More likely could be a SMP race anywhere from nfs-server, 3ware disk driver to
> page cache, or not?

fairly possible. That's also what Justin suggested in the past, BTW :-)

Cheers,
Willy

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Jun 30 2003 - 22:00:17 EST