On Tue, 10 Jun 2003 14:15:58 -0400 (EDT)
Zwane Mwaikambo <firstname.lastname@example.org> wrote:
> On Tue, 10 Jun 2003, Stephan von Krawczynski wrote:
> > The controller used is the second aic7xxx. The 31 interrupts on CPU0 have
> > occurred before the test. This setup fails during verify (data corruption).
> > I would say that the interrupt code of the aic in itself is therefore ok
> > with SMP. If it were a SMP race condition inside the interrupt routine this
> > test should have been ok (as only one CPU is used).
> Thanks for verifying this, at least I know the problem isn't with
> interrupt routing in your specific case.
I guess your comment is a bit ahead of my tests. I just completed the test with
rc7+aic20030603 SMP, apic and maxcpus=1. It fails.
This means that the data corruption occurs even though only one CPU is used
throughout the whole kernel.
I would therefore conclude that the corruption is only possible if the
standard code path itself is flaky in terms of data completeness per request.
Something like a broken synchronous action, or a read request coming back as
completed although it is in fact still running, or the like.
It may also be a misinterpretation of some kind of "action completed" interrupt,
or something like a single interrupt for multiple running actions with a mixup
of the various causes.
To make sure it is not a problem in the SMP code path through the driver, I have
to check a UP kernel with apic support enabled. I will do this tomorrow.
If this is ok then things are simple, because it's nailed down to the SMP code
path without a concurrency cause.
Let's see ...