You won't get much fun out of that code on anything less than
a Pentium-class machine, so the FPU emulator argument is moot.
I wouldn't call a 386 appropriate for serious signal processing...
I have at least _tried_ to save the FPU state and bring it into
a sane state before I used it. I don't rule out that there were
subtle problems, but I've looked at how the FPU memcpy patch did
it. So if my code was wrong, the FPU memcpy patch is quite likely
wrong too.
The problem with using integer arithmetic instead of floating point
is that one does not want to lose significant dynamic range, _and_
it should not take twice as long for the same computation.
(We're talking on the order of 10000 cycles, so 200 cycles
for FPU save/init/restore do not matter much.) Part of the problem
is GCC's deficiencies when it comes to long long, at least on the i386.
Another one is the f*** low number of registers of said architecture.
Guess why every serious DSP has at least 40 bits of accumulator.
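
To make the accumulator point concrete, here is a minimal sketch (not
the code from the patch) of a fixed-point dot product. The name dot_q15
and the Q15 format are assumptions chosen for illustration. Each
16x16 product needs 32 bits, so the sum is kept in a 64-bit long long
accumulator, which is exactly the type the i386 handles poorly:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: Q15 dot product (value = raw / 32768).
 * Each product is Q30 and fits in 32 bits; summing into a 64-bit
 * accumulator leaves headroom so long sums cannot overflow. */
static int16_t dot_q15(const int16_t *x, const int16_t *h, size_t n)
{
	int64_t acc = 0;
	size_t i;

	for (i = 0; i < n; i++)
		acc += (int32_t)x[i] * (int32_t)h[i];

	/* Round and scale Q30 back to Q15. This assumes an arithmetic
	 * right shift for negative values, as GCC provides. */
	acc = (acc + (1 << 14)) >> 15;

	/* Saturate instead of wrapping around. */
	if (acc > INT16_MAX)
		acc = INT16_MAX;
	if (acc < INT16_MIN)
		acc = INT16_MIN;
	return (int16_t)acc;
}
```

With a 32-bit accumulator the same loop could overflow after only a
couple of full-scale products, which is where the dynamic range goes.
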
Anyway I've tried again. The 2nd try has some ugly asm stuff
in there, but at least floating point is gone (hope that makes you
happy :-)). BTW, that code is already a few weeks old. Believe me,
I don't feel comfortable with floating point in the kernel either.
The new version should be only slightly slower on Intel CPUs,
and is faster on e.g. AMD. The asm stuff has a generic C counterpart,
so there is at least a chance that it works on other archs, but
since I have none of those here I cannot test that myself.
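
The asm-plus-generic-C pattern looks roughly like the sketch below.
This is an illustration only, not the code from the patch: the name
mul_q16 and the 16.16 fixed-point format are assumptions. On i386 a
single imull leaves the full 64-bit product in edx:eax, which avoids
the long long code GCC would otherwise generate. Other architectures
fall back to plain C:

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__i386__)
/* i386 path: imull puts the 64-bit product in edx:eax, then shrdl
 * shifts the register pair right by 16 bits into eax. */
static inline int32_t mul_q16(int32_t a, int32_t b)
{
	int32_t lo, hi;

	__asm__("imull %3\n\t"
		"shrdl $16, %1, %0"
		: "=a" (lo), "=d" (hi)
		: "0" (a), "rm" (b)
		: "cc");
	return lo;
}
#else
/* Generic path: let the compiler do the 64-bit multiply. Assumes an
 * arithmetic right shift for negative values, as GCC provides. */
static inline int32_t mul_q16(int32_t a, int32_t b)
{
	return (int32_t)(((int64_t)a * b) >> 16);
}
#endif
```

Both variants return the low 32 bits of the shifted product, so callers
do not need to know which one was compiled in.
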
Tom