Re: 2.0.30 serial.c, ppp.c and pppd-2.2 questions

Rob Riggs (rriggs@tesser.com)
Wed, 23 Jul 1997 22:53:04 -0600 (MDT)


On 24-Jul-97 Theodore Y. Ts'o wrote:
> Date: Thu, 24 Jul 1997 00:06:25 +0200
> From: Andries.Brouwer@cwi.nl
>
> Possibly entirely unrelated (you are talking about missed timer interrupts,
> probably I am talking about missed serial line interrupts) but
> my SLIP connection is completely unusable when something disk-intensive
> is running. FTP gets into an exponential backoff and seems to hang
> completely, but recovers some time after the make/find/whatever has
> finished.
>
> [This happens both with IDE and SCSI activity. I never really
> investigated.]
>
>At least for the IDE case, this is the very well known problem of the
>IDE driver disabling interrupts to prevent data corruption in a few
>badly designed IDE controllers. If you don't have the bad IDE
>controllers (see the man page for more details), you will likely be able
>to use "hdparm -u 1" which will fix the problem without causing your
>disks to get massively corrupted.
>
>As far as the SCSI activity, my guess is that it is a similar problem,
>but the solution is very dependent on the SCSI manufacturer.

Still, it would be interesting to see if the changes I have
made to serial.c help any of the people who rely on
'hdparm -u 1' and 'irqtune'. I originally assumed that I
was missing serial line interrupts (or rather, responding
to them too late) and suffering FIFO overruns. While the FIFO
overruns did occur on occasion, the flip buffer overruns were
at the root of my problem.

The symptom for FIFO overflows and flip buffer overflows
on a PPP connection is the same: dropped frames. If kdebug
is set in pppd, they both cause the kernel to report the
same error, "ppp: frame with bad fcs". Unless you modify
the kernel to report the two conditions separately, you cannot
tell which problem is the culprit.

I know that Ted finds it hard to believe that it could take
longer than 40ms to do all of the bottom half processing, but
I find it even harder to believe that I am the only one
experiencing the problem.
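
For a rough sense of where a figure like 40ms comes from, here is a
small sketch that computes how long a receive buffer takes to fill at
a given line rate. The 512-byte buffer and the 115200 bps rate used
below are assumptions for illustration, not numbers taken from this
thread, and fill_time_us() is a made-up helper, not anything in
serial.c.

```c
/*
 * Microseconds needed to receive `nbytes` at `bps` bits per second.
 * Assumes 8N1 framing: 1 start bit + 8 data bits + 1 stop bit,
 * i.e. 10 bits on the wire per byte. Returns 0 if bps is 0.
 * Hypothetical helper for illustration only.
 */
static unsigned long long fill_time_us(unsigned long long nbytes,
                                       unsigned long long bps)
{
    if (bps == 0)
        return 0;
    return (nbytes * 10ULL * 1000000ULL) / bps;
}
```

With those assumed values, fill_time_us(512, 115200) comes to 44444,
or roughly 44ms. If the bottom half does not drain the buffer within
that window, incoming characters have nowhere to go.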

Anyone wishing to try the code changes can find patches at:

http://www.DevilsThumb.COM/~rob

The patch is against 2.0.31-2, but should apply to any
recent 2.0 source tree. Most of the code has had about a
week's worth of testing on 2 different machines, so the
fundamentals seem to be OK. I did clean up the code a
bit to make it presentable, so I may have broken a thing
or two in the process.

The vast majority of the patch is for support of the newer
StarTech and TI UARTs, but the flip buffer flow control will
also work with standard 16550A UARTs. The patch does require a
modified 'setserial' to take advantage of the new features. The
setserial patch is available at the above URL as well.
Anyone still experiencing FIFO overruns may try tuning
the FIFO trigger levels, which the new code also allows.
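
To illustrate the trade-off in choosing a trigger level, the sketch
below estimates how long the interrupt handler has to respond before
the FIFO overruns. It is only an approximation: it assumes 8N1
framing, ignores the receiver shift register, and headroom_us() is a
hypothetical helper, not part of the patch. A standard 16550A has a
16-byte receive FIFO with trigger levels of 1, 4, 8 and 14 bytes.

```c
/*
 * Microseconds between the receive interrupt firing at `trigger`
 * bytes and a FIFO of `fifo_size` bytes becoming full, at `bps`
 * bits per second. Assumes 10 bits on the wire per byte (8N1).
 * Returns 0 if the trigger level leaves no room or bps is 0.
 * Hypothetical helper for illustration only.
 */
static unsigned long long headroom_us(unsigned fifo_size,
                                      unsigned trigger,
                                      unsigned long long bps)
{
    if (trigger >= fifo_size || bps == 0)
        return 0;
    return ((unsigned long long)(fifo_size - trigger)
            * 10ULL * 1000000ULL) / bps;
}
```

At 115200 bps on a 16-byte FIFO, a trigger level of 14 leaves about
173 microseconds of headroom, while a trigger level of 8 leaves about
694. A lower trigger level means more interrupts but more time to
service each one.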

The setserial patch is against setserial-2.12.

Enjoy

-Rob