1.3.72 bad news

Ben Wing (wing@666.com)
Thu, 14 Mar 1996 03:59:27 -0800


Earlier I had reported success (modulo some breakage that was fixable)
upon upgrading from 1.2.13 to 1.3.72. Unfortunately, I now have
two problems to report that are not obviously fixable and are definitely
kernel-related (dropping down to 1.2.13 with the same libraries etc.
makes them go away; no recompilation needed):

1) There appears to be a race condition in the PTY or socket code.
This problem is rather serious, I think. The setup is this:
I run `M-x grep foo *.c' from XEmacs, where `*.c' expands to
all of the source files for XEmacs (quite a lot of them have
`foo' in them). This spawns a subprocess (which runs the obvious
grep command), and captures its output using a PTY. The primary
event loop in XEmacs does a select() over a number of file
descriptors (including the PTY), and when output from the PTY
is available, it's read in and inserted in an Emacs buffer.

Now the problem I'm observing is that sometimes the end of
the output is lost. It is always an exact number of lines,
I think -- I've seen varying amounts of output (sometimes quite
a lot, e.g. more than 2400 characters' worth) get lost, but
I've never seen partial lines. Furthermore, the problem is
definitely influenced by disk caching -- the first time I
do the grep, all the output gets seen, but each successive
time, more and more of the end of the output disappears.
Furthermore, if I have another process running that's doing
disk I/O, the problem often never appears -- further evidence
of cache relatedness.

As I said, dropping down to 1.2.13 makes the problem go away
without any recompilation of XEmacs (and it's statically
linked if that makes any difference).

Finally, I can duplicate the same problem with GNU Emacs 19.30
so it's not a bug in XEmacs. (It could conceivably be a
general Emacs problem, but this process code hasn't been touched
in a long time and works fine on other architectures.)
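
For anyone who wants to try this outside Emacs, the select()-and-read
loop described above can be approximated as below. This is only a
minimal sketch in Python, not the actual XEmacs or GNU Emacs process
code. The names capture_via_pty, demo_command and count_lines are made
up for illustration, and the child is a stand-in that prints numbered
lines rather than a real grep over the XEmacs sources.

```python
import os
import pty
import select
import sys


def demo_command(n):
    """Hypothetical stand-in for grep: a child that prints n numbered lines."""
    script = "for i in range(%d): print('line', i)" % n
    return [sys.executable, "-c", script]


def capture_via_pty(argv):
    """Run argv with a PTY as its terminal and return all of its output."""
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: stdin/stdout/stderr are already attached to the PTY slave.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed

    chunks = []
    try:
        while True:
            # Wait until the PTY master is readable, as an event loop would.
            readable, _, _ = select.select([master_fd], [], [])
            if master_fd not in readable:
                continue
            try:
                data = os.read(master_fd, 4096)
            except OSError:
                # On Linux, reading the master after the slave side has
                # been closed fails with EIO; treat that as end of output.
                break
            if not data:
                break
            chunks.append(data)
    finally:
        os.close(master_fd)
        os.waitpid(pid, 0)
    return b"".join(chunks)


def count_lines(output):
    """Count newline-terminated lines in the captured output."""
    return output.count(b"\n")
```

If nothing is lost, count_lines(capture_via_pty(demo_command(500)))
should come back as 500. A smaller count, with the missing lines all
at the end, would match the symptom described in (1).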

2) gdb 4.14 is unable to read core dumps generated under 1.3.72.
I reported this problem in an earlier message. Others have
reported the same problem, but no one has so far responded.
I imagine there's an easy answer to this, but I don't know
of one.

I may try 1.3.57 and see if the problems are still there.

ben