Why am I crashing my system?

Brain in Neutral bin at rhesus.primate.wisc.edu
Wed Nov 16 07:22:58 AEST 1988


System: either VAXstation w/Ultrix 2.2 or VAX 8200 w/Ultrix 1.2

I have an rsh-like program: it opens socket connections to a remote port,
tells the remote machine to execute a program, and forks.  The child reads
local input and sends it to the remote command.  The parent reads the
remote stdout and stderr.  Signals to the parent get sent to the remote
command on a socket as well.
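
A rough sketch of that structure, for readers who want something concrete
to look at.  This is not the actual program: the names relay, pump and
write_all are made up for illustration, the two sockets are assumed to be
already connected, the signal-forwarding path is left out, and it uses
present-day POSIX headers rather than what Ultrix shipped.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write all n bytes of buf to fd, retrying after partial writes. */
static int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        buf += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Move one chunk from 'from' to 'to'.  Returns bytes moved (0 if the read
 * was merely interrupted), or -1 on EOF or a hard error. */
static long pump(int from, int to)
{
    char buf[4096];
    ssize_t n = read(from, buf, sizeof buf);
    if (n < 0 && errno == EINTR) return 0;
    if (n <= 0) return -1;
    if (write_all(to, buf, (size_t)n) < 0) return -1;
    return (long)n;
}

/* Hypothetical entry point: cmd_sock carries remote stdin/stdout,
 * err_sock carries remote stderr.  Returns total bytes received from
 * the remote side, or -1 on failure. */
long relay(int local_in, int cmd_sock, int err_sock,
           int local_out, int local_err)
{
    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        /* Child: feed local input to the remote command, then mark EOF. */
        while (pump(local_in, cmd_sock) >= 0)
            ;
        shutdown(cmd_sock, SHUT_WR);
        _exit(0);
    }

    /* Parent: sit in select() until both remote streams reach EOF. */
    int out_open = 1, err_open = 1;
    int maxfd = (cmd_sock > err_sock ? cmd_sock : err_sock) + 1;
    long total = 0, n;

    while (out_open || err_open) {
        fd_set rd;
        FD_ZERO(&rd);
        if (out_open) FD_SET(cmd_sock, &rd);
        if (err_open) FD_SET(err_sock, &rd);

        if (select(maxfd, &rd, NULL, NULL, NULL) < 0) {
            if (errno == EINTR) continue;
            total = -1;
            break;
        }
        if (out_open && FD_ISSET(cmd_sock, &rd)) {
            if ((n = pump(cmd_sock, local_out)) < 0) out_open = 0;
            else total += n;
        }
        if (err_open && FD_ISSET(err_sock, &rd)) {
            if ((n = pump(err_sock, local_err)) < 0) err_open = 0;
            else total += n;
        }
    }

    waitpid(pid, NULL, 0);  /* reap the child so it does not stay a zombie */
    return total;
}
```

In a real client the call would look something like
relay(0, cmd_sock, err_sock, 1, 2), with the parent blocked in select()
for as long as either remote stream is still open.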

I get the following weird behavior on occasion, apparently only IF the
remote machine is the same as the local one, and if the remote command
does a lot of fast writing back to local:  some of the data gets stuck
in the network (Send-Q for the "remote" end has a non-zero count in
netstat output), and the local parent never sees it.  I know it's waiting
for it, because gcore gives a dump that shows it's in a select call.
What I don't understand
is that the child (which is done writing to remote command and is ready
to exit) is shown as a zombie by ps, AND the ps flags for the child
include SSEL (=400000), which indicates that *it* is selecting!  How can
this be?  gcore on the child fails (gcore says "Zombie").

1) Why does ps show the child and not the parent as selecting when the child
is exiting and the parent is selecting?
2) Why does the parent's select fail when there is actually something to
read (or why is the output stuck in the network?)

Now the ugly part of the above scenario.  The remote command has
finished, it's just that the local parent is hung up waiting to receive
the rest of the output.  Ok, fine, says I, I'll just ^C it.  That's
supposed to send a signal into the socket.  Of course, that socket goes
nowhere.  The system dies with a segmentation violation.  This does not
seem friendly to me.  Why does it occur?  Alternatively, how do I tell
that I'd better not write into that socket?

Paul DuBois
dubois at primate.wisc.edu


