Memory problem with SUN 4/330

Jim Williams SYSGROUP williams at nssdcs.gsfc.nasa.gov
Thu Aug 9 05:07:06 AEST 1990


In article <1990Aug7.003345.1862 at rice.edu> huebener at fbihh.informatik.uni-hamburg.de (Kai Huebener) writes:
-X-Sun-Spots-Digest: Volume 9, Issue 295, message 4
-
-One of our SUN4/330s is behaving strangely since two weeks: Every now and
-then we will get a message like
-
-vmunix: Parity Error: Physical Address ....
-vmunix: Memory Error in SIMM U801 on First 3U Memory Card:
-vmunix: panic: synchronous parity error - kernel
-
-and so on.
-
-The system will then either reboot or fall into the watchdog reset.
-

Well, I didn't have exactly that problem, but mine was similar enough that
I thought I'd pass it along.  We just upgraded a Sun 4/110 to a Sun 4/330,
and at the same time went from SunOS 4.0.3 to 4.1.  Note that "upgrade" in
this case is a euphemism; the only parts of the old system that remain are
the disks and the monitor!  (Alas, I had to give up my beloved type 3
keyboard for one of those wretched type 4s...)

Other hardware details: the system has 8MB of memory, a shoebox with SCSI
disk and tape, and a third-party box with a Maxtor XT-8760E 550MByte SCSI
disk.  The disk in the shoebox, sd0, is a 320MByte Micropolis 1558
connected to an Emulex MD21 controller.  The video is a cg6 color frame
buffer.

Anyway, I was unable to boot the system after reconnecting the disks.  Sun
was unable to explain why this should be, given that the kernel
architectures of the two machines are identical.  So, since I had to
reinstall the OS anyway, this seemed like a good time to go to SunOS 4.1.
(I would very much like to know why the old disk would not boot.  Sun
could never give me an adequate explanation.)

After I installed SunOS 4.1, I tried running X windows.  This is the X11
R4 server, straight from MIT.  It crashed the machine in a spectacular
fashion.  I got the error message below, which is much like the one quoted
above.

Parity Error: Physical Address 0xff1bc00 (Virtual Address 0xf74a0000)
Error Register 21cd0 <check,interna,intr>
Memory error somewhere in SIMMs U1284 through U1287 on CPU card.
panic: Anychronous parity error. -DVMA operation
Syncing file systems.

(This is not an exact, verbatim copy, but it is very close.  All the
numbers given are verbatim.  "Asynchronous" really was spelled "Anychronous".)

Our Sun service guy replaced first the CPU, then the CG6, but the problem
remained.  Starting X would *always* crash the machine with the above
message.  When crashed in this way, not even L1-A would be recognized!  I
had to *power cycle* the machine.  That's by far the worst I've ever seen
a user-mode, non-setuid program do.  The problem was solved by recompiling
the X11 server and some of the clients.  We've had no problems since.

Anyone understand all this?


Spoken: Jim Williams             Domain: williams at nssdcs.gsfc.nasa.gov
Phone: +1-301-555-1212           UUCP:   uunet!mimsy!williams
USPS: NASA/GSFC, Code 633, Greenbelt, MD 20771
Motto: There is no 'd' in "kluge"!  It rhymes with "deluge", not "sludge".


