Moving from 4.0.3c to 4.1.1 without doing an install
Charles Hedrick
hedrick at athos.rutgers.edu
Thu Mar 28 07:35:00 AEST 1991
A few people have asked whether it's really necessary to do a full install
to bring up 4.1.1. The answer is no, sort of. I thought I'd outline the
procedure we're following at Rutgers to move from 4.0.3c to 4.1.1.
Rutgers' situation is a bit unusual, because we use an automated software
distribution system to keep software up to date on several hundred Suns.
If at all possible, we try to avoid doing installs, because this requires
a staff member to take each system down and hack on it, whereas if we can
get our software distribution system to install things, it happens at 4am
without interfering with anybody. So we've come up with a plan to move
incrementally to 4.1.1 without doing an install.
First, we found that you can run a 4.1.1 kernel on a system that has
4.0.3c software, with very few exceptions. Once you adjust a few pieces
of software, you will have a set of software that works with either
a 4.0.3c or 4.1.1 kernel, simply by changing kernels (and /usr/kvm, if you
care about ps, etc.). Here's the minimum set of things we found we had to
adjust:
/usr/bin:
  sh - also /sbin/sh. The 4.0.3c version of sh will not run
      scripts under 4.1.1, which means that /etc/rc doesn't
      run, etc. The 4.1.1 version of sh works fine under 4.0.3c,
      so we just moved to that on all systems.
  mt - "mt status" uses an ioctl that was changed incompatibly in
      4.1.1. Most sites could probably live with a non-functional
      mt status while they are doing the transition. It happens
      that our backup scripts need it. We produced a version of
      mt that tries the 4.1.1 method and falls back to the 4.0.3c
      method if that fails. This is really a kernel bug. The
      4.1.1 ioctl simply has a longer argument block. There's
      no reason it couldn't accept the 4.0.3c size block as well
      and just not fill in the extra information.
/usr/lib:
  libc.so.* - In order to run software built under a 4.1.1 system,
      and from the 4.1.1 distribution, we installed the 4.1.1
      version of libc, including libc.so. They work fine under
      4.0.3c. Note that the distributed version of libc does
      not have encryption by default. If you use "des", etc.,
      make sure you get the additional encryption option.
  ld.so ldconfig - these have moved from /usr/kvm to /usr/lib in
      4.1.1. The 4.0.3c versions do not work under 4.1.1 on
      all architectures. (I believe the problem was with
      sun3 only.) The 4.1.1 versions work fine under 4.0.3c,
      so we just moved to them everywhere.
/usr/etc:
  in.telnetd in.rlogind - must be upgraded to the 4.1.1 versions.
      This is because of a slight change in the tty code,
      which requires a setpgrp(0,0) in places where you could
      get away with not having it before. The 4.1.1 versions
      work fine under 4.0.3c (though according to CERT you
      should make sure to get a new version of in.telnetd that
      fixes a security problem). [We have tried the 4.1.1 version
      of in.telnetd and it does work. However, the telnetd we are
      actually using is from Berkeley. We had to make a couple
      of patches to get it to work on both 4.0.3c and 4.1.1.
      As Berkeley distributes it, you must decide at compile
      time which release you are going to run it on. We want
      the same image to work on both versions.]
  ping - timeouts don't work if you use the 4.0.3c version
      under 4.1.1. The 4.0.3c version depends upon
      software interrupts interrupting a system call in
      circumstances where it doesn't happen under 4.1.1.
      The 4.1.1 version uses a new facility to explicitly
      request that behavior. It works fine under both 4.0.3c
      and 4.1.1.
/etc:
  fstab - if you have your default swap partition listed, remove
      it or comment it out. /etc/rc does swapon -a, which
      attempts to add all swap partitions listed in
      /etc/fstab. In theory it's OK to list the default
      swap partition. swapon -a should say "partition already
      in use" and ignore it. A lot of sites put it in fstab
      simply as documentation. Under 4.1.1 something obscure
      happens that typically doesn't show up until you try
      to back the system up or do something else that uses
      a lot of memory. At that point the system may crash
      with a fairly obscure panic. This bug was documented
      by Columbia for 4.1. It still appears to happen under
      4.1.1. At least we were seeing daily crashes, which
      went away when we commented out the /etc/fstab entry
      for our default swap partition. It's fine to remove
      the entry on 4.0.3c systems as well. It was never
      needed.
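
As a rough illustration of the fstab change, here is a minimal Python
sketch that comments out the swap entry for one device. The function
name, the device name /dev/sd0b, and the sample lines are assumptions
made for the example; check the layout of your own /etc/fstab before
relying on anything like this.

```python
def comment_out_swap(fstab_text, device):
    """Return fstab_text with swap entries for `device` commented out.

    Assumes whitespace-separated fields with the device first and the
    file system type third; `device` is a hypothetical example name.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_swap_entry = (
            len(fields) >= 3
            and not line.lstrip().startswith("#")
            and fields[0] == device
            and fields[2] == "swap"
        )
        out.append("#" + line if is_swap_entry else line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Illustrative sample only, not a real system's fstab.
    sample = (
        "/dev/sd0a / 4.2 rw 1 1\n"
        "/dev/sd0b swap swap rw 0 0\n"
    )
    print(comment_out_swap(sample, "/dev/sd0b"), end="")
```

Lines that are already commented out are left alone, so running it
twice gives the same result as running it once.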
Of course, in addition to this, you'll need to change /usr/kvm and the
programs that depend upon it, such as ps. However, we can live without ps
for a few days. Thus we make just the changes described above to all
systems. This gets us into a position where we can bring up 4.1.1 just by
changing kernels (and doing MAKEDEV or mknod if the system has any non-Sun
devices whose major numbers have changed).
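
The mt change described earlier follows a simple pattern: try the 4.1.1
request first, and fall back to the 4.0.3c one if the kernel rejects it.
Here is a minimal Python sketch of that pattern. It is not the actual mt
source (which is C); the two callables are hypothetical stand-ins for
the real status ioctls.

```python
def tape_status(try_new, try_old):
    """Return (method, status) from the first status call that works.

    try_new and try_old are hypothetical callables standing in for the
    4.1.1 and 4.0.3c status ioctls; each returns a status value or
    raises OSError if the running kernel does not accept the request.
    """
    try:
        return ("4.1.1", try_new())
    except OSError:
        # The kernel rejected the newer request; use the older one.
        return ("4.0.3c", try_old())


if __name__ == "__main__":
    def new_fails():
        raise OSError("request not supported")

    print(tape_status(new_fails, lambda: "online"))
```

Trying the newer method first means the same image behaves sensibly
under either kernel, which is the property we want from every program
in the list above.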
Once we are happy with the way 4.1.1 is running on a system, we change
/usr/kvm and related programs. Note that the set of programs in /usr/kvm
has changed between 4.0.3c and 4.1.1. For the moment we've merged them:
anything that is in /usr/kvm in either version is in /usr/kvm for us,
so the symlinks are the same for 4.0.3c and 4.1.1. Then we just have to
exchange /usr/kvm to go between them.
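
The merge amounts to taking the union of the two /usr/kvm program
lists. Here is a minimal sketch, assuming the lists are available as
plain names. Apart from ps, the program names below are placeholders
rather than the real contents of either release.

```python
def merged_kvm_programs(in_403c, in_411):
    """Return the sorted union of program names found in either /usr/kvm."""
    return sorted(set(in_403c) | set(in_411))


if __name__ == "__main__":
    old = ["ps", "old_only_tool"]   # placeholder names
    new = ["ps", "new_only_tool"]   # placeholder names
    for name in merged_kvm_programs(old, new):
        print(name)
```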
4.1.1 has changed the way terminal I/O is done in init. The 4.0.3c init
will still work with 4.1.1, as long as you are using your old /etc/rc*.
(There's some reason to think you may not be able to type ^C while
the system is booting to abort individual commands, though.) Eventually
you'll want to replace your /etc/rc, /etc/rc.local, etc., with the new
ones. They've reorganized them in a fairly nice way. When you change
/etc/rc* (or if you have an SS2, where they are preinstalled), you'll need
to move to the 4.1.1 version of init.
The 4.0.3c versions of fsck and other file system utilities appear to work
fine under 4.1.1, as long as you keep your old file system. Eventually
you'll want to dump your files to tape, do a newfs, and bring them back.
When you do a newfs under 4.1.1 you get file systems in a new format which
will have better performance. By the time you do this, you'll need the
4.1.1 version of fsck, newfs, mkfs, etc. But you can put this off until
you're ready to commit to 4.1.1 permanently. 4.1.1 can handle old file
systems fine, and as long as you are using an old file system, the old
fsck will work.
One comment about the new fsck: it has a handy option, -c, for
converting between 4.0.3c and 4.1.1 file system formats. (However, as the
installation manual explains, it is not always possible to go back from
the new to the old format, depending upon the file system parameters.) We
found one unexpected thing about fsck -c. I'm accustomed to having fsck
scan the disk, but not actually do anything until the end (or when it
finds an error). fsck -c changes the superblock immediately, but changes
the free list at the end. So if you ^C in the middle of the operation,
you get a file system that is very confused. It will work, but you tend
to get crashes. Running fsck -c again will unconfuse it, fortunately. It
is safe to run fsck -c just to see whether the disk is in new or old
format. It starts by asking you whether you want to convert it, and the
way it asks the question tells you what the current format is. (If it
asks whether you want to convert to the new format, you know it's
currently in the old format.) As long as you ^C when it asks that
question, it hasn't made any changes. But once you tell it to go ahead,
it changes the superblock.