While working on the latest episode of my saga about moving files
between v6 and v7, I noticed that the sum utility from v6 reports a
different checksum than the sum utility from v7 does for the same
file. To confirm, I did the following on both systems:
# echo "Hello, World" > hi.txt
# cat hi.txt
Then on v6:
# sum hi.txt
But on v7:
# sum hi.txt
There is no man page for the utility on v6, and it's written in
assembler. On v7, there's a man page and it's written in C:
Sum calculates and prints a 16-bit checksum for the named
file, and also prints the number of blocks in the file.
A few questions:
1. I'll eventually be able to read assembly and learn what the v6
utility is doing the hard way, but does anyone know what's going on here?
2. Why is sum reporting different checksums between v6 and v7?
3. Do you know of an alternative to check that the bytes were
transferred exactly? I used od and then compared the text representation
of the bytes on the host using diff (other than differences in output
between v6 and v7 related to duplicate lines, it worked ok but is clunky).
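One way to get a checksum that behaves the same on both sides is to run one small routine wherever the bytes end up, for instance on the host after each transfer. The sketch below uses a rotate-right-and-add scheme, which as far as I recall is what the C version of sum in v7 is built on; treat that attribution as an assumption rather than a reading of the source. The names rot_sum16 and blocks512 are made up for illustration, and the 512-byte block size is likewise assumed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * 16-bit rotating checksum (assumed scheme, see the note above):
 * rotate the running sum right by one bit, add the next byte,
 * and keep only the low 16 bits.
 */
uint16_t rot_sum16(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        sum = (sum >> 1) | ((sum & 1u) << 15);  /* rotate right by 1 */
        sum = (sum + buf[i]) & 0xFFFFu;         /* add byte, wrap to 16 bits */
    }
    return (uint16_t)sum;
}

/* Number of 512-byte blocks needed to hold nbytes (block size assumed). */
unsigned long blocks512(unsigned long nbytes)
{
    return (nbytes + 511) / 512;
}
```

Because the result depends on every byte and on byte order, two copies of a file that produce the same value and the same length are very likely identical, although a 16-bit sum cannot rule out every difference.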
In my exploration of v6, I followed the advice in "Setting up Unix -
Seventh Edition" and copied v6tar from v7 to v6. Life is good. However,
tar is using mt1 and it is hard coded into the source, tar.c:
char magtape[] = "/dev/mt1";
As the subject line suggested, I have two questions for those of you who
1. Why is it hard coded?
2. Why is it the second device and not the first?
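One plausible benefit of keeping the device name in a writable array is that the unit digit can be patched in place from a single key character. The sketch below shows that idea only; select_drive and tape_device are hypothetical names, and whether v6tar actually accepts such a modifier is an assumption to be checked against the source.

```c
#include <stddef.h>
#include <string.h>

/* Default device name, the same string quoted from tar.c. */
static char magtape[] = "/dev/mt1";

/*
 * Hypothetical helper: if key is a digit '0'..'7', overwrite the
 * trailing unit digit of the device name. Returns 1 if the name
 * was changed and 0 if the key was not a valid unit digit.
 */
int select_drive(char key)
{
    size_t last = strlen(magtape) - 1;

    if (key < '0' || key > '7')
        return 0;
    magtape[last] = key;
    return 1;
}

/* Current device name, after any patching. */
const char *tape_device(void)
{
    return magtape;
}
```

With something like this, calling select_drive('0') would point the program at /dev/mt0 without rebuilding it.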
Interestingly, it took me a little while to figure out it was doing this
because I didn't actually move files between v6 and v7 until today.
Before this my tests had been limited to separate tests on v6 and v7
along the lines of:
tar c .
list of files
files extracted and matching
What it was doing was writing to the nonexistent /dev/mt1, which it
then created, tarring up stuff, and exiting. Then when I listed the
contents of the tarfile, or extracted the contents, it was successful.
But, when I went to move the tape between v6 and v7, the tape (mt0) was
blank, of course. It was at this point that I followed Noel's advice
and "Used the source", and figured out that it was hard-coded as you see
That's exactly right. ld performs the same task as LOAD did on BESYS,
except it builds the result in the file system rather than user
space. Over time it became clear that "linker" would be a better
term, but that didn't warrant canning the old name. Gresham's law
then came into play and saddled us with the ponderous and
misleading term, "link editor".
> My understanding, which predates my contact with Unix, is that the
> original toolchains for single-job machines consisted of the assembler
> or compiler, the output of which was loaded directly into core with
> the loader. As things became more complicated (and slow), it made
> sense to store the memory image somewhere on drum, and then load that
> image directly when you wanted to run it. And that in some systems
> the name "loader" stuck, even though it no longer loaded. Something
> like the modern ISP use of the term "modem" to mean "router". But I
> don't have anything to back up this version; comments welcome.
> estabur (who thought these names up, I know 8 characters is limiting,
> but c'mon)
'establish user mode registers'
> the 411 header is read by a loader
Actually, it's read by the exec() system call (in sys1.c).
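For anyone following along, here is a rough sketch of the kind of decoding that has to happen on that header. It assumes the eight-word layout described in a.out(5) (magic, text size, data size, bss size, symbol table size, entry, an unused word, relocation flag) and PDP-11 byte order; the struct and function names are invented for illustration and are not taken from sys1.c.

```c
#include <stdint.h>

/* Eight 16-bit words; field names here are illustrative only. */
struct aout_hdr {
    uint16_t magic, text, data, bss, syms, entry, unused, flag;
};

/* Decode a 16-byte header stored low byte first (PDP-11 order). */
struct aout_hdr parse_hdr(const unsigned char raw[16])
{
    uint16_t w[8];
    struct aout_hdr h;
    int i;

    for (i = 0; i < 8; i++)
        w[i] = (uint16_t)(raw[2 * i] | (raw[2 * i + 1] << 8));

    h.magic = w[0]; h.text  = w[1]; h.data   = w[2]; h.bss  = w[3];
    h.syms  = w[4]; h.entry = w[5]; h.unused = w[6]; h.flag = w[7];
    return h;
}

/* Map the octal magic number to a short description. */
const char *magic_name(uint16_t magic)
{
    switch (magic) {
    case 0407: return "normal";
    case 0410: return "read-only text";
    case 0411: return "separate I and D";
    default:   return "unknown";
    }
}
```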
> From: Dave Horsfall
> I love those PDP-11 instructions, such as "blos" and "sob" :-)
Yes, but alas, there is no 'jump [on] no carry' instruction! (Yes, yes, I
know about BCC! :-) Although I guess the x86 has one...
> Yes the V6 kernel runs in split I and D mode, but it doesn't end up
> supporting any more data. I.e. the kernel is still a 407 (or 410) file.
> _etext/_edata/_end are still referencing the same 64K space.
Err, actually, not really.
The thing is that to build the split-I/D kernel, one sets the linker to
produce an output file which still contains the relocation bits. That is then
post-processed by 'sysfix', which does weird magic (moves the text up to
020000, in terms of address space; and puts the data _below_ the text, in the
actual output file). So while the files concerned may have a '407' in their
header, they definitely aren't what one normally finds in a linked 407 or 410
file.
In particular, data addresses start at 0, and can run up to 0140000 (i.e. up
to 56KB), while text addresses start at 020000 and can run up to 0160000. So,
_etext/_edata/_end are not, in fact, in the same 64K space. And the total of
data (initialized and un-initialized) together with the text can be much
larger than 64KB - up to 112KB (modor so.
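Since the addresses above are in octal, and the PDP-11 maps each 64KB address space as eight 8KB segments, a tiny helper makes it easier to see which segment a given boundary falls in. This is only a sketch; seg_of, seg_offset, and seg_base are names invented here, not anything from the kernel source.

```c
#include <stdint.h>

/* Each 64KB address space is split into eight 8KB segments. */
enum { SEG_SHIFT = 13, SEG_SIZE = 1 << SEG_SHIFT };

/* Segment number (0..7) that a 16-bit virtual address falls in. */
unsigned seg_of(uint16_t vaddr)
{
    return (unsigned)vaddr >> SEG_SHIFT;
}

/* Byte offset of the address within its segment. */
unsigned seg_offset(uint16_t vaddr)
{
    return (unsigned)vaddr & (SEG_SIZE - 1);
}

/* First virtual address covered by a given segment. */
unsigned long seg_base(unsigned seg)
{
    return (unsigned long)seg << SEG_SHIFT;
}
```

With it, 020000 comes out as the base of segment 1, 0140000 as the base of segment 6, and 0160000 as the base of segment 7.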
J.F. Ossanna (jfo) was born in 1928; he helped give us Unix, and developed
the ROFF series (which I still use).
And Ada Lovelace, the world's first computer programmer, was coded in 1815.
Dave Horsfall DTM (VK2KFU) "Those who don't understand security will suffer."
> From: Ronald Natalie
> I'm pretty sure the V6 kernel didn't run in split I/D.
Nope. From 'SETTING UP UNIX - Sixth Edition':
"Another difference is that in 11/45 and 11/70 systems the instruction and
data spaces are separated inside UNIX itself."
And if you don't believe that, check out:
the source! ;-)
> It wasn't too involved of a change to make a split I/D kernel.
> Mike Muuss and his crew at JHU did it.
Maybe you're remembering the process on a pre-V6 system?
> We spent more time getting the bootstrap to work than anything else I
It's possible you're remembering that, as distributed, V6 didn't support load
images where the text+initialized-data was larger than 24KW-delta; it would
have been pretty easy to up that to 28KW-delta (change a parameter in the
bootstrap source, and re-assemble), but after that, the V6 bootstrap would
have had to have been extensively re-worked.
And there were _also_ a variety of issues with handling maximally large images
in the startup code. Once operating, the kernel has segments KI1-KI7 available
to hold the system's code; however, it's not clear that all of KI1-7 are
really usable, since the system can't 'see' enough code while in the code
relocation phase in the startup to fill them all. E.g. during code relocation,
KI7 is ripped off to hold a pointer to I/O space (since KD7 is set to point to
low memory just after the memory that KD6 points to).
These might have been issues in systems which were ARPANET-connected (i.e.
ran NCP), as that added a very large amount of code to the kernel.