> From: Phil Budne
> BUT, the basic TCP and IP protocols seem to have been created with a
> general care that two byte fields should be aligned at multiples of two
> bytes
Yes, because dealing with a 16-bit field that spans two PDP-11 16-bit words
is a pain (especially because the PDP-11 does not have a 'load byte into
register _without_ extending the sign bit into the high half' instruction).
Do realize that in addition to the early TCP implementation, the _first_ TCP
router (at that stage, TCP and IP were not separate protocols) was also a
PDP-11 (albeit programmed in BCPL, not MACRO-11).
I remember the extension being a real PITA. To load an un-aligned 16-bit
quantity into R0, one would have had to do something like (assuming a pointer
to the un-aligned 16-bit quantity was in R1):
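        MOVB    (R1)+, R0       ; first byte -> low half of R0, sign-extended
        SWAB    R0              ; swap halves: first byte now in the high half
        BIC     #0377, R0       ; clear the sign-extension bits left in the low half
        BISB    (R1)+, R0       ; OR the second byte into the low half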
There may have been a better way to do it, but that's the best I can come up
with now; I recall we had to do something like that.
Yes, the 16-bit fields were 16-bit word aligned.
Noel
The code in the repo is for the FPGA; the processor that is strapped to the
FPGA runs the real code.
It's like the 'minimig' Amiga emulator platform: a real processor, and an FPGA
to do all the IO heavy lifting.
So it's not 100% FPGA, but you are executing code on a real processor, so it
isn't exactly full emulation either. And it doesn't cost a fortune,
assuming you can find one of these ancient microprocessors.
-----Original Message-----
From: emanuel stiebler
To: Jason Stevens; 'tuhs(a)tuhs.org'
Sent: 3/5/25 2:50 PM
Subject: Re: [TUHS] DCJ-11 processor with 20k FPGA
On 2025-03-01 07:11, Jason Stevens via TUHS wrote:
> I assume people have seen this?
>
> https://github.com/ryomuk/TangNanoDCJ11MEM/tree/main
>
>
> It's capable of running Unix v1 & some limited amount of v6 among other
> things. The FPGA in question, the Tang Nano 20k, is sub 30GBP delivered
> from AliExpress.
>
> Kind of neat to combine a real processor with a simple FPGA implementation
> of the hardware.
I just had a look at it, but he doesn't show the code, which runs on the
TangNano?
> From: Rob Pike
> The notion that the struct layout must correspond to the hardware
> device's layout is about as non-portable as thinking can get.
I'm confused; I thought device register layout is inherently about as
non-portable a thing as one could have, generally.
(Exceptions: 1) the device is basically a single chip, so interfaces on two
machines might be essentially identical, if they use the same chip; 2) someone
made a 68K card that plugged into a QBUS, so drivers on a PDP-11 and that 68K
could be identical.)
Or did you mean that one could somehow disassociate the struct layout and the
details of the device (assuming it has addressable registers, as became
common)? How (I'm missing it)?
Noel
> From: "G. Branden Robinson"
> C was a language for people who wanted to get that crap out of the way
> so they could think about binary representations.
Huh? Sometimes C gets in the way of doing that; see below.
> From: Dan Cross
> They did indicate that alignment makes sharing _binary_ data between
> VAX and PDP-11 harder, but that's true of the representation of other
> aspects of product types as well.
Alignment is just one aspect of low-level binary representation; there's also
length (in bits), which is critically important in several problem domains;
device registers have already been mentioned, but more below.
> From: Peter Yardley
> Problems I have with C as a systems language is there is no certainty
> about representation of a structure in memory and hence when it is
> written to disk.
That's yet another one.
The area I'm thinking of (and which I saw a lot of) is code to implement
network protocols (and I'm fairly astounded that nobody else has mentioned
this yet). One has to have _absolute_ control over how the bits are laid out
in the packet (which of course might wind up in any one of countless other
machine types) - which generally means how they are laid out in memory.
The whole concept of C declarations is not rich enough to really deal with
this properly. For each field in the header, one absolutely needs to be able
to separately specify the syntax (e.g. size in bits) and semantics (unsigned
integer, etc).
And if you want the code to be portable, so that one set of sources will
compile to working code on several different architectures, it gets worse.
Device register code, already mentioned, often only has to run on one type of
machine, but having protocol implementations run on a number of different
machine types is really common.
I came up with a way to do this, without _any_ #ifdefs (which I don't like,
for a reason I won't get into) in most source files. Byte order issues were
dealt with similarly (one can't deal with them just in types, really,
without making the type specification, and the language, somewhat
complicated).
I know later C's got better about richer variable semantics and syntax
selection than the circa 1985 ones I was working with, but I don't think it
was ever made completely simple and orthogonal (e.g.
'signed/unsigned/boolean/etc char/short/long/quad/word/etc') as it should
have been.
Noel
Given that anything that obeys the ABI and has assembler entries to the kernel
can request services, it seems to me it would be possible to stand up a
user-land without C being present. Have any UNIXen ever done this after the
advent of C?
- Matt G.
> Everything that can possibly be represented in a machine
> register with all bits clear shows up as an integral zero literal.
> '\0' == 0 == nullptr == struct { } == union { }
Well, some things.
0.0f and other floating-point zero constants are represented
by all-zero words (of various sizes) and are not integral constants.
NULL does not "show up as an integral zero literal".
0==NULL is true only because 0 can be converted to NULL.
Getting really lawyerly, one can cook up any number of
bizarre "things that can possibly be represented" by an
all-zero word, for example (char[4]){0,0,0,0}, and have
no representation as an integral constant.
Only 3 of the 5 examples fit the description of possibly being
represented by an all-zero word.
struct{} and union{} are GNU extensions with size zero. Even
if you accept them as C, they have no machine representation
and cannot be cast to int.
The null pointer makes the list only thanks to the weasel-word
"possibly". Although 0 can be cast to the null pointer, the
result of casting a null pointer to int depends on its unspecified
machine representation. Zero, of course, is a good choice
because it's easy to test for, and is easy to omit from virtual
address spaces.
Doug
Hello to anyone interested in this silliness. I am just recently trying
to reacquaint myself with this OS, which I had a decent passing knowledge of
at one time. Not any real OS-level or driver coding, but I was at least decently
acquainted. Now on with the preliminaries ...
Any good software items to update on this ol' thing that give me a better chance
of completing this task are greatly welcome.
I have followed this article, which is a copy from (imo) several places, though
all of them are using an AXP emulator.
<https://gist.github.com/jamesy0ung/eeac82997ebeae92873d1f2844a14ac3>
I am using (I'll admit) a REAL AlphaStation 200 (4/100) with 384MB main memory &
three disks, all U160s, 2x4G+1x72G. The OS is installed on the 72G (now) & has a
/home dir for users rather than the default location. See info & error during
make of gcc. Those numbers for the allocation & total have been exactly the
same across many iterations of attempts in that exact file.
# sizer -v
HP Tru64 UNIX V5.1B (Rev. 2650); Sun Feb 23 19:43:32 AKST 2025
It is at patch level 008,
and I had successfully compiled & installed all the prerequisites shown in the
article mentioned above.
It seems the OS doesn't know how to access the swap properly.
root@as200:/home/buildnfs/gcc-4.4.7# env PATH=/usr/local/bin:/sbin:/usr/sbin:/usr/bin:/usr/ccs/bin:/usr/bin/X11:/usr/dt/bin:~/bin:. make
... many lines snipped ...
/home/buildnfs/gcc-4.4.7/host-alpha-dec-osf5.1b/prev-gcc/xgcc
-B/home/buildnfs/gcc-4.4.7/host-alpha-dec-osf5.1b/prev-gcc/
-B/usr/local/alpha-dec-osf5.1b/bin/ -c -g -$
cc1: out of memory allocating 135816 bytes after a total of 796519376 bytes
make[3]: *** [fold-const.o] Error 1
make[3]: Leaving directory `/home/buildnfs/gcc-4.4.7/host-alpha-dec-osf5.1b/gcc'
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory `/home/buildnfs/gcc-4.4.7'
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory `/home/buildnfs/gcc-4.4.7'
make: *** [all] Error 2
# swapon -s
Swap partition /dev/disk/dsk2g:
    Allocated space:    249774 pages (1.91GB)
    In-use space:         1520 pages (  0%)
    Free space:         248254 pages ( 99%)
Swap partition /dev/disk/dsk1b:
    Allocated space:     49152 pages (384MB)
    In-use space:         1630 pages (  3%)
    Free space:          47522 pages ( 96%)
Swap partition /dev/disk/dsk0b:
    Allocated space:     49152 pages (384MB)
    In-use space:         1618 pages (  3%)
    Free space:          47534 pages ( 96%)
Total swap allocation:
    Allocated space:    348078 pages (2.66GB)
    In-use space:         4768 pages (  1%)
    Available space:    343310 pages ( 98%)
# hwmgr show scsi
        SCSI                DEVICE  DEVICE   DRIVER  NUM   DEVICE  FIRST
 HWID:  DEVICEID  HOSTNAME  TYPE    SUBTYPE  OWNER   PATH  FILE    VALID PATH
-------------------------------------------------------------------------
   42:  0         as200     cdrom   none     0       1     cdrom0  [0/4/0]
   43:  1         as200     disk    none     2       1     dsk0    [1/0/0]
   44:  2         as200     disk    none     2       1     dsk1    [1/1/0]
   45:  3         as200     disk    none     2       1     dsk2    [1/2/0]
# hwmgr -view dev
 HWID:  Device Name         Mfg      Model             Location
------------------------------------------------------------------------------
    3:  /dev/dmapi/dmapi
    4:  /dev/scp_scsi
    5:  /dev/kevm
   29:  /dev/disk/floppy0c           3.5in floppy      fdi0-unit-0
   42:  /dev/disk/cdrom0c   TOSHIBA  DVD-ROM SD-M1401  bus-0-targ-4-lun-0
   43:  /dev/disk/dsk0c     IBM      DDRS-34560D       bus-1-targ-0-lun-0
   44:  /dev/disk/dsk1c     COMPAQ   BD07286224        bus-1-targ-1-lun-0
   45:  /dev/disk/dsk2c     COMPAQ   ST34371W          bus-1-targ-2-lun-0
   46:  /dev/random
   47:  /dev/urandom
Tia , JimL
--
+---------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network & System Engineer | 3237 Holden Road | Give me Linux |
| jiml(a)system-techniques.com | Fairbanks, AK. 99709 | only on AXP |
+---------------------------------------------------------------------+
Yufeng,
> I've recently brought the "prestruct-c" compiler back to "life"
Great archeology! It seems you've unearthed a snapshot from the brief
period when Dennis was struggling to reconcile byte addressing with BCPL
pointers--the seminal innovation of C. In characteristic Unix fashion, he
was trying out his ideas as they developed.
I had forgotten that product types were under con-struct-ion at the same
time. That really was a big bang.
Doug
Hi again,
I've recently brought the "prestruct-c" compiler back to "life" (https://github.com/TheBrokenPipe/C-Compiler-Dec72) and thought it might be worth documenting here. One thing I have to say first - it's barely working and probably never worked to begin with.
There were some efforts in the distant past to revive this compiler; however, the compiled compiler never worked. The reasons are as follows:
- The compiled executable is too big (exceeds 32K, making pointers effectively negative). This triggers a bug in the liba I/O routines.
- The compiler assumes an origin of 0 and writes temp data at the NULL pointer. Without an MMU, this kills the interrupt vectors and possibly the kernel on the 11/20.
- The compiler is missing ALL code/tables written in assembly language. This is pretty fatal, and internal changes rendered files from the last1120c compiler incompatible.
- Calling convention changes make the s2/last1120c libc library incompatible.
I'm a big fan of the C programming language, and the reason I was so insistent on getting this compiler to work is that it has a funny struct syntax not seen in any other C compiler. Structs are defined like:
struct name (
    type field;
    ...
);
... with round brackets (parentheses) instead of curly braces.
Another notable thing introduced in this compiler is that certain things are no longer lvalues. In the past (B and last1120c), functions, labels, and arrays were lvalues, meaning they could be assigned. For instance, this code:
func1() { return (1); }
func2() { return (2); }
main() {
    printf("func1() = %d\n", func1());
    printf("func2() = %d\n", func2());
    printf("func1 = func2\n");
    func1 = func2;
    printf("func1() = %d\n", func1());
}
produces the output:
func1() = 1
func2() = 2
func1 = func2
func1() = 2
This code:
main() {
    second = first;
    goto second;
first:
    printf("first\n");
second:
    printf("second\n");
}
produces the output:
first
second
And this code:
main(argc, argv) {
    int arr1[10];
    int arr2[10];
    arr1[0] = 5;
    arr2[0] = 8;
    printf("arr1[0] = %d\n", arr1[0]);
    printf("arr2[0] = %d\n", arr2[0]);
    printf("arr1 = arr2\n");
    arr1 = arr2;
    printf("arr1[0] = %d\n", arr1[0]);
}
produces the output:
arr1[0] = 5
arr2[0] = 8
arr1 = arr2
arr1[0] = 8
Now, the rules of the game have changed with the prestruct-c compiler, and these are no longer lvalues. I don't know why this change was made, but if I had to guess, speed was probably the biggest driving factor, with security also playing a role. Anyhow, they're no longer lvalues, so there's now one less level of indirection involving functions and labels. This means the codegen tables from last1120c have to be modified to suit this compiler change.
However, even with it generating the correct code, there is still one fatal problem - the libc. The libc from s2/last1120c was designed for the older compiler and therefore has one extra layer of indirection for functions. Luckily, the source code of the libc is available on the last1120c tape, and it wasn't too much work to remove the indirection manually.
Okay, what else? Well, this compiler also seems to be the first to introduce the "modern" pointer syntax. Before this compiler, pointers were declared using the same syntax as arrays/vectors, like "char name[];". This compiler introduced the "modern" syntax of "char *name;". No big deal, right? Well, the compiler itself was written using the old syntax, meaning it cannot compile itself. I think this indicates that this compiler (or the new syntax) was so unstable that the production compiler still used the old syntax.
With everything carefully put back into place, I managed to get this to work:
struct foo (
    char x;
    int y;
    char *z;
);
main(argc, argv)
char **argv;
{
    struct foo bruh;
    bruh.x = 'C';
    bruh.y = 123;
    bruh.z = "test";
    printf("x = '%c', y = %d, z = \"%s\"\n", bruh.x, bruh.y, bruh.z);
}
However, if I rename the variable "bruh" to something like "bar", it throws the error "Unimplemented pointer conversion". I have no clue why. I've also never gotten struct pointers to work - it always complains about "Illegal structure ref" when I try to use "->". It also seems to accept "." on structure pointers (and does not actually dereference the pointer when accessing the members), so something is probably very wrong with referencing and dereferencing.
Anyway, there are plenty of other issues with the compiler; for example, code that uses pointers and switch statements may not compile correctly. I'm not sure if the issues are caused by my poor reconstruction of the assembly tables, or if the compiler itself never worked properly in the first place (or both). Either way, I've managed to get it to spit out a correct hello world, as well as the struct test above, so I think I've fulfilled my goal of seeing this compiler "work".
The code, build instructions and pre-built binaries are here:
https://github.com/TheBrokenPipe/C-Compiler-Dec72
Ideally, it runs under a PDP-11/45 environment with 0 as the origin and generates code for the PDP-11/45. However, I made it target the 11/20 since I couldn't get the 11/45 toolchain to work, and I haven't implemented 11/45 instructions in my simulator yet. If anyone wants to pick up the baton and get it working for the 11/45 or fix my bugs, be my guest!
Sincerely,
Yufeng