[TUHS] mental architecture models, Anyone ever heard of teaching a case study of Initial Unix?

John R Levine johnl at taugh.com
Sun Jul 7 00:02:22 AEST 2024


On Sat, 6 Jul 2024, sjenkin at canb.auug.org.au wrote:
> C wasn’t the first standardised coding language, FORTRAN & COBOL at least were before it,
> so there were multi-platform source libraries and shared source, though often single platform.
>
> From what I know, vendor extensions of FORTRAN, optimised for their hardware, were common,
> making high-performance, portable source difficult or impossible. 6-bit and 8-bit chars were the least of it.

Even without vendor extensions, writing portable Fortran code was hard. 
Different floating-point formats give you different results, and 
architectural differences can bite you.  One famous example is that the 
word-addressed 709x aligned everything on word boundaries, but S/360 had 
4-byte aligned floats and 8-byte aligned doubles, so this:

       REAL R(100)
       DOUBLE PRECISION D(10)
       EQUIVALENCE (R(2), D(1))

would work fine on a 7090 but crash on a 360: the EQUIVALENCE puts D(1) 
four bytes past the start of R, so the doubles end up off a doubleword 
boundary and the hardware faults.  That was painful enough that one of 
the first things they changed on S/370 was to allow misaligned data.
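
If it helps to see the same trap in modern terms, here is a rough C 
analogue (my sketch, not anything from a period compiler).  On a 
strict-alignment machine the double load faults, just as the 360 did; 
on x86 it merely works, slowly:

       #include <stdio.h>
       #include <stdalign.h>

       int main(void)
       {
           /* Start the array on an 8-byte boundary so the overlay
              below is guaranteed to misalign the double. */
           alignas(8) float r[100] = {0};

           /* Put a double 4 bytes into r -- the same layout that
              EQUIVALENCE (R(2), D(1)) asks for. */
           double *d = (double *)&r[1];

           /* Undefined behavior in standard C; strict-alignment
              hardware (as S/360 was) traps on this load. */
           printf("%f\n", *d);
           return 0;
       }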

I never wrote much COBOL, but it had structured data (the ancestor of C 
structs) and REDEFINES for overlaying structures, which could bite you 
when different machines had different sizes or alignment.  There were 
also a lot of different character sets, which led to bugs when code made 
implicit assumptions about collating sequences, e.g., do numbers come 
before letters, as in ASCII, or after, as in EBCDIC?
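
Purely as illustration (my sketch, not from the thread), the classic 
collating-sequence bug looks like this in C:

       #include <stdio.h>

       /* In ASCII the digits '0'..'9' (0x30-0x39) sort below 'A'
          (0x41); in EBCDIC 'A' is 0xC1 and '0' is 0xF0, so digits
          sort above the letters and the comparison flips.  (EBCDIC
          letters are not even contiguous, so the usual
          'A' <= c && c <= 'Z' test has its own problems there.) */
       int main(void)
       {
           if ('0' < 'A')
               printf("digits collate before letters (ASCII-style)\n");
           else
               printf("digits collate after letters (EBCDIC-style)\n");
           return 0;
       }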

The fact that everything now has 8-bit byte-addressed memory with 
power-of-two data sizes, and that everything is ASCII, makes all of 
these problems go away.

> Is this right:
>
> 	C was the first ’systems tool’ language + libraries available across many platforms.
> 	Notionally, source code could be ported with zero or minimal change.
> 	It made possible portable languages like PERL, PHP, Python.

I think so.  There were earlier systems languages, like the PL/I subset 
on Multics, PL/S at IBM, and PL/M on micros, but I don't think any of 
them had multiple targets.

> Secondly, portable systems tool languages with a common 2-part design
> of parser/front-end providing an abstract syntax tree
> to multiple back-ends with platform specific code-generators.
>
> Are these back-ends where most of the assembler, memory model and instruction optimisation take place now?

That's the standard way to build a compiler.  Back in the late 1950s 
someone had the bright idea to invent a common intermediate language, 
called UNCOL, so that every front end could produce UNCOL and every 
back end could translate from UNCOL, reducing the NxM compiler problem 
to N+M: with, say, 5 languages and 6 machines you'd write 11 components 
rather than 30 complete compilers.  It never worked, both because the 
semantic differences between source languages are larger than they look 
and because the machine architectures of the era were wildly different.
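
For concreteness, here is the N+M shape as a toy in C (every name here 
is invented for illustration, nothing to do with the real UNCOL 
proposal):

       #include <stdio.h>

       typedef const char *IR;                    /* stand-in for a real IR  */
       typedef IR   (*FrontEnd)(const char *src); /* one per source language */
       typedef void (*BackEnd)(IR ir);            /* one per target machine  */

       static IR   fortran_fe(const char *s) { (void)s; return "shared IR"; }
       static IR   cobol_fe(const char *s)   { (void)s; return "shared IR"; }
       static void s360_be(IR ir)  { printf("S/360 code from %s\n", ir); }
       static void pdp11_be(IR ir) { printf("PDP-11 code from %s\n", ir); }

       int main(void)
       {
           FrontEnd fes[] = { fortran_fe, cobol_fe };  /* N = 2 */
           BackEnd  bes[] = { s360_be, pdp11_be };     /* M = 2 */

           /* N + M components yield N x M compilers: any front end
              can feed any back end through the shared IR. */
           for (int i = 0; i < 2; i++)
               for (int j = 0; j < 2; j++)
                   bes[j](fes[i]("SOURCE TEXT"));
           return 0;
       }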

Now we have GCC and LLVM, which are sort of UNCOL-ish, but mostly 
because the back ends are all so similar.  The instruction sets may be 
different, but the data formats are all the same.

R's,
John


