History of popularity of C

List overview All Threads
Download

newer

older

Re: [TUHS] History of popularity...

My BSDcan talk

Tyler Adams

21 May 2020 21 May '20

3:27 p.m.

Does anybody have any good resources on the history of the popularity of C? I'm looking for data to resolve a claim that C is so prolific and influential because it's so easy to write a C compiler. Tyler

Attachments:

attachment.html (text/html — 303 bytes)

Show replies by date

Toby Thain

21 May 21 May

4:10 p.m.

On 2020-05-21 11:27 AM, Tyler Adams wrote:

...

Based on recollections of C from mid-1980s until today, this claim doesn't make sense for several reasons. Sorry, this is all anecdata or recollection, not cited data: - inexpensive compiler availability was not very good until ~1990 or later, but C had been taking off like wildfire for 10 years before that - developing good compilers is certainly not "easy" - and there were a lot of mediocre vendor compilers despite (duplicated) investment - by the time gcc was mature (by some definition, but probably before 1990) - something that happened largely as a reaction to the vendor compiler situation - it was a large and complicated codebase even by standards of the time - hobby/novelty/small/educational compilers are a relatively new thing and arrived long after the C adoption curve was complete. The earliest well known example I can think of is lcc (1994) but most are much newer. ...and probably quite a few other points. --T

Larry McVoy

4:30 p.m.

On Thu, May 21, 2020 at 12:10:35PM -0400, Toby Thain wrote:

...

On 2020-05-21 11:27 AM, Tyler Adams wrote:

This matches my memory as well. I think I learned C in 1983 or 84, it just worked. To me it felt like it was PDP-11 assembler only nicer. The thing I liked about C is that you always felt like you were right on the metal, it didn't hide the fact that there was a computer under it. Very different feel from, say, Pascal. I think the fact that you could feel the machine under the language had a lot to do with it taking off. And what Toby said about compilers, oh, man, so true. Once you got out in the real world, gcc was buggy and slow, companies wanted to charge you at every step of the way for compilers that were marginally better than gcc at best. When gcc finally got good enough, I agree, around 1990 or so, it was a relief. You just used it and ignored the platform specific ones. G++ took a long time to be good enough.

John Foust

5:22 p.m.

At 11:30 AM 5/21/2020, Larry McVoy wrote:

...

This matches my memory as well. I think I learned C in 1983 or 84, it just worked. To me it felt like it was PDP-11 assembler only nicer.

One thing that stuck with me about our experience at UW-Madison at that time was that there wasn't a course that taught C yet some courses were taught in C. "Here's K&R, there's the Unix manuals, get to it."

...

When gcc finally got good enough, I agree, around 1990 or so, it was a relief. You just used it and ignored the platform specific ones. G++ took a long time to be good enough.

There's the broader history of the languages that were popular in the IBM PC market in the 80s and 90s, too. In that at least numerically larger market, there were times when C was not on top for many small-time developers. Let's not forget Turbo Pascal (shipped 1983 to 1995) and Turbo C and C++ (1987-1995). In 1986 or so on the PC, I was using the Gimpel C-terp interpreted C and their fine PC-lint to speed development (which Clem Cole has mentioned here before and which is still sold (!) ) in conjunction with shipping code under the Lattice and Microsoft C compilers of that time. In the mid- to late 80s, there's the rise of the flat address space 68000 machines like Amiga and Atari which could enjoy the cross-pollination of code ported from Unix C environments. On the Mac, Apple's MacApp environment was their Object Pascal and not C++ until 1991. Think C came out in 1986. In the late 1980s, 32-bit DOS extenders arose that let you write DOS programs in C that had true 32-bit pointers and didn't need to worry about 64K segments as much, followed by Microsoft's Win32s in late 1992 that allowed that freedom under Windows 3.1. - John

Toby Thain

8:17 p.m.

On 2020-05-21 1:22 PM, John Foust wrote:

...

... In the mid- to late 80s, there's the rise of the flat address space 68000 machines like Amiga and Atari which could enjoy the cross-pollination of code ported from Unix C environments. On the Mac, Apple's MacApp environment was their Object Pascal and not C++ until 1991. Think C came out in 1986.

Few developers used MacApp, afaicr. Most used plain Pascal - initially Lisa Pascal, though that was before my time on Mac - and those who didn't like Pascal could write C on Mac before THINK, using tools like Aztec C (or even Whitesmiths, one of the earliest). Even though MPW was an excellent industrial strength environment with good Pascal and C compilers, the big vendors like Adobe adopted LIGHTSPEED-then-THINK-then-Symantec C quickly and rewrote Pascal apps (like Photoshop) in C early. Then CodeWarrior came along and ate THINK's lunch. --Toby

...

... - John

Tony Finch

4:43 p.m.

Toby Thain <toby(a)telegraphics.com.au> wrote:

...

- inexpensive compiler availability was not very good until ~1990 or later, but C had been taking off like wildfire for 10 years before that

I get the impression that an important part of its popularity was how C (and C++) became the language of choice on the PC, and displaced Pascal in the process. Tony. -- f.anthony.n.finch <dot(a)dotat.at> http://dotat.at/ Trafalgar: Northerly or northwesterly, backing southwesterly in northwest, 3 to 5. Moderate, occasionally slight in southeast. Fair. Good.

arnold＠skeeve.com

5:35 p.m.

...

Toby Thain <toby(a)telegraphics.com.au> wrote: > > - inexpensive compiler availability was not very good until ~1990 or > later, but C had been taking off like wildfire for 10 years before that

PCC contributed to this. Everybody and their brother was porting Unix to their fancy new CPU architecture / hardware. All you had to do was bootstrap a cross-compiler version of PCC on a PDP-11 (or more likely Vax), then get Unix to boot and Voila. (I remember reading a paper about how Motorola did just that for the MC 680x0 family.) C and Unix were established in Academia and Industry well before 1990.

...

I get the impression that an important part of its popularity was how C (and C++) became the language of choice on the PC, and displaced Pascal in the process.

C++ became the language of choice on the PC when MSFT started pushing its compiler and Visual Studio IDE. At least, this is my two cents. Arnold

CHARLES KESTER

7:16 p.m.

...

On May 21, 2020 at 10:35 AM arnold(a)skeeve.com wrote: C++ became the language of choice on the PC when MSFT started pushing its compiler and Visual Studio IDE.

Microsoft C 7.0 already had a C++ compiler and an early version of MFC in 1992. But you're right: it was when Visual C++ 1.0 came out in 1993 that C++ became really popular among developers targeting Windows. VC1.0 introduced "wizards" for MFC that produced a skeleton application to which many people had to make only a few additions in order to come up with a shippable product. The market was soon flooded with apps that had what I called a "wizard smell". (The more charitable phrase was "look and feel".) Of course, as with all framework-based code, wizard-generated apps couldn't distinguish themselves in the market for very long and the bar was raised. But by then C++ was well-established as the language of choice. None of which has anything to do with Unix, I admit.

Thomas Paulsen

8:33 p.m.

...

msc was really good in those days. As a systems guy I used to study its generated assembly code which was extremely good. However today's gcc uses advanced instructions too, thus also very good, whereas all the unix cc's of the 90ths known to me were rather naive, simple lex&yacc derived. The "wizards" also were very good making gui programmig much easier.

Toby Thain

8:09 p.m.

On 2020-05-21 1:35 PM, arnold(a)skeeve.com wrote:

...

Toby Thain <toby(a)telegraphics.com.au> wrote: > > - inexpensive compiler availability was not very good until ~1990 or > later, but C had been taking off like wildfire for 10 years before that

...

(I remember reading a paper about how Motorola did just that for the MC 680x0 family.)

Yes, but Johnson had already done the work. Imho compilers were still considered pretty complex magic and you wouldn't lightly write one from scratch. And yeah all the vendors wanted to get a compiler out with minimal effort, which is why they often weren't very good.

...

C and Unix were established in Academia and Industry well before 1990.

I get the impression that an important part of its popularity was how C (and C++) became the language of choice on the PC, and displaced Pascal in the process.

C++ became the language of choice on the PC when MSFT started pushing its compiler and Visual Studio IDE.

That was much later. --Toby

...

At least, this is my two cents. Arnold

Tony Finch

8:12 p.m.

arnold(a)skeeve.com <arnold(a)skeeve.com> wrote:

...

I get the impression that an important part of its popularity was how C (and C++) became the language of choice on the PC, and displaced Pascal in the process.

C++ became the language of choice on the PC when MSFT started pushing its compiler and Visual Studio IDE.

C was winning years before that. I saw a comment on a certain orange website that referred to Dr Dobbs Journal, August 1986, which I found online at https://archive.org/details/dr_dobbs_journal_vol_11/page/n541/mode/1up On that page there are a few choice quotes from the archives (1983) about C from a PC perspective. The letters pages are 1/3 C. There are 8/10 pages of articles about C. Then there is a 23 page comparative review of 17 C compilers. It's remarkable :-) Tony. -- f.anthony.n.finch <dot(a)dotat.at> http://dotat.at/ Irish Sea: Southeast 3 or 4, increasing 5 to 7, veering southwest 6 to gale 8 later. Smooth or slight, becoming moderate or rough, occasionally very rough later in south. Fair then rain or squally showers. Good occasionally poor.

David Arnold

22 May 22 May

8:28 a.m.

On 22 May 2020, at 03:37, arnold(a)skeeve.com wrote: <...>

...

C++ became the language of choice on the PC when MSFT started pushing its compiler and Visual Studio IDE.

On the PC side, TurboPascal started to get displaced by Borland C++ I think in the early 90’s. I don’t have a good feeling why, but perhaps it was the parallel evolution of Microsoft’s C & C++, which were doing pretty well even before 1997 when Visual Studio began its rise. Watcom C++ was also around, iirc it was available for OS/2 as well? On the Unix side, the egcs fork of gcc pushed it forward a lot and the subsequent reverse takeover of gcc saved it from needing replacement far earlier. Of course the commercial Unix vendors charging for their compilers helped gcc too, and by then Pascal, Modula/2/3, Ada ... everything else had become a niche market. I don’t recall any hard data from back then though, sorry ... d

Toby Thain

21 May 21 May

8:07 p.m.

On 2020-05-21 12:43 PM, Tony Finch wrote:

...

Toby Thain <toby(a)telegraphics.com.au> wrote:

- inexpensive compiler availability was not very good until ~1990 or later, but C had been taking off like wildfire for 10 years before that

I get the impression that an important part of its popularity was how C (and C++) became the language of choice on the PC, and displaced Pascal in the process.

Yes, that's basically true, but I didn't try to cover the contemporary "appeal" of C, stylistic or otherwise - but only the compiler point (which I think is mostly false). --Toby

...

Tony.

Clem Cole

8:56 p.m.

On Thu, May 21, 2020 at 12:17 PM Toby Thain <toby(a)telegraphics.com.au> wrote:

...

- inexpensive compiler availability was not very good until ~1990 or later,

Hrrumpt The Gnu C compiler was starting to be available by the mid-1980s in alpha/beta form. rms was looking for places to start. He approached a number of folks, from Tanenbaum to some of the vendors (he knew Masscomp had written a compiler from scratch which we away the binaries gave to our customers and he called me asking if we would donate it. We had donated development hardware and I was still his contact to the Gnu project at that point). As far as I know, he ended up writing his own because he could not find one to start with. The big kickstart for rms, was that Sun hard just started to charge for its compilers, and so a lot of people were looking for a free alternative (and frankly in those days the Sun compiler was still a bit of a toy -- 20% we got over them at Masscomp was because we had a number of the folks from the DEC compiler team). It is true that the targets and the original systems it ran were more limited. The 1.0 release was before the summer of '87 (in May maybe???). The biggest issue is that it did not run on DOS until the 386 and the DOS-extenders show up. But it covered the many 68000 workstations and was often as good or better than the supplied one [which were mostly based/derived from the MIT Jack Test port of the Johnson compiler for the NU system].

...

but C had been taking off like wildfire for 10 years before that

At least 15 years before. By 1975, it was a solid fixture at most Universities.

...

- by the time gcc was mature (by some definition, but probably before 1990)

Mature is the key word here. gcc does not really start to mature until Cygnus takes it over. But it was quite usable for the systems that targetted it.

Toby Thain

11:45 p.m.

On 2020-05-21 4:56 PM, Clem Cole wrote:

...

On Thu, May 21, 2020 at 12:17 PM Toby Thain <toby(a)telegraphics.com.au <mailto:toby@telegraphics.com.au>> wrote: - inexpensive compiler availability was not very good until ~1990 orlater, Hrrumpt The Gnu C compiler was starting to be available by the mid-1980s in alpha/beta form. rms was looking for places to start. He

Right, things were changing, but costly C compilers were a reality well into the 90s, unless your use case happened to coincide with a gcc port. And the reason this matters is that it contradicts the "C is popular because compilers were easy" assertion. Not "easy", and not necessarily cheap or free either.

...

approached a number of folks, from Tanenbaum to some of the vendors (he knew Masscomp had written a compiler from scratch which we away the binaries gave to our customers and he called me asking if we would donate it. We had donated development hardware and I was still his contact to the Gnu project at that point). As far as I know, he ended up writing his own because he could not find one to start with. ... but C had been taking off like wildfire for 10 years before that At least 15 years before. By 1975, it was a solid fixture at most Universities.

Yes. I should have said "more than 10" :-) --Toby

...

- by the time gcc was mature (by some definition, but probably before1990) Mature is the key word here. gcc does not really start to mature until Cygnus takes it over. But it was quite usable for the systems that targetted it.

Richard Salz

11:57 p.m.

Was the fact that gcc had the "portable" RTL as an intermediate representation important? That it was designed to be ported. And what about John Gilmore making all bsd user it? And the multiple usenix tutorials?

Toby Thain

22 May 22 May

12:17 a.m.

On 2020-05-21 7:57 PM, Richard Salz wrote:

...

Regardless of one's opinions on the ubiquity of gcc it wasn't mature and accessible until at least 10-15 years after C was already popular (depending how you count). And gcc is hardly an "easy" compiler project ... to the OP's question. --Toby

John Gilmore

4:10 a.m.

Richard Salz <rich.salz(a)gmail.com> wrote:

...

And what about John Gilmore making all bsd user it? And the multiple usenix tutorials?

I think Rich is referring to the time in 1987-8 when I spent some time compiling the entire BSD distribution sources with the Vax version of gcc. This was a volunteer effort on my part so that Berkeley could adopt GCC to replace PCC. They got an ANSI C compiler, and avoided AT&T copyright restrictions on Yet Another critical piece of Berkeley Unix. GNU got an extensive test of GCC which moved it out of "beta" status. I ended up taking extensive notes, and wrote a 1988 paper about the experience, which I submitted to USENIX. But it was rejected, on the theory that porting code (even ancient crufty Unix code) through new compilers wasn't research. Indeed, I recall Kirk McKusick remarking to me around that time that even Unix kernel ports to new architectures were so routine as to not be research in his opinion. Oddly, I was easily able to find that paper (thanks to Kryder's Law), so I have appended it verbatim below (in troff with -ms macros). In short, I found about a dozen bugs in GCC, which RMS fixed; and many hundreds of bugs in the 4.3BSD Unix sources, which I fixed and Keith merged upstream. Note the quaint footnoted homage to distributed collaboration, which was still novel back then in the pre-Covid, pre-public-Internet, 2400 baud modem era. John .TL Porting Berkeley .UX through the GNU C Compiler .AU John Gilmore .AI Grasshopper Group San Francisco, CA, USA 94117 gnu(a)toad.com .AB We have ported UC Berkeley's latest .UX sources through the GNU C Compiler, a free draft-ANSI compatible compiler written by Richard Stallman and available from the Free Software Foundation. In the process, we made Berkeley .UX more compatible with the draft ANSI C standard, and tested the GNU C Compiler for its full production release. We describe the impact of various ANSI C changes on the Berkeley .UX sources, the kinds of non-portable code that the conversion uncovered, and how we fixed them. We also briefly explore some limitations in the tools used to build a .UX system. .AE .SH Introduction .PP The GNU C Compiler (GCC) is a complete C compiler, compatible with the draft ANSI standard, and available in source from the Free Software Foundation (FSF). It was written by Richard Stallman in 1986 and 1987, and is (at this writing) in its 18th release. It is a major component of the GNU (``GNU's Not .UX '') project, whose aim is to build a complete .UX -like software system, available in source to anyone who wants it. The compiler produces good code \(em better than most commercial compilers \(em and has been ported to the Vax, MC680X0, and NS32XXX. .PP Berkeley .UX , from the Computer Systems Research Group (CSRG) at the University of California at Berkeley, had its start in the 1970's with a prerelease .UX Version 7, and has been improving ever since. The current sources derive from the 1978 AT&T ``32V'' release, a V7 variant for the Vax. CSRG has produced four major releases for the Vax \(em 3, 4.1, 4.2, and 4.3BSD. These releases have set the standard for high powered .UX systems for many years, and continue to offer an improved alternative to the flat-tasting AT&T .UX releases. .PP However, Berkeley's C compiler is based on an old version of PCC, the Portable C Compiler from AT&T. There was little chance that anyone would provide ANSI C language extensions in this compiler, or do significant work on optimizing the generated code. By merging the GNU C compiler into the Berkeley release, we provided these new features to Berkeley Unix users at a low cost, while offering the GNU project an important test case for GNU C. .SH Goals .PP The major goal for the project is to move GCC out of ``beta test'' and into ``production'' status, by demonstrating that a successful .UX port can be based on it. .PP We are also providing a better maintained compiler for Berkeley .UX . GCC already produces better object code then the previous compiler, has a more modern internal structure, and supports useful features such as function prototype declarations. It is also maintained by a large collection of people around the world, who contribute their fixes and enhancements to the master sources. Regular releases by the Free Software Foundation encourage distribution of the improvements. In contrast, PCC is proprietary to AT&T, and few fixes are widely distributed, except as part of infrequent and expensive AT&T releases. .PP We are producing a .UX source tree which can be compiled by .I both the old and the new compilers. This is partly for convenience during the port, partly in case the project suffers long delays, and partly because Berkeley .UX also runs on the Tahoe, a fast Vax-like machine built by Computer Consoles, which GCC does not yet support. We are avoiding the introduction of new .B #ifdef 's, instead rewriting the code so that it does not depend on the features of either compiler. .PP We have to constantly remind ourselves to minimize the changes required. It's too easy to get lost in a maze of twisty .UX code, all desperately needing improvement. .PP Whenever we have to make a change, we have moved in the direction of ANSI C and POSIX compatability. .SH People .PP The project was conceived by John Gilmore, and endorsed by Keith Bostic and Mike Karels of CSRG, and Richard Stallman of FSF. John did the major grunt work and provided fixes to the .UX code. Keith and Mike provided machine resources, collaborated on major decisions, and arbitrated the style and content of the changes to .UX . Richard provided quick turnaround on compiler bug fixes and problem solving. This setup worked extremely well. .PP We started work on 17 December 1987, and are not yet done at the time of writing (19 February 1988). About 9 days of my time, 2 of Keith's, half a day of Mike's, and XXX days of Richard's have gone into the project so far. .SH Working Style .PP Most of the work was done over networks, in a loosely coordinated style which was hard to concieve of only a few years ago.\(dg .FS \(dg Much of the free software work that is happening these days occurs in this manner, and I would like to publicly thank the original DARPA pioneers who gave birth to this vision of wide area, computer mediated collaborative work. .FE John worked in San Francisco, Keith in Berkeley, and Richard in Cambridge. Keith set up an account and a copy of the source tree on .I vangogh , a Vax 8600 at Berkeley. John spent a few days in front of a Sun at Berkeley getting things straight, but did most of the work by dialing in at 2400 baud from his office in San Francisco. When we modified .UX source files, Keith checked the changes and merged them back into the master .UX sources on another machine at Berkeley. When we found an apparent bug in GCC, we isolated a small excerpt or test program to demonstrate the bug, and forwarded it to Richard by Internet electronic mail. Bug fixes came back as new GCC releases, which were FTP'd over the Internet from MIT. Ongoing status reports, discussions, and scheduling were done by \fIuucp\fP and Internet electronic mail. .PP At this writing, we have used four GCC releases (1.15 through 1.18). For each GCC release, we did a ``pass'' over the .UX source tree; one such pass included an updated source tree as well. Each GCC release was built, tested, and installed on .I vangogh without trouble. Then we ran .I "make clean; make" on the source tree, and examined 500K to 800K of resulting output. Keith Bostic's Makefiles did an excellent job of automating this process, though we ran into some problems with the .UX compilation model in general, and limitations in .I make in particular. .SH ANSI Language Changes .PP The problems encountered during the port fell into two general categories. Some of the code was not written portably and failed in the new environment. Other code was written portably for its time, but failed because ANSI C has redefined parts of the language. In some cases it was hard to tell the difference; the consensus on what is ``portable code'' changes over time, and on some points there is no agreement. .PP The major ANSI C problem was the generation of .B "character constants in cpp" . The traditional .UX C preprocessor (\fIcpp\fP), written by John F. Reiser, would substitute a macro's parameters into like-named substrings even inside single or double quotes in the macro definition. For example: .DS #define CTRL(c) ('c'&037) #define CEOF CTRL(d) .DE In an attempt to make things easier for tokenizing preprocessors, ANSI C has changed the rules here, and there is in fact .I no way to generate a character constant containing a macro argument. (There is a way to generate a character .I string , e.g. double-quoted string, but not a single-quoted character. We consider this a bug in ANSI C.) Fixing this required altering both the macro definition and each reference to the macro: .DS #define CTRL(c) (c&037) #define CEOF CTRL('d') .DE This required changes in about 10 system include files and in about 45 source modules. Many user programs turned out to depend on the undocumented .B CTRL macro, defined in .B <sys/ttychars.h> , and since all its callers had to change, all those programs did too. .PP Another \fIcpp\fP problem involved .B "token concatenation" . No formal facilities were provided for this in the old \fIcpp\fP, but many users discovered that with code like this, from the /etc/passwd scanning code: .DS #define EXPAND(e) passwd.pw_/**/e = tp; while (*tp++ = *cp++); EXPAND(name); EXPAND(passwd); .DE they could cause a macro argument to be concatenated with another argument, or with preexisting text, to make a single name. In one case (\fIphantasia\fP), the Makefile provided half of a quoted string as a command line .B #define , and the source text provided the other half! ANSI C does not allow a preprocessor to concatenate tokens in these ways, instead providing a newly invented .B ## operator, and new rules requiring the compiler to concatenate adjacent character strings. Again, it was impossible to write a macro that works with both old and new compilers, and we didn't want to uglify our code with .B "#ifdef __STDC__" ; our solution was to rewrite both the macros and all their callers, to avoid ever having to concatenate tokens: .DS #define EXPAND(e) passwd.e = tp; while (*tp++ = *cp++); EXPAND(pw_name); EXPAND(pw_passwd); .DE Mostly the token concatenation was used as a typing convenience, so this was not a problem. It involved changes to five modules. We found no clean solution for .I phantasia ; a fix will probably involve rewriting it to do explicit string concatenations at runtime. .PP Changes to the .B "scope of externals" provided another set of widely scattered changes. If an external identifier is declared from inside a function, PCC causes that declaration to be visible to the entire remaining text of the source file. This also applies to functions which are implicitly declared when they first appear in an expression. This behaviour was not explicitly sanctioned by K&R, but it was condoned (pg. 206, 2nd paragraph), and many programs depended on it. ANSI C changed the scope rules to be more consistent; if you declare an external identifier in a local block, the declaration has no effect outside the block. We moved extern declarations to global scope, or added global function declarations, in 38 files to handle this. .PP A number of programs used .B "new keywords" such as \fIsigned\fP or \fIconst\fP as identifiers. We renamed the identifiers in 9 modules. .PP The Fortran libraries used a \fBtypedef name as a formal parameter\fP to a set of functions. ANSI C has disallowed this, since it complicates the parsing of the new prototype-style function declarations. We renamed the parameter in 8 modules. .PP Three modules used a \fBtypedef with modifiers\fP, e.g.: .DS typedef int CONSZ; x = (unsigned CONSZ) y; .DE This has been repudiated by ANSI C. We fixed it by making the original typedef \fBunsigned\fP where possible, or by creating a second typedef for ``U_CONSZ''. .SH Non-Portable Constructs .PP The worst non-portable construct we found in the .UX sources was the use of .B "pointers to non-members" . There was plenty of code as bad as: .DS int *foo; foo->memb = 5 if (foo->humbug >= -1) bah(); .DE and, in many cases, \fImemb\fP and \fIhumbug\fP are not even members of the same struct! Such code seems to have been written with a ``BCPL'' mentality, assuming that all pointers are really the same thing and it doesn't matter what their type is. Early C implementations lacked the .B union declarator, and did not distinguish between the members of different structures. Exploiting this has been considered bad practice for years, and lint checks for it, though many .UX compilers do not. We found a lot of it in old code, though newer code did not lack for examples either. Fixing this problem caused the most work, because we had to figure out what each untyped or mistyped pointer was .I really being used for, then fix its type, and whatever references to it were inconsistent with that type. We changed 5 modules due to this. One program, \fIefl\fP, would have required so much work that we abandoned it, since we could not find anyone using it. .PP Another problem was caused by existing uses of .B "cpp on non-C sources" . Various assembler language modules were being preprocessed by \fIcpp\fP, probably because there is no standard macro assembler for .UX . These modules are carefully arranged to avoid confusing the old \fIcpp\fP; for example, assembler language comments are introduced by .B # , but indented so that \fIcpp\fP will not treat them as control lines. ANSI \fIcpp\fP's handle white space on both sides of the ``#'', so indentation no longer hides these comments. Also, the ANSI rules to require the preprocessor to keep track of which material is inside single and double quotes and which is outside; the old \fIcpp\fP terminated a character string or constant at the next unescaped newline. Vax assembler language uses unmatched quotes when specifying single ASCII characters, such as in immediate operands. This causes an ANSI \fIcpp\fP to stop processing # directives at that point, until it finds another unmatched quote. We chose to alter the assembler modules to avoid stumbling over these features in ANSI C preprocessors, without fixing the larger problem of using a C-specific preprocessor on non-C text. .PP In addition to embedded C preprocessor statements in assembler sources, we had to deal with .B "asm() constructs" in C source. Some system-dependent routines were written in C with intermixed assembler code, producing a mess when compiled with anything but the original compiler. Other routines, such as .I compress , drop in an .B asm() here or there as an optimization. Still more modules, including the kernel, run a .I sed script over the assembler code generated by the C compiler, before assembling and linking it. There is no general solution to these problems. GCC has added an asm() facility that is independent of the compiler's register allocation strategy, but programs using this are incompatible with the old C compiler. We are investigating a possible fix involving changing all these places to use e.g. .B "#include <machine/inline.h>" which, in GCC, would define inline code containing asm()s, while in PCC, declarations of (slower) external functions would be generated. .PP .I Troff used .B "multi-character constants" in its font tables; we fixed it with a macro for building an int out of two characters. A Fortran library module used the character constant .B 'EOF' , presumably a typo for .B EOF ; and \fIrogue\fP defined the character '\300' as a possible command letter. While ANSI C permits multiple character constants, they are implementation defined, and GCC wisely defines them to be invalid (as the standard should have done). .PP Some programs tried to declare functions or variables, .B "omitting both type and storage class" . This usage is not even valid in K&R, though PCC accepts it. We fixed this in about 15 modules, by adding ``int'' to the declarations. There were two other modules where this check uncovered inadvertent use of ``;'' in a declaration list where ``,'' was intended. .PP GCC provides better error checking in a few ways, and caught a number of bugs caused by misunderstood .B "sign extension" . It warns ``comparison is always 0 due to limited range of data type'' for constructs like: .DS char c; if (c == 0x80) foo(); .DE If a signed character contains the bit pattern 0x80, using it in an expression causes it to be sign-extended to 0xFFFFFF80, which does not equal 0x00000080. Bugs of this sort were fixed, typically by casting the 0x80 to (char), in 5 modules. .PP Changes to the rules for \fBparsing declarations\fP made us fix two modules where the last declaration in a struct was immediately followed by a closing brace, without a semicolon. Three more modules needed changes because the rules for where braces are required in struct or array initializers have changed. Four programs defined a \fBstruct foo\fP and then referenced it as a \fBunion foo\fP, or vice verse. Two programs declared \fBregister struct foo bar;\fP and then took bar's address, which is not allowed for register variables! .PP Thirteen programs had miscellaneous \fBpointer usage bugs\fP fixed. Two more were comparing pointers to \fB-1\fP; these were changed to use zero as a flag value instead. .PP In ANSI C, local variables in use at a .B setjmp() are no longer guaranteed to be preserved when a .B longjmp() occurs, unless they are declared \fBvolatile\fP. This is not a problem for the Vax port, since the Vax longjmp() will continue to restore the registers, but gcc warns about this situation, since code that assumes restoration is not portable. We have not yet worked on fixes for this. .PP Five or ten other miscellaneous bugs were caught and fixed. .SH Least portable .UX code .PP The process of porting software inevitably uncovers a few files that cause a disproportionate share of problems. For our port, the clear winner is .I efl , the Extended Fortran Language, by Stu Feldman. It defines ``\fBtypedef int * ptr;\fP'' in a header file, and then uses a ``ptr'' to point to anything. GCC produced 1600 lines of errors messages on this program alone, and three modules of it caused compiler core dumps. We ended up deciding to abandon support for it rather than attempt to clean it up. .PP A runner-up is .I pcc , the Portable C Compiler itself, by Steven C. Johnson. It caused GCC to coredump twice, tickled another GCC parsing bug, and contained the modified typedef and sign extension problems mentioned above. .PP Third place goes to .I monop , the Monopoly\(dg .FS \(dg Trademark of Parker Brothers .FE game, by Ken Arnold. This program used a variety of typed pointers, but the main pointer to a set of structs was declared as a \fBchar *\fP. Another part of the code initialized an array of struct pointers with integer values, then a small loop at the beginning of the game would read out these integers and replace them with corresponding ``real'' struct pointers. It took about two days to face up to the job and about a day to clean it up. .PP Honorable mention for silly mistakes goes to the .I indent program, by someone at the University of Illinois. It contain the only instance of .B "a + = b" (with a space between + and =), and was the only module to terminate its .B #include directives with a semicolon. It also contained a comparison between a character and the value 0200, a value that a signed 8-bit char can never hold. .SH Results .PP We are pleased with the results so far. Most of the .UX code compiled without problems, and the parts which we have executed are free from code generation bugs. The worst of the ANSI C changes only required roughly fifty modules to be changed, and there were only two problems of this magnitude. A total of twenty bugs in gcc were located so far, and most of them are now fixed. We expected several times this many bugs; the compiler is in better shape than any of us expected. .PP Many minor type problems and ``nit'' incompatabilities with ANSI C have been removed from the .UX sources. .SH Future Results .PP \fI(This section will move to \fBResults\fP for the final paper.)\fP .PP We expect that the size of the .UX binaries will be significantly less than with the previous compiler, but at the current stage of the project we can't easily confirm the expectation. .PP When the system compiled with GCC is in everyday use at Berkeley, GCC will be relabeled as a full production-quality compiler, which will encourage its wider use. .SH Non-Results .PP We have not attempted to make Berkeley .UX fully ANSI C compliant. In particular, we have retained preprocessor comments (#endif FOO) as well as machine-specific \fB#define\fP's (#ifdef vax). GCC supports these features without trouble, even though ANSI C does not. .PP The .UX kernel has not yet been ported to gcc. Other people are working on this, compiling one module at a time and running it for a while before moving on to the next. We will merge their work with ours once we have the rest of the system in a stable state. .PP Pieces of the Portable C Compiler are still being used inside .I "lint, f77" , and .I pc . Eventually someone will write Fortran and Pascal front-ends for gcc; this has already been done for C++. So far nobody has created a GNU \fIlint\fP, but it is an obvious project. .PP CSRG has ported Berkeley .UX to the Tahoe, a fast Vax-like machine built by Computer Consoles and resold by Harris and others. We are looking for someone to do a Tahoe port of gcc, to replace the PCC supplied by CCI. .SH Problems in Building .UX .PP .UX compilers traditionally look in certain global places in the file system for their libraries, include files, etc. This is a problem when cross-compiling, or when building a new .UX release (which almost amounts to the same thing). While it is possible to provide a new default directory for .B #include files, if a source program .B #include s a file that is not in the cross-compilation include files, the C compiler will erroneously use the one from /usr/include. There should be a switch that turns off \fIall\fP the built-in include file and library pathnames, and only uses those specified on the compiler's command line. .PP However, there is still the problem of getting those switches to the compiler's command line. .I Make is a great tool for dealing with one directory's worth of files, but as .UX has evolved, \fImake\fP has not kept up. Indeed, it has fallen behind; Makefiles that worked perfectly well five years ago will no longer work because each manufacturer (AT&T especially) has hacked up their .I make to include harmful, gratuitous, and mutually incompatible changes. The result is that a Makefile that works on your system is unlikely to work on your neighbor's system, unless they are from the same manufacturer, and you happen to use the same login shell. .PP .I Make works poorly on nested directory structures, too. As an example, we could find no way to change ``cc'' to ``gcc'' in all the Makefiles used to build Berkeley .UX (short of text-editing them all). In a single directory, you can say .I "make CC=gcc" , but this change is not propagated to subdirectories. You can manually propagate that change one level by saying .I "make CC=gcc MFLAGS='CC=gcc'" but that only goes one level (at least in Berkeley's version of .I make ). We ended up putting a copy of gcc in a private .I bin directory, named .I cc , and putting that directory on the front of the search path. (When we later wanted to override CFLAGS as well, \fI~/bin/cc\fP became a shell script that invokes .I "gcc -W" ). .PP Another problem with .I make is that even if it was instructed to ignore errors (with -i or -k), it exits if it can't locate a file that something else depends upon. This has the effect of ``pruning'' a potentially large section of the source hierarchy, and the only warning is an unobtrusive message buried among 500K of other output. .PP Of course, if someone was to fix these bugs in \fImake\fP, they would be creating yet another incompatible version. I have been watching the papers on the ``new makes'' and so far there doesn't seem to be one that handles deeply nested source trees in a clean and consistent fashion, or is otherwise so much better than \fImake\fP that it's worth the effort to switch. I think it is time to look for a completely new paradigm for software compilation control. I don't have any major insights on where to go from here, but it is clear to me that .I make and its derivatives have reached their useful limits. .SH Availability .PP These changes will be available to recipients of Berkeley's next software distribution, whenever that is. We will also make diffs available to others involved in porting .UX to ANSI C. We suspect that most of the problems we solved have already been handled in one or another .UX port, but the work had to be duplicated because either it was not sent back to Berkeley or AT&T, or the changes were not accepted. (AT&T has a history of pretending that .UX bugs do not exist, and Berkeley has limited manpower). .SH Future Work .PP Future projects include building a complete set of ANSI C and POSIX compatible include files and libraries (including function prototypes), and converting the existing sources to use them. An eventual goal is to produce a fully standard-conforming .UX system \(em not only in the interface provided to users, but with sources which will compile and run on any standard-conforming compiler and libraries. .PP The success of this collaboration between GNU and CSRG has encouraged further cooperation. Both parties feel that AT&T licensing is a problem; most recipients of CSRG releases have old .UX licenses, and are unwilling to upgrade to more expensive and more onerous AT&T licenses. However, new AT&T releases include some features which would be useful in Berkeley .UX . The GNU project is working to provide early reimplementations of these features, such as improved shells and ``make'' commands. In return, CSRG is working to release software to the public which has previously been held to be `` .UX licensed'' even though it was not derived from AT&T code, such as the implementation of TCP/IP, and many of the Berkeley utility programs. .SH References .LP \fIDraft Proposed American National Standard \(em Programming Language C\fP, ANSI X3.J11, draft of October 1, 1986 (update for new draft when out). CBEMA, 311 First Street NW #1500, Washington DC 20001. .LP \fI4.3BSD Manual Set\fP, Computer Systems Research Group, University of California at Berkeley. .LP Fowler, Glenn S., ``The Fourth Generation Make'', Usenix conference proceedings, Summer 1985, page 159. (More references on ``make'' are provided in this paper.) .LP Hume, Andrew, ``Mk: a successor to make'', Usenix conference proceedings, Summer 1987, page 445. .LP Kernighan, Brian W. and Ritchie, Dennis M., ``\fIThe C Programming Language\fP'', Prentice-Hall, 1978.

Larry McVoy

2:11 p.m.

On Thu, May 21, 2020 at 09:10:06PM -0700, John Gilmore wrote:

...

Note the quaint footnoted homage to distributed collaboration, which was still novel back then in the pre-Covid, pre-public-Internet, 2400 baud modem era. John

http://mcvoy.com/lm/papers/porting-berkeley.pdf for those who don't want to run it through groff. As an aside, this didn't work (firefox couldn't display it): groff -ms -Tpdf porting-berkeley.ms > porting-berkeley.pdf but this did: groff -ms porting-berkeley.ms > PS ps2pdf PS porting-berkeley.pdf I'll ask the groff people if they know what is up.

Richard Salz

2:34 p.m.

Great to hear from you John. I remember you handing out flyers during various Usenix meetings about this. :) One of my favorite parts of your paper: "the flat-tasting AT&T releases" !

Larry McVoy

2:17 p.m.

Clem, you should read that paper, link again: http://mcvoy.com/lm/papers/porting-berkeley.pdf because it validates a lot of what I have said about not having access to the AT&T code. The BSD code was slightly easier to get but even that, around 1985 at UW-Madison, was locked up on an 11/750 named slovax. I had to beg and beg to get a login on that machine. You had to be somebody to get access to the source and I was still nobody. I did get a login eventually, I think I had to sign some papers, don't remember. I went on to spend so many happy hours reading the sources that my primary machine, be it 68k, SPARC, MIPS, x86, whatever, has been called slovax ever since. On Thu, May 21, 2020 at 09:10:06PM -0700, John Gilmore wrote:

...

Richard Salz <rich.salz(a)gmail.com> wrote:

And what about John Gilmore making all bsd user it? And the multiple usenix tutorials?

-- --- Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm

arnold＠skeeve.com

7:42 a.m.

Richard Salz <rich.salz(a)gmail.com> wrote:

...

Was the fact that gcc had the "portable" RTL as an intermediate representation important? That it was designed to be ported.

I think it was. GCC had *two* intermediate forms, one representing the source program (trees), and the other representing instructions (RTL). It was really designed to make it easy to write both new front ends and new back ends. In that it seems to have succeeded fairly well, too. :-) Arnold

Greg A. Woods

11:50 p.m.

I always assumed C became popular because there was a very large cohort of programmers who started with it as their first language, usually on early Unix, at university, in the late very 1970s and early 1980s. After all if I was exposed to it a small Canadian university in the early 1980s, then surely it was almost everywhere! At least that's how it happened for me. I was already fluent in BASIC and reasonably good at Pascal before I went to university, and though we had a very wide variety of languages to work with since we had accounts on both Unix and Multics systems right from the start of first year, C was the strong favourite amongst both juniors and phds, i.e. all but the most die-hard Multics lovers (who of course used and loved PL/1, though by 1985 there was even talk of C on Multics). Some of this popularity of C was no doubt due to the fact that those a year or two ahead of me had started with FORTRAN on an IBM 370 and had absolutely hated it and were very vocal to those of us coming up behind that we were very lucky to jump right onto the Unix (and Multics) machines right from the start. My first job programming in 1983/84 was back to BASIC and assembler, but a year later and I was writing C again (though sadly mostly on MS-DOS, briefly on Xenix, then back to very early MS-Windows until about 1988 -- not long in hindsight, but it was painful). At Thu, 21 May 2020 12:10:35 -0400, Toby Thain <toby(a)telegraphics.com.au> wrote: Subject: Re: [TUHS] History of popularity of C

...

- inexpensive compiler availability was not very good until ~1990 or later, but C had been taking off like wildfire for 10 years before that

Well, there were a plethora of both full C and "tiny"/"small" C compilers widely available in the very early 1980s. Indeed I would say inexpensive C compilers were widely available and very popular well before 1985, and a few "toy/tiny" compilers were freely available by then too. By 1985 I was doing C development, primarily on MS-DOS systems, using commercial compilers, for a wide variety of projects, mostly in big national companies (in Canada, such as CP Rail). I would say C was the first commercially successful systems-level language available across many platforms, and that this was evidently so by 1985. Early Atari (6502) computers were partly programmed with a cross- compiler, though I've no idea what it was (possibly a re-targeted PCC). I think VisiCalc had similar origins. The most ground-breaking C compiler might arguably have been P.J.Plauger's Whitesmiths C compiler, around about 1978. I don't think it was what you'd call "inexpensive" necessarily, but it was popular. The BD Software company's C compiler for CP/M (8080/z80) was released in 1979. The first version of Mark Williams C came out very early, possibly before 1980. I owned a copy for MS-DOS 386 by 1985/86. This was the most Unix-like compiler and library, by far, and quite inexpensive (else I wouldn't have been able to afford my own personal copy). Small-C appeared in Dr.Dobb's in May 1980 (and it spawned a plethora of derivatives of its own). C was everywhere in personal computing literature by 1980. I believe Aztec C was first released in 1980. Two books about C were published by McGraw-Hill in 1982: "The C Primer", Les Hancock and Morris Krieger; and "The C Puzzle Book", Alan R. Feuer. There were likely more. Then there was Lattice C, out and about by 1982 and VERY popular and widely used by 1984. (I was using the second version in 1985/1986 on PCs. It's probably the buggiest compiler I've ever used for real work projects.) "Learning to Program in C" by Thomas Plum was published 1983. And of course there was Tanenbaum and Jacobs' ACK, with a C parser front-end in the early 1980s (even by 1980?). Brad Templeton wrote a C (or maybe Tiny-C) compiler for C64/6502 around about 1984 (though he only commercialized the "PAL" assembler I think). In my estimation GCC really only served to cement C's early success and popularity. It gave people certainty that a good C compiler would be available for most any platform no matter what happened. I would also argue that non-Unix C compilers actually drove the adoption curve of C. Pascal tried to play catch-up, but just as with what happened to me in university where it was one of the teaching languages, C was just far more popular and though Pascal had a tiny head-start (in terms of first-published books/manuals), C overtook it and had far more staying power too (though indeed in the late 1980s there was a fair battle going on in the pc/mac/amiga/etc world for Pascal). -- Greg A. Woods <gwoods(a)acm.org> Kelowna, BC +1 250 762-7675 RoboHack <woods(a)robohack.ca> Planix, Inc. <woods(a)planix.com> Avoncote Farms <woods(a)avoncote.ca>

Andy Kosela

23 May 23 May

7:28 a.m.

On 5/23/20, Greg A. Woods <gwoods(a)acm.org> wrote:

...

I would also argue that non-Unix C compilers actually drove the adoption curve of C. Pascal tried to play catch-up, but just as with what happened to me in university where it was one of the teaching languages, C was just far more popular and though Pascal had a tiny head-start (in terms of first-published books/manuals), C overtook it and had far more staying power too (though indeed in the late 1980s there was a fair battle going on in the pc/mac/amiga/etc world for Pascal).

This is my recollection as well. In the late 80s with the introduction of really nice compilers for MS-DOS like Turbo C from Borland (1987), Watcom C 6.0 (1988) and mature versions of Microsoft C (which originally was based on Lattice C), the C future was solidified. The documentation coming with those compilers were also excellent. I still have tons of reference books from that period. It was a time when almost everybody was using pure C. I think C++ needed another 5-7 years to displace C in the application market. --A

Clem Cole

5:08 p.m.

On Fri, May 22, 2020 at 7:51 PM Greg A. Woods <woods(a)robohack.ca> wrote:

...

Exactly - my giving away UNIX, it cemented the language and the technology into a group of young engineers (like me) who then 'spread the gospel' when we went to real jobs.

...

Well, there were a plethora of both full C and "tiny"/"small" C compilers widely available in the very early 1980s.

Yep -- I listed a little of the pre-history.

...

Indeed I would say inexpensive C compilers were widely available and very popular well before 1985, and a few "toy/tiny" compilers were freely available by then too.

Yup, although until the 386 and the DOS extenders, it could be tough to use with the Gordon's awful 'far pointer' infection.

...

Early Atari (6502) computers were partly programmed with a cross- compiler, though I've no idea what it was (possibly a re-targeted PCC).

Most 6502 shops were assembler, although you are correct cc65 shows up reasonably early. It was not PCC based.

...

I think VisiCalc had similar origins.

Dan Bricklin wrote it assembler. He had access to the same Harvard PDP-10 that Gates and Allen had used to write MITS Basic a few years earlier. I should ask him to be sure, but I was under the impression he used the SAIL based 6502 assembler I mentioned previously.[1]

...

The most ground-breaking C compiler might arguably have been P.J.Plauger's Whitesmiths C compiler, around about 1978. I don't think it was what you'd call "inexpensive" necessarily, but it was popular.

Other than his wretched 'anat' - a natural assembler, which was far from natural. But you are correct, particularly for non-UNIX boxes, he had the first 'widely used' compiler.

...

In my estimation GCC really only served to cement C's early success and popularity. It gave people certainty that a good C compiler would be available for most any platform no matter what happened.

I would agree. C had already been 'winning' by the time of gcc, and offering a compiler that was so portable and generated 'reasonable' code (sometimes even better than some of the commercial ones) I think was the winning score.

...

I would also argue that non-Unix C compilers actually drove the adoption curve of C.

I would put a small accent on that. I think the C compilers that targeted non-UNIX systems, and in particular the microprocessors were the driver. The micro's started with assembler in most cases. Basic shows up and is small, but it's not good enough for real products like VisiCalc or later Lotus. Pascal tries to be the answer, but I think it suffered from the fact that it makes Pascal a production quality language, you had a extend it and everybody's extensions were different. So, C came along and was 'better than assembler' and allowed 'production quality code' to be written, but with the exception of the far pointer stuff, pretty much worked as dmr had defined it for the PDP-11. So code could be written to work between compilers and systems. When the 386 DOS extenders show up, getting rid of far, and making it a 32-bit based language like the Vax and 68000, C had won. Clem 1.] FWIW: Bricklin I know socially. He was one of my brother's quad-mates at HBS in 1978-79 when he wrote VisiCalc to do his homework [the story is on the Wikipedia page]. In fact, there is now a plaque in the shared lounge over the nook where his study carrel was when he wrote it. The four of them all did pretty well. You know Dan's story, his roommate went on to found Staples, my brother's roommate became the CEO of Pepsi, and my brother ran Milcron, then founded a materials handling firm that did the automation for Amazon (and he sold the firm a few years ago to Honeywell). Also, their section-mate was Clay Christensen of the 'Innovators Dilemma' fame and of course classmate Meg Whitman would do eBay. Pretty impressive class from HBS.

Richard Salz

5:22 p.m.

Also around that time was Leor zolman BDS C compiler for MSDOS. used by Mark of the unicorn for their MINCE editor and Scribble word processor. Vince is not complete emacs, and you can figure out where scribble came from. I bought a motorcycle off Leor :-)

Derek Fawcus

6:42 p.m.

On Sat, May 23, 2020 at 01:08:28PM -0400, Clem Cole wrote:

...

So, C came along and was 'better than assembler' and allowed 'production quality code' to be written, but with the exception of the far pointer stuff, pretty much worked as dmr had defined it for the PDP-11. So code could be written to work between compilers and systems. When the 386 DOS extenders show up, getting rid of far, and making it a 32-bit based language like the Vax and 68000, C had won.

Certainly having a flat 32 bit compiler was eventually useful, but even prior to that the impact of 'far' pointers wasn't always an issue. For simple tasks, one simpy ignored it (wrote w/o 'far'), and the compiled as either small or large memory model. It was only if one wanted to optimise the code that 'far' became an issue, and a lot of code was never shipped, so didn't need to be so optimised. Even a lot of the shipped code I worked on with those DOS based compilers simply used large memory model, and ignored 'far'. More of an issue was the segmented memory, and that structures couldn't be larger than 64k. For targetting DOS, compilers eventually offered 'huge' pointers, and possibly a 'huge' memory model which hid the problem; but were of no use in protected 16 bit mode - which the embedded RT-OS I was developing for at the time used. DF

Michael Kjörling

7:28 p.m.

On 23 May 2020 13:08 -0400, from clemc(a)ccc.com (Clem Cole):

...

I would also argue that non-Unix C compilers actually drove the adoption curve of C.

There's also the issue that, even once you get into compiled BASIC territory, those wretched vendor-unique extensions show up again. Try porting, say, a non-trivial program written for QuickBASIC to Turbo BASIC even on the same PC. Both Pascal and BASIC are hard to extend by the programmer who's actually using them to try to write useful end-user software, _particularly_ in ways that fit into the rest of the code, so you're essentially stuck with what the compiler vendor thought you would need, or what they thought you would be willing to pay for, in memory or money. On the flip side, much of C's magic really isn't in the language (which is quite, pardon me, basic), but rather in the standard library. Yes, C('s standard library) ended up with its share of vendor-specific extensions as well, but the language itself actually gave the programmer the building blocks needed to, if necessary, even implement those extensions for a different compiler; most often without resorting to more than minimal amounts of assembler, and often outright none. So you weren't stuck with what the compiler vendor gave you; it was actually possible to effectively _extend_ the language vocabulary yourself, if you felt a need to do that. I didn't do serious enough programming back during those days for that to matter to me, but now that I get paid to write software, I definitely come across situations at times where the ability to extend the language in such a manner (and have the code using those extensions read idiomatically for the language) is awful nice. -- Michael Kjörling • https://michael.kjorling.se • michael(a)kjorling.se “Remember when, on the Internet, nobody cared that you were a dog?”

Dave Horsfall

26 May 26 May

4:21 a.m.

On Sat, 23 May 2020, Clem Cole wrote:

...

[...] Pascal tries to be the answer, but I think it suffered from the fact that it makes Pascal a production quality language, you had a extend it and everybody's extensions were different.

Perhaps I'm the only one here, but when I was taught Pascal (possibly by Dr. Lions himself) it was emphasised to us that it was not a production language bur a *teaching* language; you designed your algorithm, debugged it with the Pascal compiler, then hand-translated it into your favourite language (and debugged it again :-/). That damned "pre-fill read buffer" was always a swine with interactive sessions, though; I recall Andrew Hume threatening to insert a keyboard into the terminal's CRT if he saw that "?" prompt on the Cyber... -- Dave

Ed Carp

4:32 a.m.

"Perhaps I'm the only one here..." You're not. I was taught the same thing. It was never intended to be a production language.<div id="DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2"><br /> <table style="border-top: 1px solid #D3D4DE;"> <tr> <td style="width: 55px; padding-top: 13px;"><a href="https://www.avast.com/sig-email?utm_medium=email&utm_source=… target="_blank"><img src="https://ipmcdn.avast.com/images/icons/icon-envelope-tick-round-or… alt="" width="46" height="29" style="width: 46px; height: 29px;" /></a></td> <td style="width: 470px; padding-top: 12px; color: #41424e; font-size: 13px; font-family: Arial, Helvetica, sans-serif; line-height: 18px;">Virus-free. <a href="https://www.avast.com/sig-email?utm_medium=email&utm_source=… target="_blank" style="color: #4453ea;">www.avast.com</a> </td> </tr> </table><a href="#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2" width="1" height="1"></a></div>

Rob Pike

8:21 a.m.

The peculiar input semantics of Pascal are a consequence of a locally hacked-up version of NOS (I think that's the name) that ran on the big CDC machines at ETH in Zurich. It was entirely a card-based system then, and the way Pascal required read-ahead worked perfectly on that system, but not really on any other, including other card-based, even NOS systems. I was told this when I worked on that same machine as an exchange student working at EIR outside Zurich, but not by Wirth himself. I couldn't bring myself to ask him personally. -rob

Clem Cole

2:44 p.m.

On Tue, May 26, 2020 at 4:23 AM Rob Pike <robpike(a)gmail.com> wrote:

...

Yep, NOS was always a real mess. The ASCII vs 6-bit Display code got mixed up in this too, IIRC. But again, if you think of Pascal as a teaching language under a batch system, where the student tosses in her/his program and some data to run against it. The batch queue eventually picks up your 'job', tries to compile the code, and if successful will run the executable it once on your input deck - a small light comes on. Yeah it does that just fine and it is a pretty simple model. BTW: a number of those local NOS hacks were to make the system easier to use with student batch files. I think it was Ward Cunningham that told me in the late 1970s, ETH got some of those NOS hacks from Purdue - Ward had been working in the Purdue computer center and he sent the CDC tape to them (remember Purdue was late to the Arpanet and I do not ETH was one of the few places in Europe that had connections). Sending mag tapes via mail or maybe FedEx/DHL was pretty standard in those days. Particularly within Universities, shops with the same hardware and/or OS tended to share a lot of tricks and solutions to issues. FWIW: that particular 6500 from Purdue is now at the LCM+L in Seattle.

Clem Cole

2:32 p.m.

On Tue, May 26, 2020 at 12:22 AM Dave Horsfall <dave(a)horsfall.org> wrote:

...

On Sat, 23 May 2020, Clem Cole wrote:

[...] Pascal tries to be the answer, but I think it suffered from the fact that it makes Pascal a production quality language, you had a extend it and everybody's extensions were different.

Perhaps I'm the only one here, but when I was taught Pascal (possibly by Dr. Lions himself) it was emphasised to us that it was not a production language bur a *teaching* language; you designed your algorithm, debugged it with the Pascal compiler, then hand-translated it into your favourite language (and debugged it again :-/). Dave that was exactly my point. Pascal was designed as a teaching

language so Wirth did not put things into the language that made it helpful as a production language. So everyone else tried and the language became a mess. Everybody peed on it. Dennis' quote: “When I read commentary about suggestions for where C should go, I often think back and give thanks that it wasn't developed under the advice of a worldwide crowd.” <https://www.inspiringquotes.us/quotes/eDQR_hqwtHAC9> It's not that you could not turn Pascal into a production language, but every attempt to try to do so was done in a different manner. And within firms it was always different. Eight different 'Tek Pascal' implementations -- all close, but different - he says shaking his head.

Greg A. Woods

7:50 p.m.

At Tue, 26 May 2020 10:32:43 -0400, Clem Cole <clemc(a)ccc.com> wrote: Subject: Re: [TUHS] History of popularity of C

...

Dave that was exactly my point. Pascal was designed as a teaching language so Wirth did not put things into the language that made it helpful as a production language. So everyone else tried and the language became a mess. Everybody peed on it. Dennis' quote: “When I read commentary about suggestions for where C should go, I often think back and give thanks that it wasn't developed under the advice of a worldwide crowd.” <https://www.inspiringquotes.us/quotes/eDQR_hqwtHAC9>

And that's exactly what's wrong with C now -- except it's probably even a bit worse for C as the majority of people who have been sitting on the C standards committees for the past decades are primarily either those with deeply funded agendas about how they think they can make more money with the language if only it behaves a certain way (e.g. more like C++); and/or a few academic compiler and optimizer experts who have strong ideas about how they can eek the tiniest gains from their compilers if only the spec says certain things. UB (undefined behaviour), for example, should be stricken from the standard completely and forever. Every behaviour MUST be defined, either by the implementation (with NO recourse for or fallback to UB), or, strictly defined, by the standard. -- Greg A. Woods <gwoods(a)acm.org> Kelowna, BC +1 250 762-7675 RoboHack <woods(a)robohack.ca> Planix, Inc. <woods(a)planix.com> Avoncote Farms <woods(a)avoncote.ca>

Thomas Paulsen

9:48 p.m.

...

they don't play any role, as the C language was defined decades ago. I learned it before the ansi committee came to an end by Turbo C and soon later MS C, and then various *NIX compilers. Recently I written a couple of linux programs using gcc with exactly the same syntax I studied 30 years ago, and it works pretty cool. All these programs are error free performing very fast while having a small memory footprint. For me there is nothing better than C, and I know a lot of languages.

Greg A. Woods

10:36 p.m.

At Tue, 26 May 2020 23:48:43 +0200, "Thomas Paulsen" <thomas.paulsen(a)firemail.de> wrote: Subject: Re: [TUHS] History of popularity of C

...

You might be surprised by just how much C has been changed since, say, C89, or even C90, and how niggly the corner cases can get (i.e. where UB sticks its ugly head). Lots of legacy code is now completely broken, at least with the very latest compilers (especially LLVM, but also GCC). Some far more recently written code has even had important security problems, e.g. one in the Linux kernel. NetBSD has to turn off specific "features" in the newest compilers when building the kernel lest they create a broken and/or insecure system. Some code no longer does what it seems to do unless you're the most careful language lawyer at reading it, Standard in hand, and with years of experience. Some compilers can help, e.g. by inserting illegal instructions anywhere where UB would have otherwise allowed the optimizer to go wild and possibly change things completely, but without such tools, and others such as Valgrind, one can get into a heap-o-trouble with the slightest misstep; and of course these tools only work for user-land code, not bare-metal code such as embedded systems and kernels. -- Greg A. Woods <gwoods(a)acm.org> Kelowna, BC +1 250 762-7675 RoboHack <woods(a)robohack.ca> Planix, Inc. <woods(a)planix.com> Avoncote Farms <woods(a)avoncote.ca>

Ronald Natalie

27 May 27 May

2:37 p.m.

The large areas of undefined and unspecified behavior has always been an issue in C. It was somewhat acceptable when you were using it as a direct replacement for assembler, but Java and many of other follow-ons endevaored to be more portable/rigourous. Of course, you can write crap code in any language. It didn’t take modern C to do this. On the PDP-11 (at least not in split I/D mode), location zero for example contained a few assembler instructions (p&P6) which you could print out. Split I/D and VAX implementations made this even worse by putting a 0 at location 0. When we moved from the VAX to other processors we had location zero unmapped. For the first time, accessing a null pointer ended up trapping rather than either resulting in a null (or some random data). Eventually, we added a feature to the kernel called “Braindamanged Vax compatibility Mode” that restored the zero to location zero. This was enabled by a field we could poke into the a.out header because this was needed on things we didn’t have source code to (things we did we just fixed). Similar nonsense we found where the order that function args are evaluated was relied upon. The PDP-11, etc… evaluated them right-to-left because that’s how they had to push them on the stack for the call linkage. We had one machine that did that in the opposite order (I considered flipping the compiler behavior anyhow0 and when we got to the RISC architectures, things were passed in registered so the evaluation was less predictable. I already detailed the unportability problem I found where the BSD kernel “converted by union”. The most amusing thing I’d have to say was that one day I got a knock on my office door. One of the sales guys from our sister company wanted to know if I could write some Novell drivers for an encrypting ethernet card they were selling. The documentation for writing the driver was quite detailed but all describing i386 assembler interfaces (and the examples were in assembler). About a week into the project I came to realization that the linkages were all the C subroutine calls for that platform. The caller was C and there was no particular reason why the driver wasn’t also written in C.

Clem Cole

3:09 p.m.

Henry Spencer's 10 Commandments for C Programmers <https://www.seebs.net/c/10com.html> On Wed, May 27, 2020 at 10:38 AM Ronald Natalie <ron(a)ronnatalie.com> wrote:

...

Thomas Paulsen

4:11 p.m.

...

One cannot compare system and business related stuff! When I'm doing C I always have the CPU and its instructions in mind. As Linus I see the assembly code in my inner eyes. For such minds, doing with C what earlier was done with assembly, C was created, whereas writing business applications cobol and its modern relative java are the first choices.

Greg A. Woods

7:49 p.m.

At Wed, 27 May 2020 18:11:33 +0200, "Thomas Paulsen" <thomas.paulsen(a)firemail.de> wrote: Subject: Re: [TUHS] History of popularity of C

...

When I'm doing C I always have the CPU and its instructions in mind.

And that's exactly what might trip you up unless you _exactly_ understand how the language standard defines the operations of the abstract virtual machine (right down to the implications of every sequence point in the code); how compilers and optimizers do and (more importantly) do not work when mapping the abstract virtual machine operations into real-world machine instructions; and what how _all_ instances of "undefined behaviour" can arise, and exactly what the optimizer is allowed to do when and if it spots UB conditions in the code. A big part of the problem is that the C Standard mandates compilation will and must succeed (and allows this success to be totally silent too) even if the code contains instances of undefined behaviour. This means that the successful execution of the generated code may depend on what optimization level was chosen. Code that does security tests on input values might be entirely and silently eliminated by the optimizer because of some innocuous-seeming UB instance, and this is exactly what has happened in the Linux kernel, for example (probably more than once). UB can be introduced quite innocently just by moving sequence points in variable references in ways that are not necessarily obvious even to seasoned programmers (and indeed "seasoned" programmers are often the ones who's old-fashioned coding habits might lead to introduction of serious problems in such a way). I've found dozens of instances of UB in mature and well tested code, and sometimes only by luck of having chosen the "right" compiler and enabled its feature of introducing illegal instructions in places where UB might occur, _and_ having had the luck to test in such a way as to encounter the specific code path where this UB occurred. I would claim it's truly safer now to write C without understanding the underlying mechanics of the CPU and memory, but rather by just paying very close attention to the detailed semantics of the language, understanding only the abstract virtual C machine, and hoping your compiler will at least warn if anything even remotely suspicious is done in your code; and lastly (but perhaps most importantly) avoiding like the plague any coding constructs which might make UB harder to spot (e.g. never ever initialize local variables with their definition when pointers are involved). Unfortunately the new "most advanced" C compilers also make it quite a bit more difficult for those of us writing C code that must have specific actions on the bare metal hardware, e.g. in embedded systems, kernels, hardware drivers, etc.; including especially where UB detection tools are far more difficult to use. -- Greg A. Woods <gwoods(a)acm.org> Kelowna, BC +1 250 762-7675 RoboHack <woods(a)robohack.ca> Planix, Inc. <woods(a)planix.com> Avoncote Farms <woods(a)avoncote.ca>

Larry McVoy

8:13 p.m.

So I may have just gotten lucky in my 30+ years of writing C code but I have yet to hit a single instance of this doom and gloom. On Wed, May 27, 2020 at 12:49:25PM -0700, Greg A. Woods wrote:

...

At Wed, 27 May 2020 18:11:33 +0200, "Thomas Paulsen" <thomas.paulsen(a)firemail.de> wrote: Subject: Re: [TUHS] History of popularity of C

When I'm doing C I always have the CPU and its instructions in mind.

-- --- Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm

Richard Salz

8:23 p.m.

Places where I've seen it, crypto code in OpenSSL. Trying to zero-ize key material, the compiler sees that "char key[]" isn't used any more, and optimizes-away the memcmp. Trying to do constant-time math.

Nevin Liber

9 p.m.

On Wed, May 27, 2020 at 2:50 PM Greg A. Woods <woods(a)robohack.ca> wrote:

...

A big part of the problem is that the C Standard mandates compilation will and must succeed (and allows this success to be totally silent too) even if the code contains instances of undefined behaviour.

No it does not. To quote C11: undefined behavior behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Much UB cannot be detected at compile time. Much UB is too expensive to detect at run time. Take strlen(const char* s) for example. s must be a valid pointer that points to a '\0'-terminated string. How would you detect that at compile time? How would you set up your run time to detect that and error out? How would you design your codegen and runtime to detect and error out when UB is invoked in this code: #include <stdio.h> #include <string.h> void A(const char* a, const char* b) { printf("%zu %zu\n", strlen(a), strlen(b)); } // Separate compilation unit int main() { const char a[] = {'A'}; const char b[] = {'\0'}; A(a, b); } -- Nevin ":-)" Liber <mailto:nl <nevin@eviloverlord.com>iber@gmail.com> +1-847-691-1404

Greg A. Woods

11:17 p.m.

At Wed, 27 May 2020 16:00:57 -0500, Nevin Liber <nliber(a)gmail.com> wrote: Subject: Re: [TUHS] History of popularity of C

...

On Wed, May 27, 2020 at 2:50 PM Greg A. Woods <woods(a)robohack.ca> wrote:

No it does not. To quote C11: undefined behavior behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

Sorry, I concede. Yes, "no requirements". In C99 at least. Sadly most compilers, including GCC and Clang/LLVM will, at best, warn (and warnings are only treated as errors by the most macho|wise); and compilers only do that now because they've been getting flack from developers whenever the optimizer does something unexpected.

...

Much UB cannot be detected at compile time. Much UB is too expensive to detect at run time.

Indeed. At best you can get a warning, or optional runtime code to abort the program. Now this isn't a problem when "undefined behaviour" becomes "implementation defined behaviour" for a given implementation. However that's not portable obviously, except for the trivial cases where the common compilers for a given type of platform all do the same things. The real problems though arise when the optimizer takes advantage of these rules regardless of what the un-optimized code will do on any given platform and architecture. The Linux kernel example I've referred to involved dereferencing a pointer to do an assignment in a local variable definition, then a few lines later testing if the pointer was NULL before using the local variable. Unoptimised the code will dereference a NULL pointer and load junk from location zero into the variable (because it's kernel code), then the NULL test will trigger and all will be good. The optimizer rips out the NULL check because "obviously" the programmer has assumed the pointer is always a valid non-NULL pointer since they've explicitly dereferenced it before checking it and they wouldn't want to waste even a single jump-on-zero instruction checking it again. (It's also quite possible the code was written "correctly" at first, then someone mushed all the variable initialisations up onto their definitions.) In any case there's now a GCC option: -fno-delete-null-pointer-checks (to go along with -fno-strict-aliasing and -fno-strict-overflow, and -fno-strict-enums, all of which MUST be used, and sometimes -fno-strict-volatile-bitfields too, on all legacy code that you don't want to break) It's even worse when you have to write bare-metal code that must explictly dereference a NULL pointer (a not-so-real example: you want to use location zero in the CPU zero-page (e.g. on a 6502 or 6800, or PDP-8, etc.) as a pointer) -- it is now impossible to do that in strict Standard C even though trivially it "should just work" despite the silly rules. As far as I can tell it always did just work in "plain old" C. The crazy thing about modern optimizers is that they're way more persistent and often somewhat more clever than your average programmer. They follow all the paths. They apply all the rules at every turn.

...

Take strlen(const char* s) for example. s must be a valid pointer that points to a '\0'-terminated string. How would you detect that at compile time? How would you set up your run time to detect that and error out?

My premise is that you shouldn't try to detect this problem, AND in any case where the optimizer might be able to prove the pointed at object isn't a valid string it should not, and must not, abuse that knowledge to rip out code or cause other even worse mis-behaviour. I.e. this should not be "undefined", but rather "implementation defined and without any recourse to allowing optimizer abuses". -- Greg A. Woods <gwoods(a)acm.org> Kelowna, BC +1 250 762-7675 RoboHack <woods(a)robohack.ca> Planix, Inc. <woods(a)planix.com> Avoncote Farms <woods(a)avoncote.ca>

Dave Horsfall

5 Jun 5 Jun

8:57 p.m.

On Wed, 27 May 2020, Greg A. Woods wrote:

...

Sadly most compilers, including GCC and Clang/LLVM will, at best, warn (and warnings are only treated as errors by the most macho|wise); and compilers only do that now because they've been getting flack from developers whenever the optimizer does something unexpected.

Don't talk to me about optimisers... That's not the code that I wrote! I've seen code simply disappear, because the "optimiser" though that it was cleverer than I was.

...

The Linux kernel example I've referred to involved dereferencing a pointer to do an assignment in a local variable definition, then a few lines later testing if the pointer was NULL before using the local variable. Unoptimised the code will dereference a NULL pointer and load junk from location zero into the variable (because it's kernel code), then the NULL test will trigger and all will be good. The optimizer rips out the NULL check because "obviously" the programmer has assumed the pointer is always a valid non-NULL pointer since they've explicitly dereferenced it before checking it and they wouldn't want to waste even a single jump-on-zero instruction checking it again. (It's also quite possible the code was written "correctly" at first, then someone mushed all the variable initialisations up onto their definitions.)

Typical Penguin/OS behaviour...

...

In any case there's now a GCC option: -fno-delete-null-pointer-checks (to go along with -fno-strict-aliasing and -fno-strict-overflow, and -fno-strict-enums, all of which MUST be used, and sometimes -fno-strict-volatile-bitfields too, on all legacy code that you don't want to break)

I'm sure that there's a competition somewhere, to see who can come with GCC's -fmost-longest-and-most-obscure-option flags...

...

It's even worse when you have to write bare-metal code that must explictly dereference a NULL pointer (a not-so-real example: you want to use location zero in the CPU zero-page (e.g. on a 6502 or 6800, or PDP-8, etc.) as a pointer) -- it is now impossible to do that in strict Standard C even though trivially it "should just work" despite the silly rules. As far as I can tell it always did just work in "plain old" C.

I've programmed a PDP-8! 'Twas way back in high school, and I found a bug in my mentor's program; it controlled traffic lights...

...

The crazy thing about modern optimizers is that they're way more persistent and often somewhat more clever than your average programmer. They follow all the paths. They apply all the rules at every turn.

Optimisers... Grrr... -- Dave

Nemo Nusquam

9:40 p.m.

On 06/05/20 16:57, Dave Horsfall wrote (in part):

...

[...] Optimisers... Grrr...

Steve Johnson's position paper on optimising compilers may amuse you: https://dl.acm.org/doi/abs/10.1145/567532.567542 N.

...

-- Dave

Richard Salz

9:47 p.m.

...

I'm sure that there's a competition somewhere, to see who can come with GCC's -fmost-longest-and-most-obscure-option flags...

At least one of the GCC maintainers is German, so possibly. Can clang keep up? :)

Bakul Shah

10:01 p.m.

On Jun 5, 2020, at 2:47 PM, Richard Salz <rich.salz(a)gmail.com> wrote:

...

| I'm sure that there's a competition somewhere, to see who can come with | GCC's -fmost-longest-and-most-obscure-option flags... At least one of the GCC maintainers is German, so possibly. Can clang keep up? :)

Clang has more than kept up! clang: -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang<value> gcc-9: -print-sysroot-headers-suffix Not counting gcc's --help={common|optimizers|params|target|warnings|[^]{joined|separate|undocumented}}[,...].

Ed Carp

6 Jun 6 Jun

8:49 p.m.

On 5/27/20, Ronald Natalie <ron(a)ronnatalie.com> wrote:

...

"It's not a bug, it's a feature" C was written when the programmer had to be more rigorous instead of just letting things slide and having the language do their thinking for them. I remember being laughed at for using static arrays instead of malloc() and friends, until people found out that safety-critical systems were written the same way. I have C code that was written 35 years ago that's still in production. Back then, you had to be careful, and you actually had to think about what you were writing. We've gotten soft and lazy, and now we're paying for it.

Thomas Paulsen

9:08 p.m.

'C was written when the programmer had to be more rigorous instead of just letting things slide and having the language do their thinking for them. ' I fully subscribe to that. Today the company owners have to pay a lot for programmers having the language do their thinking for them. The memory hunger of the soa java business services of the company I worked prior to retirement, is sheer endless. Arnold once told that there is more demand for C developers in Israel. I envy you

Larry McVoy

9:13 p.m.

I did one stint at a Java shop, Charles Schwab's web group. No talent, no architecture, no vision. Lots of politics and back stabbing and claiming credit for other people's work. Totally toxic, hands down the worst job I've ever had. I lasted less than 6 months and am surprised I made it that far. On Sat, Jun 06, 2020 at 11:08:43PM +0200, Thomas Paulsen wrote:

...

-- --- Larry McVoy lm at mcvoy.com http://www.mcvoy.com/lm

Ed Carp

10:27 p.m.

"Arnold once told that there is more demand for C developers in Israel. I envy you" Maybe I ought to move to Israel. Sounds like they have more common sense there.<div id="DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2"><br /> <table style="border-top: 1px solid #D3D4DE;"> <tr> <td style="width: 55px; padding-top: 13px;"><a href="https://www.avast.com/sig-email?utm_medium=email&utm_source=… target="_blank"><img src="https://ipmcdn.avast.com/images/icons/icon-envelope-tick-round-or… alt="" width="46" height="29" style="width: 46px; height: 29px;" /></a></td> <td style="width: 470px; padding-top: 12px; color: #41424e; font-size: 13px; font-family: Arial, Helvetica, sans-serif; line-height: 18px;">Virus-free. <a href="https://www.avast.com/sig-email?utm_medium=email&utm_source=… target="_blank" style="color: #4453ea;">www.avast.com</a> </td> </tr> </table><a href="#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2" width="1" height="1"></a></div> On 6/6/20, Thomas Paulsen <thomas.paulsen(a)firemail.de> wrote:

...

Tyler Adams

11:14 p.m.

...

"C was written when the programmer had to be more rigorous instead of just letting things slide and having the language do their thinking for them"

True, but a wise man once said "let the computer do the dirty work". Tyler On Sun, Jun 7, 2020 at 1:28 AM Ed Carp <erc(a)pobox.com> wrote:

...

"Arnold once told that there is more demand for C developers in Israel. I envy you" Maybe I ought to move to Israel. Sounds like they have more common sense there.<div id="DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2"><br /> <table style="border-top: 1px solid #D3D4DE;"> <tr> <td style="width: 55px; padding-top: 13px;"><a href=" https://www.avast.com/sig-email?utm_medium=email&utm_source=link&ut… " target="_blank"><img src=" https://ipmcdn.avast.com/images/icons/icon-envelope-tick-round-orange-anima… " alt="" width="46" height="29" style="width: 46px; height: 29px;" /></a></td> <td style="width: 470px; padding-top: 12px; color: #41424e; font-size: 13px; font-family: Arial, Helvetica, sans-serif; line-height: 18px;">Virus-free. <a href=" https://www.avast.com/sig-email?utm_medium=email&utm_source=link&ut… " target="_blank" style="color: #4453ea;">www.avast.com</a> </td> </tr> </table><a href="#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2" width="1" height="1"></a></div> On 6/6/20, Thomas Paulsen <thomas.paulsen(a)firemail.de> wrote:

company

I worked prior to retirement, is sheer endless. Arnold once told that there is more demand

for

C developers in Israel. I envy you

arnold＠skeeve.com

7 Jun 7 Jun

5:57 a.m.

Ed Carp <erc(a)pobox.com> wrote:

...

"Arnold once told that there is more demand for C developers in Israel. I envy you"

The market in Israel for software developers is VERY hot. Based entirely on the emails I get from Linked-In about jobs that may interest me, there's some C, but a lot more C++, both Windows and Linux. Also a lot of Python.

...

Maybe I ought to move to Israel.

Moving here isn't a trivial decision, especially if you don't speak any Hebrew. Off-topic. Sorry.

...

Sounds like they have more common sense there.

We do, but the strong influence of western (US) culture is eroding it, which is saddening and frustrating. This definitely off-topic. Arnold

Andy Kosela

9:22 a.m.

On 6/7/20, arnold(a)skeeve.com <arnold(a)skeeve.com> wrote:

...

Ed Carp <erc(a)pobox.com> wrote:

"Arnold once told that there is more demand for C developers in Israel. I envy you"

Seriously, is anyone still doing any real development in C besides kernel programming and embedded world?? Maybe I was living under a rock, but I always had an impression that the industry moved to C++ in the late 90s and stayed with it ever since. The last bastion of C was open source Linux/*BSD programming but I remember the time when C was a truly universal programming language used for _everything_ including games (e.g. Doom). Maybe I just miss the 90s. --Andy

Ed Carp

9:39 a.m.

On 6/7/20, Andy Kosela <akosela(a)andykosela.com> wrote:

...

Absolutely. C++ isn't the panacea that it's made out to be, it's not "superior" to C++, it's just C with other useful stuff bolted on, but it didn't make C obsolete, not by a long shot. Some find the operator overloading and classes to be a lot more confusing than just function calls and such, without adding a lot to the language itself. Others, of course, feel differently. It's more of a religious discussion than anything, like debating the merits of vi vs. emacs - a pointless discussion, since you're not going to change anyone else's mind anyway. Lots of C still being written out there. My most recent project was writing a crypto library in C. Writing it was the hard part - validating it was something else entirely. One of the nicest things about C++ is that you can write your code entirely in C and the C++ compiler will compile it, no problem.

Brantley Coile

10:02 a.m.

This might register low on the useful information index, but I decided a few years back as VC Coraid was coming apart and New Coraid was being resurrected under the auspices of SouthSuite Inc., that I would have a C mono-culture and use the language for everything. Specifically no JavaScript and no Python. Our website is written in C. Webpages generated on the server side using C. Our evolving ERP is written in C. Our subscription system is written in C. Our test systems are written in it. All work quite well. The focus on C was for three main reasons. First, our products are infrastructure products and are meant to be simple, fast, and affordable. Forty-two years of experience plus C gives us the ability to squeeze all the performance possible from any hardware platform. We "see" the instructions our code will generate. C is a great choice for that. (As would have been Oberon, but that's another discussion.) Second, if instead of having a set of complex languages, each with its own adherents, using a single language removes all the distracting and divisive language wars having multiple complex languages create. Little language like AWK and the shell script are fine. It's the more complex ones that divide people. Lastly, a single, powerful, simple (on the other side of complexity) language that a single person can maintain is essential to our Software Atelier model of doing business. Like the workshops of the renaissance, we have to understand and work on all our own tools. I use Ken's C compiler under Plan 9. It weights in at a light, 20K lines of code. As I said, I'm not sure how useful this data point is for you. Over the last thirty years I carefully chose my foot falls through the software swamp to avoid getting sucked under by one of the quagmires of complexity. Brantley

...

On Jun 7, 2020, at 5:22 AM, Andy Kosela <akosela(a)andykosela.com> wrote: On 6/7/20, arnold(a)skeeve.com <arnold(a)skeeve.com> wrote:

Ed Carp <erc(a)pobox.com> wrote:

"Arnold once told that there is more demand for C developers in Israel. I envy you"

Thomas Paulsen

11:30 a.m.

...

I repeat myself: C never was intended for writing business applications. It was made for OS development in the sense of a portable assembler replacement. It also has found itself as the main devel language for fundamental services like db engines etc.

Clem Cole

3:26 p.m.

On Sun, Jun 7, 2020 at 5:23 AM Andy Kosela <akosela(a)andykosela.com> wrote:

...

Hardly, in my 45+ years, I have seen way more C projects than C++. The C projects tended to last longer, have a more profound impact and many are still being developed. I've been at Intel for the last qtr of my career and without a doubt, C is the #1 language in use internally, with C++ probably #2. Most of our work is actually in user space, although obviously we do a great deal of low-level work. The Intel compilers are a mix of both languages. That said, as we move to more LLVM work (we actually have the largest number of paid LLVM developers ). For your amusement check out: https://www.archer.ac.uk/status/codes/ Archer is a large HPC site in the UK. Many supercomputer centers off similar stats, but I often point to Archer because it easy to get an understanding of what programming languages are used for codes that are actually used in day-to-day production. That said, I personally am the most excited about Go theses day, but I'm also thinking Rust looks pretty interesting, but my experience with both compared to C is extremely nominal. Neither language is used for anything in production in our world at this point. Clem

Larry McVoy

3:52 p.m.

On Sun, Jun 07, 2020 at 11:26:45AM -0400, Clem Cole wrote:

...

That said, I personally am the most excited about Go theses day, but I'm also thinking Rust looks pretty interesting, but my experience with both compared to C is extremely nominal. Neither language is used for anything in production in our world at this point.

If I had to move to a modern language it would be Go. I looked at Rust and barfed. But if I had the juice, I'd just evolve C. Make a C+ that isn't object oriented but does have some of the stuff that modern languages have. I'd pretty much take http://little-lang.org and evolve C to that.

Adam Thornton

8 Jun 8 Jun

1:02 a.m.

...

On Jun 7, 2020, at 8:52 AM, Larry McVoy <lm(a)mcvoy.com> wrote: On Sun, Jun 07, 2020 at 11:26:45AM -0400, Clem Cole wrote:

If I had to move to a modern language it would be Go. I looked at Rust and barfed.

Several years ago, this was a job talk I gave, based on my experience at the time developing a pretty nifty system that never found traction. The new job (I got it!) doesn’t use Go, so I’ve grudgingly gone back to Python. But I stand by most of what I wrote (although I am sure parts of it are outdated and wrong now). The tl;dr is the title of the talk: https://athornton.github.io/go-it-mostly-doesnt-suck I make the claim that Go *is* pretty much C with 35 years of lessons learned about what did and didn’t work in C, and 35 years of machine time getting cheaper and programmer time getting more expensive. Adam

Thomas Paulsen

8:04 a.m.

...

I make the claim that Go *is* pretty much C with 35 years of lessons learned about what did and didn’t work in C, and 35 years of machine time getting cheaper and programmer time getting more expensive.

see Rob Pike 'Notes on Programming in C' freely available in the internet at: https://www.lysator.liu.se/c/pikestyle.html

Bakul Shah

7 Jun 7 Jun

5:26 p.m.

On Jun 7, 2020, at 8:26 AM, Clem Cole <clemc(a)ccc.com> wrote:

...

People who use Rust seem to really like it but so far I have not done anything in it. I will take another look if I do any bare metal coding. In contrast I use Go for almost all my own coding now. I am not entirely happy with it but the ecosystem around it is great. And the really nice thing about it is that not only I can cross-compile programs but also, I believe it is the only compiled language (much more so than C) where a lot of stuff just works on on plan9.

Bakul Shah

5:35 p.m.

On Jun 7, 2020, at 8:26 AM, Clem Cole <clemc(a)ccc.com> wrote:

...

Nemo Nusquam

6:50 p.m.

On 06/07/20 11:26, Clem Cole wrote (in part):

...

They seem to be used in some worlds: https://blog.golang.org/10years and https://www.rust-lang.org/production N.

Chris Torek

9:15 p.m.

Yes, both Rust and Go are being used. Rust has some advantages: like C++, it can compile to very fast code that does not need a garbage collector but that provides type-safety. It also gives you thread-safety through clever compiler analysis of who "owns" any given variable or data. The ownership / lifetime-analysis / borrow-checker is quite complicated and takes a lot of getting-used-to. I have not written anything "real" in Rust and had not had time to really learn it (I was hoping to learn it for real and write code in it the last few years, but that never actually happened). Meanwhile, Go is actually a really nice language to use. It has a few quirks, but it gives you reasonably-fast-running code that (because there is a garbage collector) does not require nearly as much skull-sweat when figuring out who owns memory and how to make sure it gets released appropriately. Its built in channels and goroutine support makes multi-threaded cod, and using all the CPUs effectively, also much easier. You pay (sometimes noticeably) for the GC, but the price is not too bad in less time-critical situations. The GC has a few short stop-the-world points but since Go 1.6 or so, it's pretty smooth, unlike what I remember from 1980s Lisp systems. :-) (Note: I started with Go 1.11, so I don't have a lot of history, nor that much experience in it. But I do like it.) Both Go and Rust have build systems. In Rust, this is a separate front-end from the compiler proper (rustc): you run "cargo build", for instance. In Go, you run "go build", "go test", etc., to build and invoke unit tests and so on. Rust has generics (think C++ templates, except sane) and Go lacks them although there's a plan for them in Go 2.0. Rust has, or at least had, rather bad array support when I last looked at it: you could make an array out of any type, but with only up to 32 elements. Chris

Dan Cross

10:16 p.m.

On Sun, Jun 7, 2020 at 5:37 PM Chris Torek <torek(a)torek.net> wrote:

...

Yes, both Rust and Go are being used.

Indeed. The languages complement each other quite nicely. Disclaimer: I've been programming in Rust for my day job for almost two years now and I sit in the office next to the Go tech lead and in close proximity a number of people working on Go. Rust has some advantages: like C++, it can compile to very fast

...

code that does not need a garbage collector but that provides type-safety. It also gives you thread-safety through clever compiler analysis of who "owns" any given variable or data.

Safe Rust code is data race free, but not free of race conditions, let alone thread safe in all ways. The ownership / lifetime-analysis / borrow-checker is quite

...

complicated and takes a lot of getting-used-to.

Once you get used to it, it doesn't feel that complex but yes, it takes a while to get used to. When we switched our project to it (from C++, at my behest) our tech lead quipped that Rust has a "near vertical" learning curve. Suffice it to say we're all quite glad we switched but there was an adjustment period. Interestingly, we find it a selling point for attracting new engineers regardless of the challenges in coming up to speed. [snip] Rust has, or at least had, rather bad

...

array support when I last looked at it: you could make an array out of any type, but with only up to 32 elements.

I'm afraid this is incorrect. Rust arrays are indexed by a `usize`, which is basically whatever `size_t` would be in C. Rust arrays in general can be essentially arbitrarily large (up to limitations imposed by the target machine, of course). However, Rust does not support dependent types, most certainly not for arrays. In other words, an array's size is considered part of its type and so when specializing traits on arrays, one must do so explicitly for each supported array size. For practical reasons such implementations are often limited to a relatively small number of distinct sizes; 32 is a believable number. Perhaps that's what you're thinking of? For example, the `Vec<T>` type was recently modified to implement the `From` trait on arrays, but only on arrays up to size 32. This means that one can essentially create a dynamic, growable vector seeded from an array, but such source arrays are limited in how large they can be. My current biggest frustrations are in assumptions made by the memory allocator, the magical nature of `box` (the allocation primitive), and the under-defined memory model. - Dan C.

Chris Torek

10:56 p.m.

...

Safe Rust code is data race free, but not free of race conditions, let alone thread safe in all ways.

Er, yes. I mainly wanted to contrast to Go, where you look for race conditions by building with a flag that enables runtime checking. This can only detect races that actually occur, and if there are paths that would still have races that didn't occur on your test run, well...

...

>[rust array limitations]

...

However, Rust does not support dependent types, most certainly not for arrays. In other words, an array's size is considered part of its type and so when specializing traits on arrays, one must do so explicitly for each supported array size. For practical reasons such implementations are often limited to a relatively small number of distinct sizes; 32 is a believable number. Perhaps that's what you're thinking of?

Ah, yes, that was it. (As I said, I never really had time to do anything "real" in Rust.) Go's slices are nice to use, but slices (and maps, for that matter) trip people up because they are headers that point to shared spaces. Chris

Warren Toomey

11:14 p.m.

New subject: Comparative languages

All, as the discussion on C has moved on to other languages, we might move the followups to COFF. Thanks, Warren

Bram Wyllie

8 Jun 8 Jun

12:24 a.m.

Dependent types aren't needed for sum types though, which is what you'd normally use for an array that carries its size, correct? On Sun, Jun 7, 2020 at 7:57 PM Chris Torek <torek(a)torek.net> wrote:

...

Safe Rust code is data race free, but not free of race conditions, let alone thread safe in all ways.

>[rust array limitations]

Lars Brinkhoff

5:48 a.m.

Chris Torek wrote:

...

You pay (sometimes noticeably) for the GC, but the price is not too bad in less time-critical situations. The GC has a few short stop-the-world points but since Go 1.6 or so, it's pretty smooth, unlike what I remember from 1980s Lisp systems. :-)

I'm guessing those 1980s Lisp systems would also be pretty smooth on 2020s hardware.

Bakul Shah

6 Jun 6 Jun

11:31 p.m.

On Jun 6, 2020, at 1:49 PM, Ed Carp <erc(a)pobox.com> wrote:

...

On 5/27/20, Ronald Natalie <ron(a)ronnatalie.com> wrote:

"It's not a bug, it's a feature"

A snippet of a recent comp.arch post by someone (the subject was C and safety): What you call "misfeatures", some other people call "features". If you expect people to take you and your opinions seriously, you'll get on better if you stop mocking other opinions. I've written several times why undefined behaviour lets me write better and safer code, as well as more efficient code. If you remain determinedly unconvinced, at least agree to disagree without sounding childish about it.

Greg A. Woods

7 Jun 7 Jun

12:12 a.m.

At Sat, 6 Jun 2020 16:31:42 -0700, Bakul Shah <bakul(a)iitbombay.org> wrote: Subject: Re: [TUHS] History of popularity of C

...

On Jun 6, 2020, at 1:49 PM, Ed Carp <erc(a)pobox.com> wrote:

On 5/27/20, Ronald Natalie <ron(a)ronnatalie.com> wrote:

"It's not a bug, it's a feature"

Heh. W.r.t. efficiency, well undefined behaviour does allow the compiler to turn their code, or anyone's else's code, into more "efficient" code if they happen to (accidentally or otherwise) trip over undefined behaviour. However I don't think it can be argued in any valid way that "undefined behaviour" can ever lead to "better and safer" code, in any way, or from any viewing angle, whatsoever. "Undefined behaviour" just means that the language definition is somehow adversely compromised in such a way that it is impossible to prevent the programmer from writing compilable and executable code that will always produce some well defined behaviour in all standards-compliant implementations. I.e. the language allows that there are ways to write syntactically correct code that cannot be guaranteed to do anything particular whatsoever in _all_ standards-compatible implementations. We can argue until the cows come home whether "undefined behaviour" is a "necessary" part of the language definition (e.g. to keep the language implementable, or backward-compatible, or whatever), but I don't see how any valid argument can ever be made for it being a "good" and "useful" thing from the perspective of a programmer using the language. Undefined behaviours are black holes for which the language standard offers no real guidance nor maps for safe passage other than the stern warning to avoid them as best as possible. Perhaps it is such scare-mongering that the author above justifies as their influence to write "better and safer" code, but that's no good argument for having such pits of despair in the language definition in the first place. If we were arguing theology then I would say the bible we call the "C Standard" is actually actively trying to trap its followers into committing sins. Luckily the real world of C is made of actual implementations, and they are free to either offer definitions for how various (ab)uses of the language will work, or to maintain the black holes of mystery that we must try to avoid, or even sometimes to give us the choice in how they will treat our code. As programmers we should try to choose which implementation(s) we use, and how we control _their_ behaviour, while at the same time still doing our best to avoid giving them the rope to hang us with. -- Greg A. Woods <gwoods(a)acm.org> Kelowna, BC +1 250 762-7675 RoboHack <woods(a)robohack.ca> Planix, Inc. <woods(a)planix.com> Avoncote Farms <woods(a)avoncote.ca>

emanuel stiebler

11:04 a.m.

On 2020-06-06 16:49, Ed Carp wrote:

...

C was written when the programmer had to be more rigorous instead of just letting things slide and having the language do their thinking for them. I remember being laughed at for using static arrays instead of malloc() and friends, until people found out that safety-critical systems were written the same way.

Still avoiding malloc and friends in safety critical systems, for a good reason ...

Thomas Paulsen

11:33 a.m.

...

Still avoiding malloc and friends in safety critical systems, for a good reason ...

all modern languages allowing dynamical memory allocation one way or another.

Toby Thain

26 May 26 May

3:19 p.m.

On 2020-05-26 12:21 AM, Dave Horsfall wrote:

...

On Sat, 23 May 2020, Clem Cole wrote:

[...] Pascal tries to be the answer, but I think it suffered from the fact that it makes Pascal a production quality language, you had a extend it and everybody's extensions were different.

Prof. Knuth came up with an interesting solution to that -- in the process, inventing (or maturing) the concept of "literate programming". Perhaps it's not well known that his most widely used programs (e.g. TeX) were written in something VERY close to standard Pascal (preprocessing aside). The translation to C (as required by certain platforms) was mechanical. --Toby

...

That damned "pre-fill read buffer" was always a swine with interactive sessions, though; I recall Andrew Hume threatening to insert a keyboard into the terminal's CRT if he saw that "?" prompt on the Cyber... -- Dave

Thomas Paulsen

4 p.m.

...

Dr. Lions himself) it was emphasised to us that it was not a production language bur a *teaching* language;

In the early 90ths I written some larger programs in Turbo Pascal after years of intensively working with my favored C&C++ language, and was surprised how well designed the Borland language was. Thus, recently I installed Free-Pascal with its comfortable IDE and since then I'm wondering why they always inventing new languages as these 'old' C&Pascal languages are so well designed and implemented, that I can't imagine that anything else is really needed.

Christopher Browne

4:21 p.m.

On Tue, 26 May 2020 at 12:01, Thomas Paulsen <thomas.paulsen(a)firemail.de> wrote:

...

Dr. Lions himself) it was emphasised to us that it was not a production language bur a *teaching* language;

I remember the fighting going on at that time. I did some Pascal in about 1986, with one of the Waterloo compilers, and found it mildly a pain in the neck; it was a reasonably-nearly-strict version of the academic language, and was painful for non-academic programming for the reasons normally thrown about. In grad school, I TA'ed a course that was using TurboPascal, and it was definitely a reasonable extension towards usability for larger programs that needed more sophisticated environmental interactions. The compiler was decently fast (unlike Ada, anyone??? ;-) ), and the makers were selective and adequately opinionated as to their extensions. And I fully recall the split ongoing, as academic folk would regard TurboPascal as "non-conformant" with the standard, whilst bwk's missive on "Why Pascal Is Not My Favorite Language" provides a good explanation... And bwk nicely observed, "Because the language is so impotent, it must be extended. But each group extends Pascal in its own direction, to make it look like whatever language they really want." The Modula family seemed like the better direction; those were still Pascal-ish, but had nice intentional extensions so that they were not nearly so "impotent." I recall it being quite popular, once upon a time, to write code in Modula-2, and run it through a translator to mechanically transform it into a compatible subset of Ada for those that needed DOD compatibility. The Modula-2 compilers were wildly smaller and faster for getting the code working, you'd only run the M2A part once in a while (probably overnight!) -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"

Thomas Paulsen

7:29 p.m.

Dan Cross

7:55 p.m.

Cc: to COFF, as this isn't so Unix-y anymore. On Tue, May 26, 2020 at 12:22 PM Christopher Browne <cbbrowne(a)gmail.com> wrote:

...

[snip] The Modula family seemed like the better direction; those were still Pascal-ish, but had nice intentional extensions so that they were not nearly so "impotent." I recall it being quite popular, once upon a time, to write code in Modula-2, and run it through a translator to mechanically transform it into a compatible subset of Ada for those that needed DOD compatibility. The Modula-2 compilers were wildly smaller and faster for getting the code working, you'd only run the M2A part once in a while (probably overnight!)

Wirth's languages (and books!!) are quite nice, and it always surprised and kind of saddened me that Oberon didn't catch on more. Of course Pascal was designed specifically for teaching. I learned it in high school (at the time, it was the language used for the US "AP Computer Science" course), but I was coming from C (with a little FORTRAN sprinkled in) and found it generally annoying; I missed Modula-2, but I thought Oberon was really slick. The default interface (which inspired Plan 9's 'acme') had this neat graphical sorting simulation: one could select different algorithms and vertical bars of varying height were sorted into ascending order to form a rough triangle; one could clearly see the inefficiency of e.g. Bubble sort vs Heapsort. I seem to recall there was a way to set up the (ordinarily randomized) initial conditions to trigger worst-case behavior for quick. I have a vague memory of showing it off in my high school CS class. - Dan C.

Jon Steinhart

8 p.m.

Dan Cross writes:

...

Of course Pascal was designed specifically for teaching. I learned it in high school ...

I had a different experience; I learned C in high school at BTL and then took my first programming class in college which was Pascal and I kept finding it extremely difficult to use because it was so much less flexible than C. Until I took that class it had never even occurred to me that people would write books about the topic as I had leaned from technical memoranda. There were two books in this class, Wirth's and Fundamental Algorithms. Got Don to sign my copy a few years ago which he said he wouldn't do unless it looked really used. Jon

Jim Capp

21 May 21 May

4:18 p.m.

Again, based on recollections, what got me immediately interested was that I regarded C as a "portable assembler". It was one of the earliest implementations of "write-once-run-anywhere". From: "Tyler Adams" <coppero1237(a)gmail.com> To: "The Eunuchs Hysterical Society" <tuhs(a)tuhs.org> Sent: Thursday, May 21, 2020 11:27:26 AM Subject: [TUHS] History of popularity of C Does anybody have any good resources on the history of the popularity of C? I'm looking for data to resolve a claim that C is so prolific and influential because it's so easy to write a C compiler. Tyler

A. P. Garcia

6:58 p.m.

...

From memory, there is a History of Programming Languages book from an ACM

conference that contains some papers that were presented there, along with some notes from Q&A sessions that followed. I'm paraphrasing, but Dennis Ritchie said something flattering about Pascal, that it was essentially the same language as C. Given this, asked Niklaus Wirth, why do you suppose that C is so much more popular than Pascal? Ritchie answered, "I don't know". My personal opinion is that Ken Thompson is not given enough credit for the beauty and expressiveness of C, as much of this comes from its predecessor, B, which is essentially Thompson's "remix" of BCPL. On Thu, May 21, 2020, 11:28 AM Tyler Adams <coppero1237(a)gmail.com> wrote:

...

Clem Cole

7:02 p.m.

On Thu, May 21, 2020 at 11:28 AM Tyler Adams <coppero1237(a)gmail.com> wrote:

...

Does anybody have any good resources on the history of the popularity of C? I'm looking for data to resolve a claim that C is so prolific and influential because it's so easy to write a C compiler.

Hmmmm, I don't know what's been written, but old Dr. Dobbs and Byte Mag are where I would start from 1975-85 (which I have in my attic, but lack a good index). Let me give you my experience and recollections, although Larry may scream that catalog of memories he is colored by what he likes to call the UNIX club. Academics in the mid-late 70s all got UNIX with full sources to originally the Ritchie C Compiler and later the Johnson compiler. Plus had access to yacc/lex and the first editions of the dragon book. Before C (or B) shows up there already were 'system programming languages' such as BCPL, BLISS, PL/360, and ESPOL (much leads full languages like Fortran, Algol family, PL/1) which people were also trying to use for systems work. In '73, C had been retargeted for the PDP-10 by Alan Snyder @ MIT https://github.com/PDP-10/Snyder-C-compiler, but I believe that was a rewrite not a port of the Ritchie compiler. I believe this was the first retarget, at least outside of the MH. Before that C has been retargeted to the Honeywell, Interdata and I believe the S/360 -- Steve and Doug can probably say more. The 8-bit microprocessor arrives on the scene in 75 and the 16-bit ones 4 years later. Many of us at different universities wrote assemblers and linkers for the same, often in C under UNIX. CMU had a SAIL based 6502 assembler in the CS Dept that was used to burn ROMs, but it ran on the PDP-10. There must have been an 8080 assembler over there too, but I don't remember it. The 10s were more difficult to use in the EE building and tough to use with the KIM-1s. I wrote one for the 6502, 8085, and the Z80 for the EE department on our 11/34 UNIX V6 system. Ted Kowalski wrote the predecessor to the eventual UNIX cu(1) program, which we called connect(1) that allowed us to download code to the KIM's (and other micros) from the UNIX systems that we had in the EE lab (which he took back to USG, was rewritten and went into both PWB and eventually TS and V7). The Purdue 8-bit micro suite would eventually become popular because it supported full relocation and linker, which microprocessor support tools like the one I wrote did not. There was group at Purdue in EE that started to retarget the Ritchie C Compiler, but I've lost track what happen. Mike Zuhl or Ward Cummingham might remember what became of that (more in a minute) - I'm pretty sure Ward was mixed up in the that -- check his web site you might find stuff there, or we can ask either of them (Steinhart might know some of it too, as we all working with the original Microprocessor team in Tek Walker Road in the late 1970s). The first microprocessor targeted C compiler I personally used was the one from Teletype Corporation which had retargeted the Ritchie C Compiler to the Z80 in 1977/78 IIRC (that Phil Karn brought to CMU). He and I hacked it to use my assembler and got it to spit out 8085 code for our semester project for Steely Dan's Real-Time course. This was the original C compiler he used for the KA9Q TCP/IP, although at some point he switched Leor Zohlman's Brain-Damaged Systems (BDS) compiler ( https://en.wikipedia.org/wiki/BDS_C) after we both left CMU. In the late 1970s ('78 I think), Dennis Allison was teaching a course at Stanford. The assignment was to developed TinyBasic (for the 6502 IIRC). Some of these got presented at an early AMW (talk to Bob Berg if you want try to find the date). This idea spread to a lot of places and the idea of 'TinyX' or SmallX was started. By the late 1979/early 1980, Ron Cain (one of his students I believe) used an SRI based UNIX system to develop his 'Small C' that he would publish the sources to in Byte and eventually a book that was used to teach (which I still have): https://en.wikipedia.org/wiki/Small-C. The Small C compiler would get retargeted to the other 8-bit micros and you can find most of them with a search engine. The best I can tell, Leor and Ron worked independently of each other. Leor's compiler was a tad more complete and he actually wrote a UNIX clone for the Z80 with it (I don't remember if Leor has fp support, Ron did not). Leor had access to the Ritchie compiler, but he seems to have written it himself (you can search for and download the sources and decide yourself). Leor showed many of us his systems running on 3 8" floppies at the Boston USENIX in the early 1980s [I remember dmr playing with it and remarked how much it reminded him of early UNIX on the PDP-11]. Also, after I left CMU in 1979, I took the Ritchie compiler and retargeted to what would become the 68000 (it was not yet released or numbered when I started). Paul Blanter of Tek labs wrote the assembler and Steve Glaser and I hacked v7's ld a little. This was the original tool suite for the Magnolia system. The folks in the MIT RTS Group had started to retarget the Johnson compilers to the 8086, the 68000 and eventually the Z8000 as part of the NU project and Trix (I know Jack Test, who had previously been at Stanford had is hand in this -- tjt wrote the MIT 68000 assembler that used an MIT hacked version of V7's old, I think John Siber did the C8086). Around the same time, CMU started the Mach project and created the macho format. Robert Baron and Mike Accetta were heavily involved, but I think they took the MIT compilers as the basis for some of that work. At some point (Steve can fill us in) I thought someone in USG started to retarget his compilers for USG. This is the source of the AT&T assembler and is what ISC started with when they did the 386 ports a few years later for AT&T that Heinz talked about a few weeks ago. Meanwhile, Gordon Letwin who had been Purdue, EE, brought the Purdue assemblers and forked from the C compiler work at some point. He and Bob Greenberg did the start of the compilers for original Xenix work for the 8086, we would have to ask Bob or Gordon for more details [Gordon is believed to be the source the terrible curse, called the 'far' pointer]. By the early 1980s, a number of UNIX ports start and many C Compilers show up. I think the John Bass did the Onyx Z8000 C compiler independent of the MIT code base, but the MIT NU C compilers and the NU UNIX port would become used by a lot of the 'JAWS' work that would start to ramp in the early 1980s. Anyway -- the point is we all had access to the UNIX sources (sorry Larry) we start to hack on them. Plus different Universities doing compiler work, like Andy Tannenbaum release compilers (ACK) independent of the AT&T code origin but built/bootstrapped from UNIX/the UNIX toolkit. Waterloo, Edinburgh, and others also all put something out. Plus you start to commercial C implementations like Intermetrics, Tartan Labs, Greenhills (in fact IIRC the Apple Mac C Compiler was developed under contract by Greenhills). What I am leaving out is the BASIC and Pascal wars that were going on at the same time. The 8-bits micros, in particular, went BASIC crazy. The 'CS types' at many Universities (like mine at CMU) had been considered BASIC, C, and Fortran as 'ugly' and were using an Algol or a more Algol-like language as the future (Pascal was premier teaching language at the time). For issues, we can talk about in COFF, Pascal diverged (in 1980 at one of the Hatfield and McCoy parties at Steve Glaser's, a couple of us counted 14 incompatible 'HP-BASIC's and 8 different 'Tek Pascal' in use). Here comes the final thing that happened... By the early, mid-80s, all us UNIX folks were happy using UNIX derived C compiler, like the NU suite. But as Larry points out, there was a whole group of people that could not get UNIX sources or tools. Stallman sets out to build his Gnu system and he needs a language and compiler. I've always been amazed he did not use LISP, other than the first tool he wanted was EMACS, and get got CMU (Gosling's) codebase to start. CMU-EMACS was in C (plus his 'mock-lisp' creation). So rms needed a C compiler and starts to hack mock-lisp to be more to his taste. But to make it widespread he needs a C compiler and microprocessor tools. So he starts to write his famous compiler -- which to me is that key thing he did. The Gnu project would release tools that ran pretty much anywhere and targeted the popular micro's and generated 'good enough code.' Cole's Law -- 'Simple Economics always beats Sophisticated Architecture.' It turns out Paul Winalski and I were just talking about this last week. I very much believe C 'won' the war for economic reasons. UNIX being 'Open Source' in the 70s to the Unversity types, did allow us to hack and >>share<< the compilers, either Ritchie or Johnson based. Moore's Law caused the 16-bit micros to flourish and they ended up in systems. Unix taught a number of programmers the language and the tool suite, then we went to the real world and wanted it. Stallman's tools were there. It did not matter that there were 'better' languages (Pascal had forked, we also had new languages from OCAM to Modula, eventually C++ et al). The Gnu C compiler was cheap (free) and that was the final stroke. Clem

1850

days inactive

1868

days old

tuhs@tuhs.org

Manage subscription

83 comments

35 participants

tags (0)

participants (35)

A. P. Garcia
Adam Thornton
Andy Kosela
arnold＠skeeve.com
Bakul Shah
Bram Wyllie
Brantley Coile
CHARLES KESTER
Chris Torek
Christopher Browne
Clem Cole
Dan Cross
Dave Horsfall
David Arnold
Derek Fawcus
Ed Carp
emanuel stiebler
Greg A. Woods
Jim Capp
John Foust
John Gilmore
Jon Steinhart
Larry McVoy
Lars Brinkhoff
Michael Kjörling
Nemo Nusquam
Nevin Liber
Richard Salz
Rob Pike
Ronald Natalie
Thomas Paulsen
Toby Thain
Tony Finch
Tyler Adams
Warren Toomey